Physics-informed Neural Networks with Periodic Activation Functions for Solute Transport in Heterogeneous Porous Media

Solute transport in porous media is relevant to a wide range of applications in hydrogeology, geothermal energy, underground CO2 storage, and a variety of chemical engineering systems. Due to the complexity of solute transport in heterogeneous porous media, traditional solvers require high resolution meshing and are therefore expensive computationally. This study explores the application of a mesh-free method based on deep learning to accelerate the simulation of solute transport. We employ Physics-informed Neural Networks (PiNN) to solve solute transport problems in homogeneous and heterogeneous porous media governed by the advection-dispersion equation. Unlike traditional neural networks that learn from large training datasets, PiNNs only leverage the strong form mathematical models to simultaneously solve for multiple dependent or independent ﬁeld variables (e.g., pressure and solute concentration ﬁelds). In this study, we construct PiNN using a periodic activation function to better represent the complex physical signals (i.e., pressure) and their derivatives (i.e., velocity). Several case studies are designed with the intention of investigating the proposed PiNN’s capability to handle diﬀerent degrees of complexity. A manual hyperparameter tuning method is used to ﬁnd the best PiNN architecture for each test case. Point-wise error and mean square error (MSE) measures are employed to assess the performance of PiNNs’ predictions against the ground truth solutions obtained analytically or numerically using the ﬁnite element method. Our ﬁndings show that the predictions of PiNN are in good agreement with the ground truth solutions while reducing computational complexity and cost by, at least, three orders of magnitude.


Introduction
Solute transport in porous media is crucial in many environmental and reservoir engineering applications, including risk and safety assessment of groundwater contamination [1], secondary recovery processes in petroleum reservoirs [2], geological storage of radioactive waste [3], hydrogen [4], and carbon dioxide [5], just to name a few [6].It is a complicated process due to the varied temporal and spatial transport characteristics of the flow network [7].These characteristics can be divided into flowing areas influenced primarily by advection and stagnant regions governed mainly by dispersion [8].The process is considerably more complicated in heterogeneous porous media because all available pore space does not contribute to the flow uniformly; certain locations are dead ends, or transport is extremely slow due to inadequate connection to the main flow channels [9].Despite a conceptual understanding of the significance of multi-component solute transport mechanisms in porous media in general, attempts to conduct reliable experimental investigations are hindered by several barriers, resulting in a dearth of published information.The major causes are the technical difficulty in creating flow conditions and pore space connections under regulated laboratory or in situ test circumstances [10].Hence, a system of field equations, which are numerically represented by partial differential equations (PDEs), is practically the only tool to predict the behavior of such non-linear flows in interconnected systems.In solute transport studies, traditional solvers, such as finite element (FEM), finite difference (FDM), and finite volume methods (FVM), have been extensively used to solve the governing PDEs.For example, Zhang et al. [11] suggested a 1D model based on FEM for multi-component solute transport in saturated soil.Bagalkot and Suresh Kumar [12] presented a 1D numerical evaluation of the FDM for multi-species radionuclide transport in a single horizontally coupled fracture-matrix system.Mostaghimi et al. [13] and Maheshwari et al. [14] investigated the effect of pore structure heterogeneity on reactive transport using a 3D pore scale model based on FVM.Although most large-scale calculations still use FEM, FDM, or FVM, they require high-resolution meshing to describe the geometry of the domain being represented.Because the size of an elementary mesh is normally too large, discrete domain descriptions may also need the availability of efficient transport equations at the mesh scale and the consideration of unresolved subgrid-scale effects [15].Hence, new methodologies and algorithms, especially mesh-free methods based on deep learning, are required to accelerate numerical simulations of solute transport through porous media.Deep Learning (DL) methods have the potential to offer mesh-free solvers and address some of the aforementioned challenges.Although most applications use DL (e.g., neural networks) to solve a lack of effective data modeling processes, explore vast design domains, and identify multidimensional connections [16,17], there is growing interest in using neural networks to solve PDEs [18,19,20,21].In general, there are several main neural network frameworks to augment scientific computing: Physics-guided Neural Networks (PgNN), Physics-informed Neural Networks (PiNN), Physics-encoded Neural Networks (PeNN), and Neural Operators (NOs).The readers are referred to a recent review by Faroughi et al. [22], where different neural network frameworks are compared head-to-head, and their challenges and limitations are thoroughly discussed.In this study, we adopt PiNN because of its straightforward mechanism to integrate the underlying physics compared to other approaches.By combining a loss function composed of the residuals of the physics equations, initial conditions, and boundary restrictions, PiNN-based models adhere to the physical laws.They employ automated differentiation to differentiate the output of neural networks with respect to their input (that is, spatio-temporal coordinates and model parameters) [23].The network can estimate the solution with high precision by minimizing the loss function [24].Therefore, PiNN provides the foundation for a mesh-free solver that incorporates long-standing advances in mathematical physics into DL [18].
Raissi et al. [25] developed PiNN as a new computing paradigm for forward and inverse modeling in a series of studies [18,25,26].Raissi et al. [26] developed a PiNN framework, dubbed hidden fluid mechanics (HFM), to encode the physical laws governing fluid flows, i.e., the Navier-Stokes equations.They leveraged underlying conservation laws to derive hidden quantities of objective functions such as velocity and pressure fields from spatiotemporal visualizations of a passive scalar concentration in arbitrarily complex domains.Their technique accurately predicted 2D and 3D pressure and velocity fields in benchmark problems inspired by real-world applications.Since then, PiNN and its different variants have been applied aggressively to different fields, including porous media flows.Almajid and Abu-Al-Saud [27] employed the PiNN framework to solve both the forward and inverse problems of the Buckley-Leverett PDE equation representing two-phase flow in porous media.To test their implementation, they applied the classic problem of gas drainage through a porous water-filled medium.Several cases were examined to demonstrate the significance of the connectivity between observable data and PiNNs for various parameter spaces.According to their results, PiNNs are capable of capturing the solution's broad trend even without observed data, but their precision and accuracy improve significantly with observed data.Hanna et al. [28] employed PiNN to simulate one-dimensional (1D) and two-dimensional (2D) two-phase flow in porous media based on a new residual-based adaptive algorithm.Applying the PDE residual to build a probability density function from which additional collocation points are taken and added to the training set was fundamental to their work.The approach was applied individually to each PDE in the coupled system, considering the different collocation points for each PDE.Furthermore, the method was applied to enrich the points used to capture the initial and boundary conditions.They claimed that their approach yielded superior results with less generalization error than conventional PiNN with fixed collocation points.Haghighat et al. [29] introduced a PiNN method to solve coupled flow and deformation equations in porous media for single-phase and multiphase flow.Due to the problem's dynamic nature, they reported a dimensionless form of the coupled governing equations for the optimizer.In addition, they presented a way for sequential training based on the stress-split algorithms of poromechanics.They demonstrated that sequential training based on stress-split performs well for a variety of problems, whereas the conventional strain-split algorithm exhibits instability comparable to that observed for FEM solvers.He et al. [30] extended a PiNN-based parameter estimation method to integrate multiphysics measurement.They studied a subsurface transport problem with sparse conductivity, a hydraulic head, and solute concentration.In their methodology, they employed the Darcy and advection-dispersion equations in conjunction with the data to train deep neural networks, reflecting space-dependent conductivity, head, and concentration fields.They proved that the proposed PiNN method considerably increased the accuracy of parameter and state estimates for sparse data compared to conventional deep neural networks trained with data alone.He and Tartakovsky [31] also suggested a discretization-free technique based on PiNN for solving coupled advectiondispersion equations and the Darcy flow equation with space-dependent hydraulic conductivity.They used PiNN for 1D and 2D forward advection-dispersion equations and compared its performance for various Peclet numbers (Pe) with analytical and numerical solutions.They found that PiNN was accurate with errors of less than 1% and outperformed other discretization-based approaches for large Pe.In addition, they proved that PiNN remained accurate for the backward advection-dispersion equations, with relative errors below 5% in the majority of instances.In another study, Vadyala et al. [32] implemented PiNN with the use of a machine learning framework such that it can be employed in reduced-order models to reduce the epistemic (modelform) ambiguity associated with the advection equation.They demonstrated that PiNN provided an accurate and consistent approximation with PDEs.Furthermore, they showed that PiNN could transform the physics simulation field by enabling real-time physics simulation and geometry optimization on large supercomputers.
Although there has been an increase in the application of PiNN to fluid flow studies in porous media [33,34,35,36], the application of PiNN to the advection-dispersion equation in heterogeneous porous media has received less attention due to the difficulty in predicting the velocity field imposed by the variation in the permeability field.The main objective of this study is to develop a PiNN model capable of accurately predicting transient solute transport in heterogeneous porous media.Due to the presence of the second-order derivative in the governing equations, selecting the activation function for PiNN becomes a critical step [37].To this end, a PiNN is constructed using periodic activation functions, and its convergence rate and accuracy are compared against a PiNN using tanh to model transport phenomena in porous media.Although it is theoretically believed that neural networks with periodic activation functions, e.g., sinusoids, may be harder to train, it is shown in several cases that periodicity is intuitively beneficial and can learn faster and better than other activation functions [38,37].In this study, using 1D and 2D test cases, we also show that a PiNN with a sine activation function provides more accurate predictions for solute transport in heterogeneous porous media while reducing the training time.
This paper is structured as follows: In Section 2, the underlying physics for solute transport in a porous medium is presented.Section 3 discusses PiNN's algorithm with periodic activation functions to predict solute transport.In Section 4, the PiNN is deployed to several one-dimensional (1D) and two-dimensional (2D) case studies, and its predictions are examined by comparing them with analytical and/or FEM solutions.Finally, in Section 5, we summarize the main conclusions of this work.

Underlying Physics
Solute transport in porous media is governed by three main mechanisms: advection, diffusion, and dispersion [39,40,41].The mathematical formulation of the advection-diffusion-dispersion equation is explained briefly here in terms of implementing the flow and transport factors that affect the transport process.The mass conservation equation serves as the basis for the solute transport equation [42,43] that reads as, where J [M L −2 T −1 ] is the solute mass flux, ϕ is the porosity, and C [M L −3 ] is the solute concentration.The flux J contains two mechanisms, where the first term refers to the diffusion flux, and the second term refers to the advection flux [44,45].In Eq. ( 2), D [L 2 T −1 ] is the molecular diffusion coefficient (D = D 0 ), and u [LT −1 ] is the velocity vector of the pore fluid flow computed using Darcy's law, where q [LT −1 ] is the Darcy's flux, P [M L −1 T −2 ] refers to the pressure field, g [LT −2 ] is the gravity, k [L 2 ] is the permeability, and µ [M L −1 T −1 ] and ρ [M L −3 ] are the fluid viscosity and density, respectively.To include the impact of dispersion, the diffusion coefficient in Eq. 2 can be changed to where α is known as the dynamic dispersivity, D is now called the hydrodynamic dispersion coefficient, and U is the magnitude of fluid velocity defined as U = |u|.In this work, we assume that longitudinal and transverse dispersion are identical; however, we allow U to vary spatially for a heterogeneous porous medium where u varies due to permeability changes [46,47].The reason for this assumption is to focus on the impact of permeability and how PiNN can handle it when supplied with different activation functions.Substituting Eq. (2) into Eq.( 1) leads to which can be rewritten as, that incorporates diffusion, dispersion, and advection transports and serves as the foundation of solute transport in porous media [48].If the solute transport process has no effect on fluid density (i.e., incompressible flow, ∇.u = 0), the velocity field can be calculated independently of the solute concentration using Darcy's law, and Eq. ( 6) reduces to, The flow incompressibility condition generates an equation for computing the pressure field as, that reads as, for two-dimensional (2D) flow problems in x and y direction assuming g x = g y = 0, where Equations ( 7) and ( 9) together form the governing equations for the concentration and pressure fields in a porous medium with a heterogeneous permeability distribution [48].

Methodology
In this study, Physics-informed Neural Networks (PiNN) [18,17] are leveraged to resolve the solute transport in porous media, which is governed by a strong mathematical form consisting of the pressure and advection-dispersion equations as well as the relevant initial and boundary conditions.Unlike Physics-guided Neural Networks (PgNNs), PiNN is a solution learning method that does not require labeled datasets [22].In PiNNs, the underlying physics is included outside of the neural network architecture to constrain the model during training and ensure that the outputs follow given physical laws.The most common method to emulate this process is through a weakly imposed penalty loss that penalizes the network when it does not follow the physical constraints [25,49,50].The network digests spatiotemporal coordinates, (x ,t), as inputs to predict a solution set, s, as an approximate to the ground truth solution, ŝ.The automatic differentiation (AD) is then used to generate the derivatives of the predicted solution s with respect to inputs.These derivatives are used to formulate the residuals of the governing equations in the loss function weighted by different coefficients.θ and λ are the learnable parameters for weights/biases and unknown PDE parameters, respectively, that can be learned simultaneously while minimizing the loss function.
A schematic representation of our PiNN architecture is illustrated in Fig. 1.As illustrated, a PiNN consists of three elements: a neural network (NN), an automatic differentiation (AD) layer, and a feedback mechanism informed by physics [51].A neural network is first built in order to digest the spatiotemporal features (i.e., x and t) as input parameters and approximate the solution, s(x, t), of a phenomenon described by PDEs with known boundary values.Assume that N L (x) : R din → R dout be a L-layer neural network, or a (L − 1)-hidden layer neural network, with N l neurons in the l th layer (N 0 = d in , N L = d out ) [52].Each layer in the NN is associated with its own weight matrix W l ∈ R N l ×N l−1 and bias vector b l ∈ R N l .The NN is recursively defined using an element-wise nonlinear activation function, σ, as follows, [53], Hidden layers: Output layer: where σ is commonly specified as the logistic sigmoid σ(x) = 1/(1 − exp(−x)), hyperbolic tangent σ(x) = tanh (x), or the rectified linear unit, σ(x) = max (x, 0) [52].In this work, we propose to use a sine (i.e., periodic) activation function, and compare its performance against tanh, as one of the most used activation functions for PiNNs [37], in simulating solute transport in porous media.Periodic activation functions are ideally suited for representing complex physical signals and their derivatives while converging faster than baseline architectures [38,54].This superior behavior, in comparison to tanh, is attributed to (i) the presence of simple sum and diff tasks for which a sine function can outperform hyperbolic tangent function [38], and (ii) the derivative of a network with sine activation function behaves like the network itself (i.e., the derivative of the sine is a cosine that is a phase-shifted sine) that potentially speeds up the learning process [54], especially in the solute transport problem that contains the second order derivatives in the governing equations.The NN outputs are then fed into the AD layer, which is the central property of PiNNs.AD is employed to assess the derivatives of the network outputs with respect to the network inputs [55,52,23,56].Consider a strong mathematical form with a PDE specified in the domain Ω and parameterized by λ, and boundary conditions (e.g., Dirichlet, Neumann, Robin boundary condition, etc.) specified on the boundary of the domain where s is the unknown solution, and ∆ represents the linear or nonlinear differential operator (e.g., ∂ ∂t , ∂ ∂x , ∂ 2 ∂x 2 , etc.).The initial condition can be considered as a kind of Dirichlet boundary condition in the spatiotemporal domain.AD is used to apply the ∆ and ψ operators to the neural network with respect to inputs, i.e., x and t, and generate the required terms in the loss function in order to optimize the PDE solution, s.Lastly, a feedback mechanism is constructed to minimize the loss terms through optimizations.The total loss term is defined as [57,56,58], where w i , w b , w d , and w p , are referred to as the weights for the loss due to the initial conditions, boundary conditions, labeled data, if any, and PDEs, respectively.The individual loss terms are computed as, where t 0 is the initial time, s is the neural network prediction, ŝ is the ground truth solution, N i , N b , N d , and N p are, respectively, the number of spatiotemporal points representing the initial conditions, boundary conditions, labeled data, and collocation points (i.e., spatiotemporal points within the domain where the neural network prediction, s(x, t) is checked against the constraints of PDEs).PiNN with weights θ can then be trained by optimizing, leading to the PiNN with weights θ ′ that generates reasonable predictions.N * represents the total number of input-output pairs in the training process.PiNN training is more difficult than PgNN training, because PiNNs are composed of sophisticated non-convex and multi-objective loss functions, which may cause instability during optimization [59,60].The selection of weights for the loss terms is ad-hoc and problem (PDE) dependent.The weights are adjusted using trials and errors in the training phase to reach the minimum minimization error or mitigate the instability of the solution [22].In this study, we start with identical weights for all loss terms, and as needed (especially for heterogeneous cases where solutions encounter instability issues due to variation in the permeability field), we reduce the weight for PDE loss terms, ω p , to fully respect the boundary conditions and mitigate the instability in the solution.Also, we set w d = 0 as we do not provide labeled data to PiNN.Finally, a gradient-based optimizer such as Adam method [61] and Limited-memory Broyden-Fletcher Goldfarb-Shanno with box constraints, L-BFGS-B [62], should be used to minimize the loss function.It is found that the PiNN predictions strongly depend on which of these two algorithms is selected [53], and several studies suggested a two-step optimization algorithm that starts with the Adam method for a prescribed number of iterations, and then continues with the L-BFGS-B method until convergence [30,31].In this work, L-BFGS-B, as a second-order, optimizer is used.It is observed that L-BFGS-B finds a satisfactory solution for smooth PDEs faster than Adam and with fewer iterations [63,64,65].This is aligned with the goal of this work, which compares PiNN's predictions when trained using different activation functions, without adding the complexity of alternating between optimizer schemes.Using L-BFGS-B, at each iteration, the loss is checked against L < ϵ, where ϵ is a specified tolerance.If the condition is not met, error backpropagation is implemented to update the learnable parameters (θ and/or λ).The entire cycle is repeated for a given number of iterations until the PiNN model produces learnable parameters with a loss error less than ϵ [66].
The proposed PiNN is deployed to model solute transport in porous media by predicting pressure and concentration fields, i.e., s = (P, C).We select the PiNN architectures using an iterative random-search hyperparameter tuning [67] (e.g., selecting the number of layers, neurons per layer, collocation points, and weights randomly within a specified range) for different case studies in this work due to differences in spatial dimension, boundary conditions, and permeability field.We ensure that the selected PiNN has enough layers and width to estimate each target field.There are two critical normalization steps in the implementation of PiNNs that can be followed to ensure faster convergence to the correct solution [68], similar to the common practice of scaling and dimensionless analysis in other mesh-based computational techniques, e.g., FVM, FEM, etc.The first step is to map the network input and output variables to the interval [0, 1] ∈ R. The second step is to scale the pressure and concentration equations such that all terms are of the same order.In this work, we use the absolute point error (APE) and the mean squared error (MSE) defined as, as statistical measurands to assess the accuracy of PiNNs' predictions, s, for a certain time-step against the ground truth solutions, ŝ, obtained analytically or numerically using the finite element method.Here, N is the total number of inference points (resembling the spatiotemporal mesh) to generate the solutions for comparison.

Computational Experiments
In this section, the proposed PiNN model is applied to solve the 1D and 2D solute transport phenomena under different conditions (i.e., a total of seven computational experiments with homogeneous and heterogeneous domains).We compare the PiNN models with sin and tanh activation functions and validate them against analytical and/or FEM solutions to examine their capability and accuracy.The comparisons are presented in terms of accuracy and training time.

Case 1: 1D solute transport with constant velocity
This test case explores solute transport in a 1D domain (x ∈ [0, L]) representing an isotropic porous medium with a constant velocity field.As the steady state velocity field is known, the governing equation, Eq. ( 7), reduced to, where L = 1 is the length of the domain, u x = 0.5 m/s is the steady state velocity field, and D x = 0.02 m 2 /s refers to the hydrodynamic dispersion coefficient in the x direction.For this case, we assume the following initial and boundary conditions, where C 0 = 1.0 kg/m 3 is the injected concentration at x = 0 (i.e., Dirichlet boundary condition).This 1D advection-dispersion can be solved analytically [69].The analytical solution is, where β i are the roots of, Equations ( 27) and ( 28) can be approximated using [42], where A(x, t) is computed as, (30) A PiNN with sin activation function and randomly distributed collocation points (with a uniform distribution scheme in the specified domain) is used to predict the solute transport in 1D domain described by Eq. ( 25) with the initial and boundary conditions given in Eq. ( 26).An iterative random-search hyperparameter tuning process is practiced to find the best architecture based on the MSE accuracy measures.This process yielded a network with four hidden layers with {32, 32, 16, 16} neurons per layer, 5,000 spatiotemporal random points within 0 ≤ x ≤ 1 and 0 ≤ t ≤ 10 as the collocation points to enforce the PDE loss term, 5,000 spatiotemporal random points to collectively represent the boundary and initial conditions informing the loss terms, and identical weights for all loss terms (w i = w b = w p = 1).Another PiNN with similar architecture but using the tanh activation function is also trained and tuned for comparison purposes, see Table 1.For this PiNN model, the number of layers, neurons per layer, and collocation points are fixed, but the weights are allowed to change for better optimization.These assumptions make the comparisons fair in terms of both accuracy and training time.
Figure 2 represents the concentration field in the (x, t) domain predicted by PiNN using sin activation function.It also shows the PiNNs' prediction compared with the ground truth (analytical solution given by Eq. ( 30) at different simulation times.The comparison demonstrates a good agreement yielding M SE = 1.15 × 10 −6 using sin activation function and M SE = 1.21 × 10 −6 using tanh activation function for the entire spatiotemporal domain.As reported in Table 1, for this 1D case, the PiNN with tanh activation function approximates the solution with the same order of magnitude accuracy, but with nearly 25% increase in training time captured based on three runs using a system with 3.00-GHz 48-core Intel Xeon Gold 6248R CPU, Nvidia Quadro RTX 8000 GPU, and 128 GB of RAM.In the next test cases, to better differentiate the models, the PiNNs are applied to more complex problems representing the complexity of solute transport in porous media.
Table 1: A comparison of PiNN models using sin and tanh activation functions in terms of accuracy and training time to predict the concentration field in homogeneous 1D and 2D porous media, validated against the analytical solutions.The MSE is calculated using Eq.24 in (t,x) domain for Case 1, and in (x,y) domain at t = 1.0 (s) for Case 2 assuming u = (ux, uy) = (0.0, 0.50) m/s.The training time of the PiNNs is measured through three runs on a system equipped with a 3.00-GHz 48-core Intel Xeon Gold 6248R CPU, Nvidia Quadro RTX 8000 GPU, and 128 GB of RAM.Additionally, the number of collection points in the domain and the weights assigned to the loss terms are reported for each PiNN model.

Case 1
Case 2 sin tanh sin tanh Collocation Points, N p 5000 5000 8000 8000 Weights: w i , w b , w p 1.0, 1.0, 1.0 1.0, 1.0, 0.9 1.0, 1.0, 1.0 1.0, 1.0, 0. ) representing an isotropic porous medium with a constant velocity field, u = (u x , u y ).Since the velocity field is given, the governing equation for solute transport, Eq. ( 7), is simplified to, where D y = D x = 0.02 m 2 /s refers to the hydrodynamic dispersion coefficient in the y and x directions.As schematically shown in Fig. 3, the following initial and boundary conditions, are considered for this test case.The blue points on the left boundary bounded by y 1 and y 2 represent the positions of point-source solute injection at the rate of C 0 = 0.2 kg/m 3 .The other three sides of the domain are assigned as zero-gradient boundaries, assuming a free outflow of solute.This 2D advection-dispersion problem defined by Eqs. ( 31) and ( 32) can be solved analytically [69], where L n is defined as, and P n is computed as, where η and ζ, respectively, are, First, a PiNN is employed with sin activation function and randomly distributed collocation points to predict the solute transport in the 2D domain (L = W = 1 m) considering the dispersion only with D x = D y = 0.02 m 2 /s and u = (u x , u y ) = (0.0, 0.0) m/s in Eq. ( 31) and the initial and boundary conditions given by Eq. (32).The solute injection rate is set to C 0 = 0.2 kg/m 3 between y 1 = 0.3 m and y 2 = 0.7 m.To determine the best architecture, similar to the previous case, an iterative random-search hyperparameter tuning approach is practiced.This approach produced a network with four hidden layers and {32,16,16,16} neurons per layer, 8,000 spatiotemporal random points within 0 ≤ x, y ≤ 1, 0 ≤ t ≤ 1 as collocation points to enforce the PDE loss term, 8,000 spatiotemporal random points to collectively represent the boundary and initial conditions informing the loss terms, and identical weights for all loss terms (w i = w b = w p = 1).The red dots represent sample collocation points inside the domain corresponding to the loss term associated with the 2D solute transport PDE, and the blue, green, and black dots represent the points on the boundary of the domain corresponding to the loss terms associated with the boundary conditions.The lower panel shows the comparison between the PiNNs' predictions for the concentration field at t = 1.00 s and the ground truth obtained analytically using Eq. ( 33).The absolute point error shows the mismatch between the solutions, and the total loss plot depicts the difference in the convergence rate of PiNNs.
Figure 3 shows the comparison of the 2D concentration fields predicted by PiNNs with sin and tanh activation functions and the analytical solution given by Eq. (33).The absolute point error is also shown, which illustrates a good agreement between the analytic solution and PiNN's prediction.Based on the total loss plot, it can be observed that the PiNN with sin activation function is faster to converge and more accurate.The mismatch with the analytical solution in both cases is considerable at locations where extremely high concentration gradients exist (e.g., close to points (0.0, 0.30) and (0.0, 0.70)).This mismatch is due to a difficult-to-minimize approximation error, i.e., PiNN struggles to converge to ground truth solutions close to those points due to the inherent complexity that arises in high-gradient areas and the limited capacity of the network architecture and training procedure [70,71].A PiNN with a deeper network may resolve those areas and converge to the correct solution, but that makes minimization error a harder task.
Two PiNN models with sin and tanh activation functions and the same architecture as discussed above are then employed to model the advection-dispersion solute transport in 2D domains with D x = D y = 0.02 m 2 /s and u = (u x , u y ) = (0.5, 0.0) m/s.The solute is injected again at a rate of C 0 = 0.2 kg/m 3 between y 1 = 0.3 m and y 2 = 0.7 m.Table 1 reports the results of the comparison of these two models against the ground truth.The concentration fields predicted by PiNN with sin activation function at different times (t = 0.25 s, 0.50 s, 0.75 s, 1.00 s) are shown in Fig. 4. The predictions of PiNNs for the concentration field at t = 1.00 s are also compared with the analytical solution given by Eq. (31).The absolute point error shows that there is a good match between the PiNNs' predictions, with sin, and the analytical solution within the entire domain yielding M SE = 1.54 × 10 −6 that is about an order of magnitude more accurate compared to the PiNN with tanh activation function while being nearly 31% faster to be trained.33).The absolute point error shows the mismatch between the solutions yielding the MSEs reported in Table 1.

Case 3: 2D solute transport in homogeneous porous media
In this test case, the solute transport in a 2D homogeneous porous medium (x ∈ [0, L], y ∈ [0, W ]) is investigated.For this problem, assuming ζ(x, y) = 1, the pressure equation defined by Eq. ( 9) reduces to, that must be solved to determine the velocity field using Eq. ( 3).The governing equation for the solute concentration remains the same as Eq. ( 31).The following boundary conditions are considered for the pressure field, P (x, y) = 1.0; y = 0 or y = W, and 0 ≤ x ≤ L, P (x, y) = 0.1; x = 0 or x = L, and 0 ≤ y ≤ W, (38) and for the concentration field, with C(x, y, t = 0) = 0 as the initial condition.The pressure value at the corner points that belong to two perpendicular sides is set to an average value of 0.55, as additional constraints, to maintain symmetry in the solution.The ground truth solutions for pressure and concentration fields are obtained using FEM with 100 × 100 quadrilateral elements assuming L = W = 1 m, D x = D y = 0.02 m 2 /s, and C 0 = 0.2 kg/m 3 .A PiNN with sin activation function and randomly distributed collocation points is employed to predict the solute transport in the 2D porous media described above.The PiNN's inputs are the spatiotemporal coordinates, (x, y, t), and the outputs are the pressure and concentration fields, i.e., s = {P, C} in Fig. 1.The pressure and concentration fields are decoupled, therefore one may use two PiNNs side-by-side to solve this problem; however, training a PiNN with multiple outputs is more desirable in this study to minimize hyperparameter tuning.The iterative random-search hyperparameter tuning process is again practiced to obtain the best PiNN architecture based on the MSE accuracy measure.This approach resulted in a network with five hidden layers and {32, 16, 16, 8, 8} neurons per layer using 12,000 spatiotemporal random points within 0 ≤ x, y ≤ 1, 0 ≤ t ≤ 1 as collocation points to enforce the pressure and concentration PDEs' loss terms, 12,000 spatiotemporal random points to collectively represent the boundary and initial conditions loss terms (sample selected points are shown in Fig. 5), and identical weights for all loss terms (w i = w b = w p = 1).Again, a PiNN with similar architecture but using tanh activation function is also trained and tuned for comparison purposes, see Table 2.
Table 2: A comparison of PiNN models using sin and tanh activation functions in terms of accuracy and training time to predict the pressure and concentration fields in homogeneous 2D porous media, validated against the ground truth solutions obtained using FEM with 100 × 100 quadrilateral elements.The MSE is calculated in (x,y) domain using Eq.24 for the pressure field at steady state, and for the concentration field at t = 1.00 (s).The training time of the PiNNs is measured through three runs on a system equipped with a 3.00-GHz 48-core Intel Xeon Gold 6248R CPU, Nvidia Quadro RTX 8000 GPU, and 128 GB of RAM.Additionally, the number of collection points in the domain and the weights assigned to the loss terms are reported for each PiNN model.The upper panels show the domain with pressure and concentration boundary conditions, as well as the comparison of total loss vs. iteration for PiNNs using sin and tanh activation functions.The lower panels show the comparison between the PiNN's predictions and ground truth solutions obtained using FEM with 100 × 100 quadrilateral elements for the pressure field, the velocity field in the x direction, and the total velocity field under steady state conditions as well as the concentration field at t = 1.00 s.The flow direction of the pore fluid is also represented by the streamlines.The absolute point error is also shown for each field to illustrate the mismatch between the solutions.
Figure 5 depicts a comparison between ground truth (FEM solutions) and PiNN's predictions, with sin activation function, for the pressure field, the velocity field in the x direction, the total velocity field, and the flow streamlines in a homogeneous porous medium.The absolute point error is also displayed for all fields, indicating good agreement between PiNN and FEM solutions.Figure 5 also demonstrates the comparison between the PiNN's prediction and FEM solution for the concentration field at t = 1.00s.The absolute point error indicates the agreement between both solutions.The results reported in Table 2 reveal that the PiNN with sin activation function is almost an order of magnitude more accurate than its counterpart (the PiNN with tan activation function), while its training time is reduced by 78%.Furthermore, the faster convergence of PiNN with the sin activation function is noticeable in the total loss plot versus iteration, as illustrated in Fig. 5. Based on these results, it can be inferred that the proposed PiNN is capable of simultaneously solving for the pressure and solute concentration fields in a 2D homogeneous porous medium with high accuracy.The next section of the study explores the application of PiNN and the benefit of using a periodic activation function to solve a more complex problem of solute transport imposed by permeability heterogeneity in porous media.

Case 4: 2D solute transport in heterogeneous porous media
In this test case, the solute transport is analyzed in a 2D rectangular domain (x ∈ [0, L], y ∈ [0, W ]) representing a heterogeneous porous medium, i.e., the permeability k(x, y), and hence, ζ(x, y) in Eq. ( 9) vary spatially.Three different ζ(x, y) fields, i.e., scaled permeability fields, are considered, as shown in Fig. 6.These fields are designed to exhibit structural features with distinct length scales relative to the size of the domain.As shown in Fig. 6, from left to right, i.e., Case 4A, 4B, and 4C, the length scale of the structural features in the permeability field decreases, leading to more complex problems to simulate using PiNNs owing to the increase in high-frequency features.The initial and boundary conditions for these cases are identical to those reported for Case 3 (refer to Section 4.3).The pressure constraints at the corner points that correspond to the two perpendicular sides are not maintained due to variations in permeability.For each case, the ground truth solutions for the pressure and concentration fields are determined using FEM with 100 × 100 quadrilateral elements considering L = W = 1 m, α = 0, D x = D y = 0.02 m 2 /s, µϕ = 0.001, and C 0 = 0.2 kg/m 3 .Two PiNN models with five hidden layers and {32, 32, 16, 16, 16} neurons per layer are trained using sin and tanh activation functions, respectively, to resolve the solute transport in these three heterogeneous cases (i.e., predict the pressure and concentration fields).For each case, due to changes in the permeability field, the hyperparameter tuning process leads to different randomly distributed collocation points.For Case 4A, a total of 30,000 spatiotemporal random points (15,000 points within 0 ≤ x, y ≤ 1, 0 ≤ t ≤ 1 and 15,000 points on the boundaries) are used as collocation points to enforce the PDE, initial, and boundary loss terms.For Case 4B and Case 4C, the hyperparameter tuning process leads to a total of 36,000 and 40,000 spatiotemporal random points as collocation points, respectively.The weights for the loss terms are also tuned for both PiNN models to achieve solution convergence while meeting the boundary condition requirements satisfactorily.Table 3 reports the collocation points within the domain and the weights of the loss terms used to predict the pressure and concentration fields.It also reports the comparison of the PiNN models, in terms of accuracy and training time, to predict the ground truth solution.Figures 7 and 8 illustrate the comparison between the PiNNs' predictions with the ground truth (FEM solutions) obtained for solute transport in heterogeneous porous media.The comparisons are shown for the steady state pressure field (Fig. 7) and the concentration field at t = 1.00 s (Fig. 8).
Table 3: A comparison of PiNN models using sin and tanh activation functions to predict the pressure and concentration fields in heterogeneous porous media, validated against the FEM solution with a 100 × 100 quadrilateral element grid.The MSE is calculated in (x,y) domain using Eq.24 for the pressure field at steady state and for the concentration field at t = 1.00 s.The training time of the PiNNs is measured through three runs on a system equipped with a 3.00-GHz 48-core Intel Xeon Gold 6248R CPU, Nvidia Quadro RTX 8000 GPU, and 128 GB of RAM.Additionally, the number of collection points in the domain and the weights assigned to the loss terms are reported for each PiNN model.Figure 7 shows a comparison of PiNN models using sin and tanh activation functions in predicting the pressure field in heterogeneous porous media represented by Case 4A, 4B, and 4C permeability fields (Fig. 10).The absolute point error for each case, compared to the ground truth obtained using FEM with 100×100 quadrilateral elements, shows that the PiNN with the sin activation function provides a more accurate prediction for the pressure field.Table 3 reports the comparison in terms of MSE (Eq.24).Based on the MSE reported, the PiNN model with sin activation function is an order of magnitude more accurate in predicting the pressure field.As can be seen in Fig. 7, the PiNN model with the tanh activation function encounters more difficulty in converging to the ground truth solution as the variation of permeability in the domain increases (i.e., as the length-scale of structural features decreases).In contrast, this issue is less pronounced for the PiNN model with sin activation function.Figure 8 shows a comparison of PiNN models using sin and tanh activation functions in predicting the concentration field in heterogeneous porous media represented by Case 4A, 4B, and 4C scaled permeability fields (Fig. 6).The PiNNs' predictions are validated against the FEM solution with a 100 × 100 quadrilateral element grid.The comparison is illustrated for the concentration field at t = 1.00 s considering D x = D y = 0.02 m 2 /s, α = 0, µϕ = 0.001, and C 0 = 0.2 kg/m 3 .The absolute point error is also shown for each PiNN model to illustrate the mismatch between the prediction and ground truth solution.Table 3 reports the MSE calculated using Eq. 24, and the training time.It is clearly observed that the mismatch for both PiNN models increases as the length-scale of the structural features in the permeability field decreases (from left to right in Fig. 8), which is attributed to the rapid variation of the velocity field caused by heterogeneities.As shown, the PiNN model with sin activation function encounters less difficulty in converging to the ground truth solution, and this is more noticeable as the variation of permeability in the domain increases.In contrast, the mismatch is smaller for the PiNN model with sin activation function, making it two orders of magnitude more accurate and nearly 2x faster to train compared to the PiNN model using tanh activation function.This order of magnitude for accuracy is consistent for all time steps.Figure 9 shows an example comparison between the PiNN's predictions (with sin activation function) and ground truth solutions for transient solute transport in a heterogeneous porous medium with a permeability field described by Case 4B.The MSE is calculated using Eq.24 for the concentration fields obtained at each given time.These comparative findings clearly show that a PiNN with a periodic activation function can solve for the pressure and solute concentration fields in 2D heterogeneous media with a higher degree of accuracy compared to tanh activation function.The results also show that a PiNN with the periodic activation function reduces the training time.4 compares the inference time for PiNN with the sin activation function and FEM resolving solute transport in 2D homogeneous (case 3) and heterogeneous (Case 4) porous media.The speed-up factor achieved is around three orders of magnitude and can considerably escalate when a higher-resolution mesh (i.e., a higher number of elements) is used by the FEM solver; a situation that is often faced in cases with higher levels of heterogeneities.In fact, in such cases, the number of collocation points used by PiNN also increases, which could negatively impact the cost of training but not the inference time.Therefore, the computational efficiency of PiNN is certainly superior to that of a traditional solver (e.g., FEM) in domains with complex geometry and heterogeneity.It is also notable that PiNN comes with several drawbacks.The two most common drawbacks that were also faced in this study are: (i) PiNN's training can face gradient vanishing problems, which can prohibitively slow down the learning process; and (ii) PiNN did not always converge due to competing nonlinear loss terms (e.g., PDE loss terms related to the pressure and concentration fields in addition to the loss terms related to initial/boundary conditions for both of the fields).It requires trials and errors or other adaptive techniques to adjust the loss terms' weight functions to mitigate the instability.The lack of a theoretical condition or constraint (e.g., Courant number in traditional computational fluid dynamics [72]) to ensure convergence is an open research area for investigation.Overall, considering the effectiveness of PiNN in combining scientific computing and DL and accelerating computation, the addition of PiNNs to the simulations of porous media flow should be further investigated.For example, a future study could be the assessment of PiNNs to model non-isothermal reactive flows with anisotropic dispersion coefficient in highly heterogeneous and deformable porous media.

Conclusions
In this study, we demonstrated the use of physics-informed neural networks (PiNNs) and the advantages of employing a periodic activation function in addressing the solute transport problem in both homogeneous and heterogeneous porous media, as governed by the advection-dispersion equation.We constructed PiNNs using the sin and tanh activation functions to predict pressure and concentration fields within a given spatiotemporal domain.To evaluate the capabilities of PiNN models, we conducted seven case studies (1D and 2D) with varying degrees of permeability heterogeneity.We utilized an iterative random-search hyperparameter tuning method to determine the optimal architecture for each PiNN model in the test cases.The accuracy of the PiNNs' predictions was evaluated using absolute point error and mean square error metrics, and compared to the ground truth solutions obtained analytically or through FEM numerical methods.Our results showed that the PiNN with sin activation function was capable of accurately predicting the behavior of the solute transport under a variety of conditions.The PiNN model employing the sin activation function was found to be up to two orders of magnitude more accurate and up to two times faster to train than the commonly used tanh activation function when applied to 2D homogeneous and heterogeneous porous media exhibiting structural features with distinct length scales relative to the domain size.Furthermore, the inference speedup results showed that the PiNN's simultaneous predictions of pressure and concentration fields can reduce computational time by three orders of magnitude compared to FEM simulations.

Figure 1 :
Figure1: A schematic architecture of Physics-informed Neural Networks (PiNNs).The network digests spatiotemporal coordinates, (x ,t), as inputs to predict a solution set, s, as an approximate to the ground truth solution, ŝ.The automatic differentiation (AD) is then used to generate the derivatives of the predicted solution s with respect to inputs.These derivatives are used to formulate the residuals of the governing equations in the loss function weighted by different coefficients.θ and λ are the learnable parameters for weights/biases and unknown PDE parameters, respectively, that can be learned simultaneously while minimizing the loss function.

Figure 2 :
Figure 2: One-dimensional solute transport with a constant velocity field, ux = 0.5 m/s.The upper panel shows the concentration field predicted by PiNN with sin activation function within the spatiotemporal domain (0 ≤ x ≤ 1, 0 ≤ t ≤ 10) at an injection rate of C 0 = 1kg/m 3 .The lower panels show the comparison of PiNNs' prediction with the ground truth obtained analytically using Eq.(29) at three different times.The comparison demonstrates a good agreement yielding M SE = 1.15 × 10 −6 using sin activation function and M SE = 1.21 × 10 −6 using tanh activation function for the entire spatiotemporal domain.

Figure 3 :
Figure 3: A comparison between the predictions of PiNNs with sin and tanh activation functions and ground truth for solute transport in a 2D domain representing an isotropic porous medium considering Dx = Dy = 0.02 m 2 /s, u = (ux, uy) = (0.0, 0.0) m/s, and a solute injection rate of C 0 = 0.2 kg/m 3 between y 1 = 0.3 m and y 2 = 0.7 m.The upper panels show the domain with boundary conditions and distributions of randomly selected points on which different terms in the loss function are evaluated.The red dots represent sample collocation points inside the domain corresponding to the loss term associated with the 2D solute transport PDE, and the blue, green, and black dots represent the points on the boundary of the domain corresponding to the loss terms associated with the boundary conditions.The lower panel shows the comparison between the PiNNs' predictions for the concentration field at t = 1.00 s and the ground truth obtained analytically using Eq.(33).The absolute point error shows the mismatch between the solutions, and the total loss plot depicts the difference in the convergence rate of PiNNs.

Figure 4 :
Figure 4: A comparison between the prediction of PiNN with sin and tanh activation functions and ground truth obtained analytically for the concentration field considering Dx = Dy = 0.02 m 2 /s, u = (ux, uy) = (0.5, 0.0) m/s, and an injection rate of C 0 = 0.2 kg/m 3 between y 1 = 0.3 m and y 2 = 0.7 m.The upper panels show the predictions of the PiNN with sin activation function at t = 0.25 s, 0.50 s, 0.75 s,and 1.00 s.At t = 1.00 s, the PiNNs' predictions are compared with the analytical solution obtained using Eq.(33).The absolute point error shows the mismatch between the solutions yielding the MSEs reported in Table1.

Figure 5 :
Figure 5: A comparison between the prediction of the PiNN with sin activation function and the ground truth for the pressure field in a 2D homogeneous porous medium considering Dx = Dy = 0.02 m 2 /s and C 0 = 0.2 kg/m 3 .The upper panels show the domain with pressure and concentration boundary conditions, as well as the comparison of total loss vs. iteration for PiNNs using sin and tanh activation functions.The lower panels show the comparison between the PiNN's predictions and ground truth solutions obtained using FEM with 100 × 100 quadrilateral elements for the pressure field, the velocity field in the x direction, and the total velocity field under steady state conditions as well as the concentration field at t = 1.00 s.The flow direction of the pore fluid is also represented by the streamlines.The absolute point error is also shown for each field to illustrate the mismatch between the solutions.

Figure 6 :
Figure 6: Scaled permeability fields selected to possess structural features with different length-scales compared to the domain size.From left to right, i.e., Case 4A, 4B, and 4C, the length-scale of the structural features decreases, leading to more intricate problems to simulate using PiNNs owing to the existence of high-frequency features.

Figure 7 :
Figure 7: A comparison of PiNN models using sin and tanh activation functions in predicting the pressure field in heterogeneous porous media, validated against the FEM solution with a 100 × 100 quadrilateral element grid.The comparison is illustrated for the pressure field considering Dx = Dy = 0.02 m 2 /s, α = 0, µϕ = 0.001, and C 0 = 0.2 kg/m 3 .The absolute point error is also shown for each PiNN model to illustrate the mismatch between the prediction and ground truth solution.The PiNN model with the tanh activation function encounters more difficulty in converging to the ground truth solution as the variation of permeability in the domain increases (i.e., as the length-scale of structural features decreases).In contrast, this issue is less pronounced for the PiNN model with sin activation function.

Figure 8 :
Figure8: A comparison of PiNN models using sin and tanh activation functions to predict the concentration field in heterogeneous porous media, validated against the FEM solution with a 100 × 100 quadrilateral element grid.The comparison is illustrated for the concentration field at t = 1.00 s considering Dx = Dy = 0.02 m 2 /s, α = 0, µϕ = 0.001, and C 0 = 0.2 kg/m 3 .The absolute point error is also shown for each PiNN model to illustrate the mismatch between the predictions and ground truth solutions.The PiNN model with tanh activation function encounters more difficulty in converging to the ground truth solution as the variation of permeability in the domain increases (from left to right).This disparity is attributed to the rapid variation of the velocity field caused by heterogeneities (i.e., as the length-scale of structural features).In contrast, the mismatch is less pronounced for the PiNN model with sin activation function, making it two orders of magnitude more accurate than the PiNN model with tanh activation function.

Figure 9 :
Figure9: A comparison between the PiNN's predictions (with sin activation function) and ground truth solutions for transient solute transport in a heterogeneous porous medium with the given permeability field, Case 4B.The MSE is calculated using Eq.24 for the concentration fields obtained at each given time.