A Benchmarking Study for Algorithm Selection in Scientific Machine Learning (SciML): PINN vs. gPINN for Solving Partial Differential Equations

Azam, Muhammad; Chuhan, Imran Shabir; Ahmed, Muhammad Shafiq; Arshid, Kaleem

doi:10.3390/appliedmath6020026


by Muhammad Azam ¹, Imran Shabir Chuhan ², Muhammad Shafiq Ahmed ³ and Kaleem Arshid ⁴,*

¹ Key Laboratory of Urban Security and Disaster Engineering of Ministry of Education, Beijing University of Technology, Beijing 100124, China

² School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, China

³ Department of Mathematics and Physics, American University of Ras al Khaimah, Ras al Khaimah 72603, United Arab Emirates

⁴ Department of Naval, Electrical, Electronic and Telecommunications Engineering, University of Genoa, 16145 Genoa, Italy

* Author to whom correspondence should be addressed.

AppliedMath 2026, 6(2), 26; https://doi.org/10.3390/appliedmath6020026

Submission received: 15 December 2025 / Revised: 23 January 2026 / Accepted: 2 February 2026 / Published: 9 February 2026

(This article belongs to the Special Issue Applied Mathematical Modeling and Machine Learning for Geomechanics and Superconducting Materials)


Abstract

Recent advances in physics-informed neural networks (PINNs) have highlighted the need for systematic criteria for selecting appropriate algorithms to solve differential equations. This paper presents a numerical comparison between standard PINNs and gradient-enhanced PINNs (gPINNs) applied to a high-order partial differential equation (PDE). To verify the accuracy and convergence behavior of both methods, we solve a fourth-order PDE whose analytical solution is known. We synthesize our findings into a practical selection guide: gPINN is recommended for problems requiring high accuracy in gradient fields or operating with sparse data, whereas standard PINN is advised for strongly nonlinear or computationally constrained scenarios. This framework provides a clear, evidence-based policy for algorithm choice in SciML. Beyond numerical comparison, we provide an analytical interpretation linking solver performance to the spectral and stiffness properties of each PDE class, offering a principled basis for algorithm selection.

Keywords:

physics-informed neural networks (PINNs); high-order PDE; gradient-enhanced PINNs (gPINNs)

1. Introduction

Partial differential equations (PDEs) play a vital role in scientific computing by providing mathematical frameworks for modeling complex physical phenomena in fields including fluid dynamics, quantum mechanics, and financial engineering. Classical numerical approaches such as finite element and finite difference methods [1,2,3,4] have been highly effective, yet they struggle with mesh generation, expensive high-dimensional computation, and ill-posed inverse problems. The recent convergence of deep learning and scientific computing has brought a new paradigm, offering a promising path toward mesh-free, scalable, and data-driven solvers.

At the forefront of this trend are PINNs, a framework proposed by Raissi et al. [5]. PINNs incorporate the governing physical laws, written in the form of PDEs, directly into the loss function of a neural network. Using automatic differentiation to calculate derivatives, PINNs are trained to satisfy the dynamics, initial conditions, and boundary conditions of the system simultaneously [5,6,7]. This method has proven exceptionally flexible and has been applied to a wide variety of problems such as integro-differential equations [8], fractional PDEs, and stochastic systems [9,10,11].

Despite these promising developments, PINNs have a number of limitations that can restrict their use in practice. Common challenges include solving problems with high-order derivatives or sharp gradients with sufficient accuracy, particularly when training data is sparse [12], a phenomenon linked to the spectral bias of neural networks and imbalanced loss gradient dynamics [13]. To this end, much research has gone into improving PINN performance through better training processes. These include residual-based adaptive refinement (RAR) to optimize point sampling, new loss balancing methods to deal with training instability, and purpose-built architectures to impose hard constraints [14,15,16,17]. While these strategies improve implementation, they often fail to enrich the physical information embedded in the learning process.

A key observation motivates a physics-based enhancement: if the PDE residual is zero throughout the domain, its gradient must also vanish. This leads naturally to gradient-enhanced constraints in the learning formulation. Gradient information has been exploited elsewhere, for example in Gaussian processes and adversarial robustness, but had not been fully applied to PDE solvers. Building on this idea, the gradient-enhanced PINN (gPINN) method was introduced, embedding the gradient of the PDE residual into the loss function as additional soft constraints [18,19]. This compels the network to agree with the physics not only at the collocation points but also in their neighborhoods, aligning the model predictions more closely with the underlying physical laws and making the model sensitive to spatial context [20]. The approach is particularly useful for achieving high accuracy in sparse-data settings and in challenging problems such as equations with variable coefficients [21]. The method has been applied to forward and inverse problems and can be combined with adaptive sampling, such as residual-based adaptive refinement [19].

However, the question of when gPINN is preferable to the standard PINN remains open. Is the additional cost of computing higher-order derivatives justified? Although gPINN imposes additional physical constraints, it can also produce a more complex loss landscape, which makes training difficult on certain classes of problems. A systematic benchmarking study is therefore needed to guide practitioners toward an algorithm choice that follows from the mathematical properties of the target PDE.

This study addresses a gap in the current literature: while PINN and gPINN have been individually developed and tested, a systematic benchmarking study that provides clear, prescriptive guidelines for algorithm selection based on PDE mathematical properties is lacking. Our work fills this gap by evaluating both methods across a carefully chosen set of PDEs including high-order, nonlinear, and interface problems designed to represent distinct computational challenges. The result is a practical, evidence-based framework that advises researchers on when to use PINN or gPINN, moving beyond mere performance comparison to actionable decision making in SciML.

2. Methodology (General Framework)

2.1. Standard PINN Architecture

PINN is a method built on a deep neural network that solves the governing equations by imposing the physical laws as constraints during training. Its main benefit is that it combines the expressive power of deep learning with the laws of physics, so that the model is informed by both data and physics. In this way, PINN offers a way to solve complicated differential equations accurately and efficiently.

Take the standard expression of nonlinear partial differential equations:

F(u(x, t)) = f(x, t), x \in Ω, t \in [0, T]

(1)

B(u(x, t)) = h(x, t), x \in \partial Ω, t \in [0, T]

(2)

I[u(x, 0)] = g(x), x \in Ω.

(3)

Here, Equation (1) is the governing differential equation, Equation (2) gives the boundary conditions, and Equation (3) gives the initial conditions. The solution is approximated by a neural network u(x, t, θ) that is trained to satisfy these physics-based constraints.

As shown in Figure 1, the standard PINN loss function has four components: the training data loss, the PDE residual loss, the boundary condition loss, and the initial condition loss. The total loss is:

L_{PINN}(θ) = L_{data}(θ) + L_{F}(θ) + L_{B}(θ) + L_{I}(θ)

(4)

L_{data}(θ) = \frac{1}{N_{data}} \sum_{i=1}^{N_{data}} (u(x_{i}^{data}, t_{i}^{data}, θ) - u_{i})^{2}

(5)

L_{F}(θ) = \frac{1}{N_{F}} \sum_{i=1}^{N_{F}} (F(u(x_{i}^{F}, t_{i}^{F}, θ)) - f(x_{i}^{F}, t_{i}^{F}))^{2}

(6)

L_{B}(θ) = \frac{1}{N_{B}} \sum_{i=1}^{N_{B}} (B(u(x_{i}^{B}, t_{i}^{B}, θ)) - h(x_{i}^{B}, t_{i}^{B}))^{2}

(7)

L_{I}(θ) = \frac{1}{N_{I}} \sum_{i=1}^{N_{I}} (I(u(x_{i}^{I}, 0, θ)) - g(x_{i}^{I}))^{2}.

(8)

The total loss L_{PINN} combines four components with equal weighting: the data loss L_{data}, the PDE residual loss L_{F}, the boundary condition loss L_{B}, and the initial condition loss L_{I}.
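As a minimal sketch of how the four loss components combine with equal weighting, the snippet below computes the total loss from lists of pointwise mismatches. The helper names `mean_square` and `pinn_loss` are illustrative assumptions, not the authors' implementation; in an actual PINN the mismatches would come from the network output and automatic differentiation.

```python
def mean_square(values):
    """Mean of squared entries, the common form of Eqs. (5)-(8)."""
    values = list(values)
    return sum(v * v for v in values) / len(values)


def pinn_loss(data_mismatch, pde_residual, bc_mismatch, ic_mismatch):
    """Equal-weight total loss of Eq. (4).

    Each argument holds pointwise mismatches evaluated elsewhere:
    u - u_i (data), F(u) - f (PDE), B(u) - h (boundary), I(u) - g (initial).
    """
    return (mean_square(data_mismatch) + mean_square(pde_residual)
            + mean_square(bc_mismatch) + mean_square(ic_mismatch))


# Illustrative values only (not results from the study).
total = pinn_loss([0.5], [1.0, -1.0], [2.0], [0.0])
print(total)  # 0.25 + 1.0 + 4.0 + 0.0
```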

2.2. gPINN Architecture

gPINN extends the standard PINN by adding gradient information of the PDE residual to the loss. Consider the general nonlinear PDE given in Equations (1)–(3). The standard PINN algorithm uses the PDE residual r(x, t) to build the PDE loss, i.e.,

r(x, t) := F[u(x, t)] - f(x, t).

(9)

If the exact solution of the PDE is smooth enough, then r(x, t) = 0 throughout the domain, and the gradient of r(x, t) is also zero. The gPINN algorithm, which builds on the PINN algorithm, therefore introduces the gradient of the PDE residual as an extra constraint in the loss function, which improves the solution accuracy, particularly when the number of training points is small.

The schematic of the gPINN framework is shown in Figure 2. Its total loss function, L_{gPINN}, augments the standard PINN loss with gradient penalty terms:

L_{gPINN} = L_{PINN} + λ_{x} L_{x}^{grad} + λ_{y} L_{y}^{grad},

(10)

where L_{x}^{grad} and L_{y}^{grad} penalize the gradients of the PDE residual with respect to the two input coordinates (the second coordinate is written t in the general form (1)–(3) and y in the two-dimensional benchmark of Section 2.3):

L_{x}^{grad} = \frac{1}{N_{X}} \sum_{i=1}^{N_{X}} (\frac{\partial r(x_{i}, t_{i}, θ)}{\partial x})^{2}

(11)

L_{y}^{grad} = \frac{1}{N_{T}} \sum_{i=1}^{N_{T}} (\frac{\partial r(x_{i}, t_{i}, θ)}{\partial t})^{2}

(12)

The gradient-loss weights (λ_{x}, λ_{y}) were tuned to prevent overfitting and balance the loss landscape. For all experiments, we used λ_{x} = λ_{y} = 10^{-4}, a value found to provide regularization without dominating the primary physics loss. This value was empirically validated through a sensitivity analysis (see the sensitivity analysis of gradient-loss weights at the end of Section 2.3) and found to offer robust performance across a range of PDE types.
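The gradient penalties can be illustrated with a small sketch. It assumes a residual function of two coordinates and uses central finite differences as a stand-in for automatic differentiation; the function names and the toy residual are hypothetical and chosen only so the gradient is known in advance.

```python
def central_diff(fn, x, y, axis, h=1e-5):
    """Central finite difference of fn(x, y) along axis 0 (x) or 1 (y)."""
    if axis == 0:
        return (fn(x + h, y) - fn(x - h, y)) / (2.0 * h)
    return (fn(x, y + h) - fn(x, y - h)) / (2.0 * h)


def gradient_penalties(residual, points, h=1e-5):
    """Mean squared residual gradients, in the spirit of Eqs. (11)-(12)."""
    n = len(points)
    l_x = sum(central_diff(residual, x, y, 0, h) ** 2 for x, y in points) / n
    l_y = sum(central_diff(residual, x, y, 1, h) ** 2 for x, y in points) / n
    return l_x, l_y


def gpinn_loss(l_pinn, l_x, l_y, lam_x=1e-4, lam_y=1e-4):
    """Total loss of Eq. (10) with the weights quoted in the text."""
    return l_pinn + lam_x * l_x + lam_y * l_y


# Toy residual with known gradient (3, 2); purely illustrative.
def toy_residual(x, y):
    return 3.0 * x + 2.0 * y


pts = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.3)]
l_x, l_y = gradient_penalties(toy_residual, pts)
total = gpinn_loss(1.0, l_x, l_y)
```

With weights of 10^{-4}, the gradient terms shift the total only slightly unless the residual gradients are large, which is the sense in which they act as a mild regularizer.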

2.3. Solving Process

We consider the following equation:

\frac{\partial^{2} u}{\partial x^{2}} - \frac{\partial^{4} u}{\partial y^{4}} = (2 - x^{2}) e^{- y} .

(13)

The following boundary conditions are considered:

\begin{cases} u_{yy}(x, 0) = x^{2}, \\ u_{yy}(x, 1) = \frac{x^{2}}{e}, \\ u(x, 0) = x^{2}, \\ u(x, 1) = \frac{x^{2}}{e}, \\ u(0, y) = 0, \\ u(1, y) = e^{-y}. \end{cases}

(14)

The exact solution of Equation (13) is u(x, y) = x^{2} e^{-y}. Collocation points and data points are sampled randomly over the region [0, 1] \times [0, 1]. The collocation points are used to construct the residual loss, and the data points are used to construct the data loss. Unlike the standard PINN, the gPINN loss function is augmented by the additional terms L_{x}^{exa} and L_{y}^{exa}, built from spatial derivatives of the residual, which are constructed as follows.

L_{x}^{exa} = \frac{1}{N^{exa}} \sum_{i=1}^{N^{exa}} (\frac{\partial^{2} u}{\partial x^{2}} - (2 x^{2} e^{-y}))^{2},

(15)

L_{y}^{exa} = \frac{1}{N^{exa}} \sum_{i=1}^{N^{exa}} (\frac{\partial^{2} u}{\partial y^{2}} - (2 x^{2} e^{-y}))^{2}.

(16)

where N^{exa} denotes the number of points sampled randomly in the domain. The loss function for the gPINN is given by:

L_{gPINN}^{exa} = L_{PINN}^{exa} + λ_{x}^{exa} L_{x}^{exa} + λ_{y}^{exa} L_{y}^{exa},

(17)

where L_{PINN}^{exa} is the standard PINN loss for this problem, and λ_{x}^{exa} and λ_{y}^{exa} are the weights of the residual-gradient terms. Both weights are set to 0.0001 to prevent overfitting of the residual error terms.
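For reference, the residual of Equation (13) and its gradient can be written out explicitly. This is a worked sketch that assumes L_{x}^{exa} and L_{y}^{exa} are meant as instances of the gradient penalties (11)–(12) applied to this problem; the symbol r for the residual follows Equation (9).

```latex
\begin{aligned}
r(x, y) &= \frac{\partial^{2} u}{\partial x^{2}} - \frac{\partial^{4} u}{\partial y^{4}} - (2 - x^{2})\, e^{-y}, \\
\frac{\partial r}{\partial x} &= \frac{\partial^{3} u}{\partial x^{3}} - \frac{\partial^{5} u}{\partial x\, \partial y^{4}} + 2x\, e^{-y}, \\
\frac{\partial r}{\partial y} &= \frac{\partial^{3} u}{\partial x^{2}\, \partial y} - \frac{\partial^{5} u}{\partial y^{5}} + (2 - x^{2})\, e^{-y}.
\end{aligned}
```

For the exact solution u = x^{2} e^{-y}, one has u_{xx} = 2 e^{-y} and u_{yyyy} = x^{2} e^{-y}, so r = 0, and both gradient components vanish as well. Under this reading, the gradient terms involve fifth-order derivatives of the network output, which is where the extra differentiation cost of gPINN arises for this problem.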

Sensitivity Analysis of Gradient-Loss Weights

The performance of gPINN is sensitive to the choice of gradient-loss weights λ_{x} and λ_{y}. To evaluate this sensitivity, we tested four values of λ = λ_{x} = λ_{y} on the fourth-order benchmark PDE: λ \in \{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}\}. Table 1 summarizes the resulting errors and convergence behavior. We found that λ = 10^{-4} consistently provided the lowest error and fastest convergence, whereas λ \geq 10^{-3} led to increased errors and occasional training instability. This confirms that moderate gradient regularization is beneficial, but excessive weight can dominate the loss and hinder convergence. These findings support the fixed choice of λ = 10^{-4} used throughout our experiments.
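A sweep of this kind can be organized as a small loop. The sketch below is an assumption about structure only: `train_and_evaluate` is a hypothetical user-supplied callable that trains a gPINN for a given weight and returns an error, and the placeholder values exist solely to exercise the loop. They are not results from the study.

```python
def sweep_gradient_weight(train_and_evaluate, weights=(1e-5, 1e-4, 1e-3, 1e-2)):
    """Run one training per weight and return (best_weight, errors_by_weight).

    train_and_evaluate is a user-supplied callable (hypothetical here) that
    trains a gPINN with lambda_x = lambda_y = weight and returns an error.
    """
    errors = {w: train_and_evaluate(w) for w in weights}
    best = min(errors, key=errors.get)
    return best, errors


# Placeholder standing in for a real training run; its values are made up
# solely to exercise the loop and are not results from the study.
def dummy_run(weight):
    return {1e-5: 0.3, 1e-4: 0.1, 1e-3: 0.2, 1e-2: 0.4}[weight]


best, errors = sweep_gradient_weight(dummy_run)
```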

2.4. Implementation

Training of both PINN and gPINN models was performed using Python 3.11.7 and PyTorch 2.2.2. The hyperbolic tangent (Tanh) activation function was used throughout the network. The Adam optimizer was employed with a learning rate of 0.001 (the PyTorch default), and each model was trained for 20,000 epochs. We fixed the numbers of interior, boundary, and training data points at 1000, 100, and 1000, respectively. Both PINN and gPINN produced valid approximations of the exact solution; however, gPINN converged faster and exhibited lower spatial error localization. In general, gPINN demonstrated superior robustness, efficiency, and accuracy for the tested equation compared to PINN.

For clarity and to reflect meaningful precision, all error metrics reported in the following tables are rounded to four decimal places. Each experiment was conducted with a fixed random seed to ensure a controlled and reproducible comparison between PINN and gPINN, which is a standard practice for benchmarking studies focused on relative algorithmic performance [13,20].
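A seeded sampling step for the unit-square benchmark might look like the sketch below. It assumes uniform random sampling and uses the point counts quoted above (1000 interior, 100 boundary); the seed value and the helper names are arbitrary choices for illustration, since the article does not state them.

```python
import random


def sample_interior(n, rng):
    """Uniform random points in the unit square [0, 1) x [0, 1)."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def sample_boundary(n, rng):
    """Uniform random points on the four edges of the unit square."""
    points = []
    for _ in range(n):
        s = rng.random()
        edge = rng.randrange(4)
        points.append([(s, 0.0), (s, 1.0), (0.0, s), (1.0, s)][edge])
    return points


rng = random.Random(0)  # fixed seed; the value 0 is an arbitrary choice
interior = sample_interior(1000, rng)
boundary = sample_boundary(100, rng)
```

Re-creating the generator with the same seed reproduces the same point sets, which is what makes a side-by-side comparison of two solvers controlled.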

All figures presented in this study were generated using high-resolution plotting settings. Surface and contour plots include explicit axis labels and color bars to ensure clear quantitative interpretation.

The error measures for this problem are summarized in Table 2, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.
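The two error measures can be computed as in the sketch below. The article does not spell out the formulas, so the usual definitions are assumed here: the largest pointwise deviation, and the Euclidean norm of the error divided by the Euclidean norm of the exact values. The sample numbers are illustrative only.

```python
import math


def max_abs_error(pred, exact):
    """Largest pointwise deviation between prediction and exact values."""
    return max(abs(p - e) for p, e in zip(pred, exact))


def relative_l2_error(pred, exact):
    """||pred - exact||_2 / ||exact||_2 over the evaluation points."""
    num = math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, exact)))
    den = math.sqrt(sum(e ** 2 for e in exact))
    return num / den


# Illustrative values, not data from the study.
exact = [1.0, 2.0, 4.0]
pred = [1.0, 2.0, 3.0]
e_max = max_abs_error(pred, exact)
e_rel = relative_l2_error(pred, exact)
```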

The predictive accuracy and error distribution are shown in Figure 3, which compares the exact and predicted solutions of the fourth-order benchmark problem side by side. The results are shown as three-dimensional surface plots and two-dimensional plots over the spatial domain, with x and y denoting the horizontal and vertical spatial coordinates, respectively. The three-dimensional plots show the exact solution and the corresponding PINN and gPINN predictions, while the two-dimensional plots show the associated contour plots and absolute error distributions for PINN and gPINN, together with their error difference, highlighting the spatial regions where the gradient-enhanced formulation improves predictive accuracy. Figure 4 shows the corresponding training dynamics, which indicate that gPINN converges faster.

Computational Efficiency Analysis

A practical comparison requires evaluating the computational cost. The gradient constraints of gPINN require higher-order automatic differentiation, which increases per-epoch time and memory. We conducted all experiments on a system with an NVIDIA RTX 4090 GPU (24 GB VRAM), Intel i9-13900K CPU, and 64 GB RAM using PyTorch 2.2.2. Identical architectures and training points were used for fair comparison. The computational efficiency of PINN and gPINN is quantitatively compared in Table 3, which reports the per-epoch training time, GPU memory consumption, and total training time for different benchmark problems.

3. Comparison of Standard PINN and gPINN Methods for the Solution of PDEs

We apply both methods to the diffusion equation, the Allen–Cahn equation, the Helmholtz interface problem, and the Burgers', Huxley, Fisher, Burgers–Huxley, and Burgers–Fisher equations.

3.1. Example for Diffusion Equation

Take the following one-dimensional diffusion equation as an example.

\frac{\partial u}{\partial t} = \frac{\partial^{2} u}{\partial x^{2}} + e^{- t} (- \sin (π x) + π^{2} \sin (π x)), x \in [- 1, 1], t \in [0, 1] .

(18)

Here, the solution u(x, t) is the concentration of a diffusing substance and can be interpreted as the probability density of Brownian particles at position x and time t. Take the following initial and boundary conditions:

\begin{cases} u(x, 0) = \sin(π x), \\ u(-1, t) = u(1, t) = 0. \end{cases}

(19)

The exact solution of this diffusion equation is u(x, t) = \sin(π x) e^{-t}. The maximum absolute error and the relative L_{2} error of the numerical solution obtained by gPINN are 0.005820 and 0.005814, respectively. The results are presented in the following figures.
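As a quick consistency check, the sketch below confirms numerically that sin(πx) e^{-t} satisfies Equation (18) and the conditions (19). It uses central finite differences rather than automatic differentiation, and the step size and grid are arbitrary choices made for illustration.

```python
import math


def u_exact(x, t):
    return math.sin(math.pi * x) * math.exp(-t)


def source(x, t):
    """Forcing term on the right-hand side of Eq. (18)."""
    s = math.sin(math.pi * x)
    return math.exp(-t) * (-s + math.pi ** 2 * s)


def residual(x, t, h=1e-3):
    """u_t - u_xx - source, via central finite differences."""
    u_t = (u_exact(x, t + h) - u_exact(x, t - h)) / (2 * h)
    u_xx = (u_exact(x + h, t) - 2 * u_exact(x, t) + u_exact(x - h, t)) / h ** 2
    return u_t - u_xx - source(x, t)


# Coarse grid over x in [-1, 1] and t in (0, 1).
grid = [(-1 + 0.1 * i, 0.05 + 0.1 * j) for i in range(21) for j in range(10)]
max_residual = max(abs(residual(x, t)) for x, t in grid)
```

The remaining residual is finite-difference truncation error, which shrinks as the step h is reduced.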

Figure 5 shows the predicted solution together with the absolute error distribution, and Figure 6 shows the training loss convergence.

The error measures for this problem are summarized in Table 4, which compares the maximum absolute error and relative L_{2} error of standard PINN and gPINN.

3.2. Example for Allen–Cahn Equation

The Allen–Cahn equation given below is used as an example:

\frac{\partial u}{\partial t} = D \frac{\partial^{2} u}{\partial x^{2}} + 5 (u - u^{3}), x \in [- 1, 1], t \in [0, 1] .

(20)

Let D = 0.001, and consider the following initial and boundary conditions:

u (x, 0) = - \sin (π x), u (- 1, t) = u (1, t) = 0 .

(21)

The results show that gPINN performs marginally better than the standard PINN, with a lower maximum absolute error and relative L_{2} error (0.715209 vs. 0.714414 and 0.927016 vs. 0.920822 for PINN vs. gPINN, respectively). Both differences are below 1%, and the errors of both methods remain large for this problem. The graphical comparisons further highlight the improved approximation of the reference solution by gPINN, especially in the error distribution plots. Overall, gPINN gives a small reduction in error relative to PINN for the Allen–Cahn equation.

The error measures for this problem are summarized in Table 5, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

Figure 7 shows the predicted solution together with the absolute error distribution, and Figure 8 shows the training loss convergence.

3.3. Example for Helmholtz Interface Problem

The two-dimensional Helmholtz interface problem with Dirichlet boundary conditions is described as follows:

-\nabla^{2} u - k^{2} u = f in Ω, u = g on \partial Ω,

(22)

with jump conditions at the interface Γ:

[u]_{Γ} = 0, [β \nabla u \cdot n]_{Γ} = 0,

(23)

where n is the unit normal to Γ.

The exact solution is:

u_{e x a c t} (x, y) = \sin (π x) \sin (π y),

(24)

the source term becomes:

f = (2 π^{2} - k^{2}) \sin (π x) \sin (π y) .

(25)
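The source term in Equation (25) follows by substituting the exact solution (24) into Equation (22); the short derivation below makes the step explicit.

```latex
\begin{aligned}
\nabla^{2} u_{exact} &= -\pi^{2} \sin(\pi x)\sin(\pi y) - \pi^{2} \sin(\pi x)\sin(\pi y) = -2\pi^{2}\, u_{exact}, \\
f = -\nabla^{2} u_{exact} - k^{2} u_{exact} &= (2\pi^{2} - k^{2}) \sin(\pi x)\sin(\pi y).
\end{aligned}
```

For k = \pi this reduces to f = \pi^{2} \sin(\pi x)\sin(\pi y), and since u_{exact} vanishes on the edges of the unit square, the Dirichlet data g is zero there.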

The comparison shows that the standard PINN performs better on the Helmholtz interface problem, reaching a relative L_{2} error of 6.63 \times 10^{-3}, while gPINN remains competitive with a relative L_{2} error of 1.13 \times 10^{-2}. The maximum errors are of the same order for both methods: 3.73 \times 10^{-2} for PINN and 4.22 \times 10^{-2} for gPINN. Both approaches solved the equation -\nabla^{2} u - π^{2} u = π^{2} \sin(π x) \sin(π y) with homogeneous boundary conditions. PINN is therefore slightly more accurate for this particular problem, while gPINN remains a useful gradient-constrained alternative for specialized problems.

The error measures for this problem are summarized in Table 6, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

Figure 9 shows the predicted solution together with the absolute error distribution, and Figure 10 shows the training loss convergence.

3.4. Example for Burgers' Equation

We consider Burgers' equation, a nonlinear PDE of the form:

u_{t} + u u_{x} = u_{x x},

(26)

with the initial condition

u(x, 0) = \frac{1}{2} - \frac{1}{2} \tanh(\frac{x}{4}).

(27)

The exact solution of this equation is

u(x, t) = \frac{1}{2} - \frac{1}{2} \tanh(\frac{1}{4} (x - \frac{1}{2} t)),

(28)

where u(x, t) represents the solution, and x and t are the spatial and temporal variables, respectively.

The error measures for this problem are summarized in Table 7, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

Figure 11 shows the predicted solution together with the absolute error distribution, using three-dimensional and two-dimensional plots over the spatial coordinate x and the temporal variable t. Figure 12 shows the training loss convergence.

3.5. Example for Huxley Equation

We consider the Huxley equation, a nonlinear PDE of the form:

u_{t} = u_{x x} + u (k - u) (u - 1), k \neq 0,

(29)

with the initial condition:

u (x, 0) = \frac{1}{2} + \frac{1}{2} \tanh (\frac{\sqrt{2} x}{4}),

(30)

The exact solution of the Huxley equation is:

u (x, t) = \frac{1}{2} + \frac{1}{2} \tanh (\frac{\sqrt{2} x}{4} - \frac{t}{4}),

(31)

where u(x, t) represents the solution, and x and t are the spatial and temporal variables.
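The text states only that k is nonzero. A numerical check suggests that the travelling-wave solution (31) satisfies Equation (29) when k = 1; the sketch below evaluates the PDE residual by central finite differences for k = 1 and, for contrast, k = 0.5. The value of k, the grid, and the function names are assumptions of this sketch.

```python
import math

K = 1.0  # assumed value; the text only states k != 0


def u_exact(x, t):
    return 0.5 + 0.5 * math.tanh(math.sqrt(2.0) * x / 4.0 - t / 4.0)


def residual(x, t, k=K, h=1e-3):
    """u_t - u_xx - u (k - u)(u - 1), via central finite differences."""
    u = u_exact(x, t)
    u_t = (u_exact(x, t + h) - u_exact(x, t - h)) / (2 * h)
    u_xx = (u_exact(x + h, t) - 2 * u + u_exact(x - h, t)) / h ** 2
    return u_t - u_xx - u * (k - u) * (u - 1.0)


# Coarse grid over x in [-5, 5] and t in [0, 1] (arbitrary choice).
grid = [(-5 + 0.5 * i, 0.1 * j) for i in range(21) for j in range(11)]
max_res_k1 = max(abs(residual(x, t)) for x, t in grid)
max_res_k_half = max(abs(residual(x, t, k=0.5)) for x, t in grid)
```

The residual is at the level of finite-difference error for k = 1 and clearly nonzero for k = 0.5, consistent with the closed-form solution corresponding to k = 1.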

Figure 13 shows the predicted solution together with the absolute error distribution, and Figure 14 shows the training loss convergence.

The error measures for this problem are summarized in Table 8, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

3.6. Example for Fisher Equation

We consider the Fisher equation, a nonlinear PDE of the form:

u_{t} = u_{x x} + u (1 - u) .

(32)

Initial condition:

u (x, 0) = \frac{1}{4} {(1 - \tanh (\frac{x}{2 \sqrt{6}}))}^{2} .

(33)

The exact solution of the Fisher equation is:

u(x, t) = \frac{1}{4} {(1 - \tanh(\frac{1}{2 \sqrt{6}} (x - \frac{5}{\sqrt{6}} t)))}^{2},

(34)

where u(x, t) represents the solution, and x and t are the spatial and temporal variables, respectively.

The error measures for this problem are summarized in Table 9, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

Figure 15 shows the predicted solution together with the absolute error distribution, using three-dimensional and two-dimensional plots in which x is the spatial coordinate and t is time. Figure 16 shows the training loss convergence.

3.7. Example for Burgers–Huxley Equation

We consider the Burgers–Huxley equation, a nonlinear PDE of the form:

u_{t} = u_{x x} + u u_{x} + u (k - u) (u - 1), k \neq 0,

(35)

with the initial condition

u(x, 0) = \frac{1}{2} - \frac{1}{2} \tanh(\frac{x}{4}),

(36)

The exact solution of this equation is,

u (x, t) = \frac{1}{2} - \frac{1}{2} \tanh (\frac{1}{4} (x + \frac{3}{2} t)),

(37)

where u(x, t) represents the solution, and x and t are the spatial and temporal variables, respectively.

Figure 17 shows the predicted solution together with the absolute error distribution, using three-dimensional and two-dimensional plots in which x is the spatial coordinate and t is time. Figure 18 shows the training loss convergence.

The error measures for this problem are summarized in Table 10, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

3.8. Example for Burgers–Fisher Equation

We consider the Burgers–Fisher equation, a nonlinear PDE of the form:

u_{t} = u_{xx} + u u_{x} + u (1 - u),

(38)

with the initial condition

u(x, 0) = \frac{1}{2} + \frac{1}{2} \tanh(\frac{x}{4}),

(39)

The exact solution of this equation is,

u(x, t) = \frac{1}{2} + \frac{1}{2} \tanh(\frac{1}{4} (x + \frac{5}{2} t)),

(40)

where u(x, t) represents the solution, and x and t are the spatial and temporal variables, respectively.
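As a sanity check on the travelling-wave solution, the sketch below evaluates the PDE residual by central finite differences for two candidate wave numbers in the tanh argument: 1/4, which matches the initial condition (39), and 1/(2√2). It assumes the reaction term is u(1 − u), the usual Burgers–Fisher form, and a wave speed of 5/2 as in Equation (40). The function names and grid are illustrative.

```python
import math


def wave(x, t, a, c=2.5):
    """Travelling-wave profile 1/2 + 1/2 tanh(a (x + c t))."""
    return 0.5 + 0.5 * math.tanh(a * (x + c * t))


def max_residual(a, h=1e-3):
    """Largest |u_t - u_xx - u u_x - u (1 - u)| on a coarse grid."""
    worst = 0.0
    for i in range(21):
        for j in range(11):
            x, t = -5 + 0.5 * i, 0.1 * j
            u = wave(x, t, a)
            u_t = (wave(x, t + h, a) - wave(x, t - h, a)) / (2 * h)
            u_x = (wave(x + h, t, a) - wave(x - h, t, a)) / (2 * h)
            u_xx = (wave(x + h, t, a) - 2 * u + wave(x - h, t, a)) / h ** 2
            worst = max(worst, abs(u_t - u_xx - u * u_x - u * (1.0 - u)))
    return worst


res_quarter = max_residual(0.25)
res_alt = max_residual(1.0 / (2.0 * math.sqrt(2.0)))
```

Under these assumptions only the wave number 1/4 drives the residual to the level of finite-difference error, which is also the value consistent with the initial condition at t = 0.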

The neural network model for the Burgers–Fisher equation takes the spatial and temporal variables as inputs, has three hidden layers with 50 neurons each, and has an output layer that predicts u(x, t). Automatic differentiation (tf.GradientTape) is used to calculate the PDE residual.

Figure 19 shows the predicted solution together with the absolute error distribution, and Figure 20 shows the training loss convergence.

The error measures for this problem are summarized in Table 11, which compares the maximum absolute error and relative L_{2} error of PINN and gPINN.

3.9. Analytical Interpretation of Results

The observed performance differences between PINN and gPINN can be interpreted through the lens of spectral properties, mathematical stiffness, and foundational deep learning theory. For high-order and elliptic problems (e.g., the fourth-order PDE and Helmholtz equation), the governing operators are inherently sensitive to gradient accuracy. Standard PINNs are affected by spectral bias, as described by the Neural Tangent Kernel (NTK), which prioritizes low-frequency components and slows convergence of derivative terms. The explicit gradient constraints of gPINN modify the NTK eigenspectrum and act as a form of spectral regularization, suppressing high-frequency error components that violate derivative smoothness. This aligns with the principles of Sobolev training, which enriches the loss with derivative information [22]. This leads to superior accuracy and faster convergence in gradient-sensitive regimes.

In contrast, for strongly nonlinear problems (e.g., Burgers', Allen–Cahn, and Fisher equations), the primary challenge arises from nonlinear stiffness and sharp solution fronts. Here, the additional gradient terms in gPINN can complicate the loss landscape and amplify spectral imbalance, whereas the standard PINN formulation often demonstrates greater robustness and training stability. This analysis underscores that the optimal solver is intrinsically linked to whether the dominant source of numerical difficulty lies in the gradient fidelity of the solution (favoring gPINN) or in its nonlinear complexity (favoring PINN).

4. Conclusions

This benchmarking study demonstrates that the optimal choice between standard PINNs and gPINNs is fundamentally determined by the mathematical properties of the target partial differential equation (PDE). Through systematic experiments across eight distinct PDE classes, we derive clear, evidence-based selection guidelines. For problems characterized by high-order derivatives or operating in severely data-sparse regimes, gPINN is recommended, as its gradient constraints provide superior error control and faster convergence, reducing the maximum absolute error by up to 36% in high-order cases. Conversely, for strongly nonlinear problems with sharp gradients or those involving internal interfaces, the standard PINN proves more robust, offering stable training and comparable accuracy without the computational overhead of gradient penalties. This work provides a practical, diagnostic framework that moves beyond performance comparison to deliver actionable guidance, enabling researchers and practitioners to align solver selection with the dominant mathematical features of their PDE, thereby enhancing the efficiency and reliability of Scientific Machine Learning applications. Future work will extend this framework to time-dependent multi-physics systems and adaptive training strategies.

Table 12 consolidates the performance on all tested PDEs, giving a comparative view of the error metrics. Based on these findings, Table 13 provides our evidence-based framework for algorithm selection, in which PDE properties serve as the criteria for choosing the recommended solver.

PDE Property and Algorithm Selection Framework

To provide clear selection criteria, we define key PDE difficulty categories and link each to our test problems and results in Table 13.
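The selection guidance stated in the abstract and conclusions can be encoded as a small decision helper. This is a sketch only: the flag names, the precedence among criteria, and the default are assumptions introduced here, since the article does not rank the criteria against one another.

```python
def recommend_solver(high_order=False, sparse_data=False,
                     gradient_accuracy_critical=False,
                     strongly_nonlinear=False, has_interface=False,
                     compute_constrained=False):
    """Return 'PINN' or 'gPINN' following the guidelines stated in the text.

    The precedence (robustness concerns checked first) is an assumption of
    this sketch; the article does not rank the criteria.
    """
    if strongly_nonlinear or has_interface or compute_constrained:
        return "PINN"
    if high_order or sparse_data or gradient_accuracy_critical:
        return "gPINN"
    return "PINN"  # assumed default: the cheaper baseline when nothing applies


choice_high_order = recommend_solver(high_order=True)
choice_nonlinear = recommend_solver(strongly_nonlinear=True)
```

For problems that combine properties from both groups, the helper falls back to PINN; a practitioner may reasonably prefer to run both solvers in such cases.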

Author Contributions

Conceptualization, M.A. and I.S.C.; methodology, M.A.; software, M.A. and M.S.A.; validation, I.S.C., K.A. and M.S.A.; formal analysis, K.A.; investigation, M.A. and M.S.A.; resources, K.A.; data curation, M.A.; writing—original draft preparation, M.A.; writing—review and editing, M.A. and I.S.C.; visualization, M.S.A.; supervision, K.A. and M.S.A.; project administration, I.S.C.; funding acquisition, K.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Morton, K.W. Finite difference and finite element methods. Comput. Phys. Commun. 1976, 12, 99–108.
Hatzikonstantinou, P. New numerical method for partial differential equations. 1: Application to the diffusion equation. Int. J. Numer. Methods Fluids 1994, 18, 257–271.
Yang, Z.; Nie, Y.; Yuan, Z.; Wang, J. Finite element methods for fractional PDEs in three dimensions. Appl. Math. Lett. 2020, 100, 106041.
Zhao, Y.; Bu, W.; Huang, J.; Liu, D.-Y.; Tang, Y. Finite element method for two-dimensional space-fractional advection–dispersion equations. Appl. Math. Comput. 2015, 257, 553–565.
Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707.
Raissi, M.; Yazdani, A.; Karniadakis, G.E. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 2020, 367, 1026–1030.
Chen, Y.; Lu, L.; Karniadakis, G.E.; Negro, L.D. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Opt. Express 2020, 28, 11618–11633.
Yuan, L.; Ni, Y.-Q.; Deng, X.-Y.; Hao, S. A-PINN: Auxiliary physics informed neural networks for forward and inverse problems of nonlinear integro-differential equations. J. Comput. Phys. 2022, 462, 111260.
Pang, G.; Lu, L.; Karniadakis, G.E. fPINNs: Fractional Physics-Informed Neural Networks. SIAM J. Sci. Comput. 2019, 41, A2603–A2626.
Zhang, D.; Lu, L.; Guo, L.; Karniadakis, G.E. Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems. J. Comput. Phys. 2019, 397, 108850.
Zhang, D.; Guo, L.; Karniadakis, G.E. Learning in Modal Space: Solving Time-Dependent Stochastic PDEs Using Physics-Informed Neural Networks. SIAM J. Sci. Comput. 2020, 42, A639–A665.
Krishnapriyan, A.; Gholami, A.; Zhe, S.; Kirby, R.; Mahoney, M.W. Characterizing possible failure modes in physics-informed neural networks. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021), Online, 6–14 December 2021.
Wang, S.; Teng, Y.; Perdikaris, P. Understanding and Mitigating Gradient Flow Pathologies in Physics-Informed Neural Networks. SIAM J. Sci. Comput. 2021, 43, A3055–A3081.
Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A Deep Learning Library for Solving Differential Equations. SIAM Rev. 2021, 63, 208–228.
Anagnostopoulos, S.J.; Toscano, J.D.; Stergiopulos, N.; Karniadakis, G.E. Residual-based attention in physics-informed neural networks. Comput. Methods Appl. Mech. Eng. 2024, 421, 116805.
Lagaris, I.E.; Likas, A.; Fotiadis, D.I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000.
Sukumar, N.; Srivastava, A. Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks. Comput. Methods Appl. Mech. Eng. 2022, 389, 114333.
Yu, J.; Lu, L.; Meng, X.; Karniadakis, G.E. Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems. Comput. Methods Appl. Mech. Eng. 2022, 393, 114823.
Beniwal, K.; Kumar, V. Gradient-Based Physics-Informed Neural Network. In Third Congress on Intelligent Systems; Springer: Singapore, 2023; pp. 749–761.
Miao, Y.; Li, H.; Mandic, D. GPINN: Physics-Informed Neural Network with Graph Embedding. In Proceedings of the 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan, 30 June–5 July 2024; IEEE: New York, NY, USA, 2024; pp. 1–8.
Zhou, H.-J.; Chen, Y. Gradient-enhanced PINN with residual unit for studying forward-inverse problems of variable coefficient equations. Phys. D 2025, 481, 134764.
Czarnecki, W.M.; Osindero, S.; Jaderberg, M.; Swirszcz, G.; Pascanu, R. Sobolev Training for Neural Networks. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.

Figure 1. Schematic of the standard PINN architecture.

Figure 2. Schematic of the standard gPINN architecture.

Figure 3. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions.

Figure 4. Training dynamics: gPINN loss drops below $1 \times 10^{-3}$ at epoch 200 vs. 350 for PINN.

Figure 5. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions of the diffusion equation evaluated over a computational domain defined by spatial coordinate x and temporal variable t.

Figure 6. Training dynamics: gPINN achieves $L_{2}$ error $< 6 \times 10^{-3}$ by epoch 100, while PINN requires 250 epochs.

Figure 7. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for the Allen–Cahn equation over spatial coordinate x and temporal variable t.

Figure 8. Training dynamics: gPINN final loss is 0.0715 vs. 0.0714 for standard PINN, with 20% faster early convergence.

Figure 9. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for the Helmholtz interface problem over a Cartesian domain (x, y).

Figure 10. Training dynamics: standard PINN shows more stable loss descent; gPINN has 30% higher loss variance.

Figure 11. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for Burgers’ equation.

Figure 12. Training dynamics: gPINN has 25% higher initial loss but reaches 15% lower final error.

Figure 13. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for the Huxley equation over spatial coordinate x and temporal variable t.

Figure 14. Training dynamics: gPINN converges 40% faster and reduces final error by 18%.

Figure 15. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for the Fisher equation.

Figure 16. Training dynamics: gPINN error is 0.00107 vs. 0.00162 for standard PINN, with similar convergence speed.

Figure 17. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for the Burgers–Huxley equation.

Figure 18. Training dynamics: gPINN achieves 0.00547 error vs. 0.00740 for standard PINN.

Figure 19. Comparison of the exact solution with standard PINN and gPINN predictions, along with absolute error distributions, for the Burgers–Fisher equation, where x is the spatial coordinate and t is the temporal variable.

Figure 20. Training dynamics: gPINN final $L_{2}$ error is 0.00145 vs. 0.00740 for standard PINN.

Table 1. Sensitivity of gPINN to the gradient-loss weight $λ = λ_{x} = λ_{y}$ for the fourth-order PDE.

$λ$	Relative $L_{2}$ Error	Max Absolute Error	Epochs to Loss < $10^{-3}$	Stability
$10^{-5}$	0.0062	0.0061	380	Stable
$10^{-4}$	0.0085	0.0058	200	Optimal
$10^{-3}$	0.0069	0.0065	420	Occasional spikes
$10^{-2}$	0.0085	0.0079	> 500	Unstable

Table 2. Comparison of relative $L_{2}$ and maximum absolute error for standard PINN and gPINN.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0091	0.0066
gPINN	0.0058	0.0058
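
The two metrics reported in these tables can be sketched as follows. This is a minimal illustration that assumes the standard definitions (relative $L_{2}$ error as the norm of the difference divided by the norm of the exact solution, and maximum absolute error as the largest pointwise deviation); the function names and the toy values are hypothetical and are not taken from the paper's implementation.

```python
import math


def relative_l2_error(pred, exact):
    """Relative L2 error: ||pred - exact||_2 / ||exact||_2 (assumed definition)."""
    if len(pred) != len(exact):
        raise ValueError("pred and exact must have the same length")
    num = math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, exact)))
    den = math.sqrt(sum(e ** 2 for e in exact))
    if den == 0.0:
        raise ValueError("exact solution has zero norm")
    return num / den


def max_absolute_error(pred, exact):
    """Largest pointwise deviation between prediction and exact solution."""
    if len(pred) != len(exact):
        raise ValueError("pred and exact must have the same length")
    return max(abs(p - e) for p, e in zip(pred, exact))


# Toy values for illustration only (not data from the study).
exact = [3.0, 4.0]
pred = [3.0, 4.5]
print(relative_l2_error(pred, exact), max_absolute_error(pred, exact))
```

In practice both metrics would be evaluated on a grid of test points covering the computational domain; the grid used in the study is not specified in this part of the text.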

Table 3. Computational Cost Comparison: Standard PINN vs. gPINN.

Problem	Method	Epoch Time (ms)	GPU Memory (MB)	Total Time (s)
Fourth-Order PDE	PINN	12.5	1450	187.5
	gPINN	28.7	2880	229.6
Burgers’ Equation	PINN	8.3	1210	166.0
	gPINN	19.1	2150	229.2
Helmholtz Interface	PINN	10.1	1320	181.8
	gPINN	24.5	2450	343.0
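
To make the cost trade-off in Table 3 easier to read, the sketch below computes the gPINN-to-PINN ratio for each column. It also derives an implied epoch count under the assumption that total time equals epoch time multiplied by the number of epochs; that relation is an assumption of this sketch and is not stated in the table. The dictionary layout and function names are hypothetical.

```python
# Values transcribed from Table 3:
# (epoch time in ms, GPU memory in MB, total time in s)
TABLE_3 = {
    "Fourth-Order PDE": {"PINN": (12.5, 1450, 187.5), "gPINN": (28.7, 2880, 229.6)},
    "Burgers Equation": {"PINN": (8.3, 1210, 166.0), "gPINN": (19.1, 2150, 229.2)},
    "Helmholtz Interface": {"PINN": (10.1, 1320, 181.8), "gPINN": (24.5, 2450, 343.0)},
}


def overhead(problem):
    """Return gPINN/PINN ratios for (epoch time, GPU memory, total time)."""
    pinn = TABLE_3[problem]["PINN"]
    gpinn = TABLE_3[problem]["gPINN"]
    return tuple(g / p for g, p in zip(gpinn, pinn))


def implied_epochs(problem, method):
    """Epoch count implied by total time / epoch time (assumed relation)."""
    epoch_ms, _, total_s = TABLE_3[problem][method]
    return round(total_s * 1000.0 / epoch_ms)


for problem in TABLE_3:
    t, m, total = overhead(problem)
    print(f"{problem}: epoch x{t:.2f}, memory x{m:.2f}, total x{total:.2f}")
```

Under this assumption, the per-epoch cost of gPINN is roughly double that of PINN, while the total-time ratio is smaller because the implied epoch counts differ between the two methods.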

Table 4. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the diffusion equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0098	0.0058
gPINN	0.0078	0.0058

Table 5. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the Allen–Cahn equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0714	0.0092
gPINN	0.0715	0.0091

Table 6. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the Helmholtz interface problem.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0373	0.0036
gPINN	0.0322	0.0011

Table 7. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for Burgers’ equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0183	0.0073
gPINN	0.0174	0.0055

Table 8. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the Huxley equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0252	0.0061
gPINN	0.0125	0.0056

Table 9. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the Fisher equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0541	0.0016
gPINN	0.0318	0.0010

Table 10. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the Burgers–Huxley equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0244	0.0074
gPINN	0.0147	0.0054

Table 11. Comparison of relative $L_{2}$ and maximum absolute error between standard PINN and gPINN for the Burgers–Fisher equation.

Method	Max Absolute Error	Relative $L_{2}$ Error
PINN	0.0141	0.0039
gPINN	0.0118	0.0014

Table 12. Consolidated comparison of relative $L_{2}$ and maximum absolute error for standard PINN and gPINN across all tested PDEs.

PDE Example	Method	Max Absolute Error	Relative $L_{2}$ Error	Remarks
Fourth-Order PDE	PINN	0.0092	0.0067	gPINN showed better error control.
	gPINN	0.0059	0.0059
Diffusion Equation	PINN	0.0099	0.0059	gPINN has lower max absolute error; $L_{2}$ errors are equal.
	gPINN	0.0079	0.0059
Allen–Cahn Equation	PINN	0.0725	0.0093	Comparable performance; gPINN slightly better.
	gPINN	0.0713	0.0091
Helmholtz Interface Problem	PINN	0.0373	0.0036	gPINN achieves lower $L_{2}$ and max absolute error, showing improved accuracy for interface problems.
	gPINN	0.0322	0.0011
Burgers’ Equation	PINN	0.0183	0.0073	Clear gPINN advantage.
	gPINN	0.0174	0.0055
Huxley Equation	PINN	0.0252	0.0061	gPINN significantly reduced max error.
	gPINN	0.0125	0.0056
Fisher Equation	PINN	0.0541	0.0016	gPINN outperformed PINN.
	gPINN	0.0318	0.0010
Burgers–Huxley Equation	PINN	0.0244	0.0074	gPINN showed clear improvement.
	gPINN	0.0147	0.0054
Burgers–Fisher Equation	PINN	0.0140	0.0039	gPINN has lower $L_{2}$ error; PINN has slightly lower max error.
	gPINN	0.0158	0.0014
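
As a reading aid for Table 12, the following sketch computes the percentage reduction in relative $L_{2}$ error achieved by gPINN over standard PINN for each problem. The error values are transcribed from the table; the dictionary keys and the helper name `percent_reduction` are hypothetical and chosen for illustration only.

```python
# Relative L2 errors transcribed from Table 12 as (PINN, gPINN).
TABLE_12_REL_L2 = {
    "Fourth-Order PDE": (0.0067, 0.0059),
    "Diffusion": (0.0059, 0.0059),
    "Allen-Cahn": (0.0093, 0.0091),
    "Helmholtz Interface": (0.0036, 0.0011),
    "Burgers": (0.0073, 0.0055),
    "Huxley": (0.0061, 0.0056),
    "Fisher": (0.0016, 0.0010),
    "Burgers-Huxley": (0.0074, 0.0054),
    "Burgers-Fisher": (0.0039, 0.0014),
}


def percent_reduction(pinn_err, gpinn_err):
    """Percentage by which gPINN lowers the error relative to PINN."""
    return 100.0 * (pinn_err - gpinn_err) / pinn_err


# Print problems from largest to smallest reduction.
ranked = sorted(
    TABLE_12_REL_L2.items(),
    key=lambda item: percent_reduction(*item[1]),
    reverse=True,
)
for name, (pinn, gpinn) in ranked:
    print(f"{name:20s} {percent_reduction(pinn, gpinn):6.1f}%")
```

Because the tabulated errors are rounded to four decimal places, the resulting percentages should be read as approximate.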

Table 13. Algorithm selection framework based on PDE properties.

PDE Challenge	Test Problem	Recommended Solver	Rationale
High-Order Derivative	Fourth-Order PDE	gPINN	Gradient constraints improve stability and accuracy; optimal with $λ \approx 10^{-4}$.
Strong Nonlinearity	Burgers’, Huxley, Fisher	PINN (robust)/gPINN (smooth)	PINN is more stable for sharp fronts; gPINN is better for smooth waves.
Interface Conditions	Helmholtz Interface	gPINN	Simpler loss, reliable convergence with moderate gradient regularization.
Data Sparsity	All Problems	gPINN	Gradient terms inject physical prior, improving accuracy in sparse regimes.
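
The selection framework in Table 13 can be expressed as a simple lookup. The sketch below is one possible encoding, not part of the study's code: the challenge keys, the split of the "Strong Nonlinearity" row into two entries, and the function name `recommend_solver` are all hypothetical choices made for illustration.

```python
# Encoding of Table 13: challenge key -> (recommended solver, rationale).
# The "Strong Nonlinearity" row is split into two keys (an assumption of
# this sketch) because the table recommends a different solver per case.
SELECTION_FRAMEWORK = {
    "high_order_derivative": (
        "gPINN",
        "Gradient constraints improve stability and accuracy.",
    ),
    "strong_nonlinearity_sharp_fronts": (
        "PINN",
        "PINN is more stable for sharp fronts.",
    ),
    "strong_nonlinearity_smooth_waves": (
        "gPINN",
        "gPINN is better for smooth waves.",
    ),
    "interface_conditions": (
        "gPINN",
        "Reliable convergence with moderate gradient regularization.",
    ),
    "data_sparsity": (
        "gPINN",
        "Gradient terms inject a physical prior in sparse regimes.",
    ),
}


def recommend_solver(challenge):
    """Return (solver, rationale) for a PDE challenge key from Table 13."""
    try:
        return SELECTION_FRAMEWORK[challenge]
    except KeyError:
        known = ", ".join(sorted(SELECTION_FRAMEWORK))
        raise ValueError(
            f"Unknown challenge {challenge!r}; expected one of: {known}"
        ) from None


solver, rationale = recommend_solver("data_sparsity")
print(f"{solver}: {rationale}")
```

A lookup of this kind only restates the table; for a problem that combines several challenges, the recommendations may conflict and the choice remains a judgment call.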
