Article

Generalization-Capable PINNs for the Lane–Emden Equation: Residual and StellarNET Approaches

by Andrei-Ionuț Mohuț and Călin-Adrian Popa *
Department of Computers and Information Technology, Politehnica University of Timișoara, 300223 Timișoara, Romania
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(18), 10035; https://doi.org/10.3390/app151810035
Submission received: 23 July 2025 / Revised: 4 September 2025 / Accepted: 8 September 2025 / Published: 14 September 2025
(This article belongs to the Special Issue Advances in AI and Multiphysics Modelling)

Abstract

We present a Physics-Informed Neural Network (PINN) approach to solving the Lane–Emden equation, a model used in astrophysics to describe the structure of polytropic stars. The equation is reformulated as a two-dimensional problem: we treat both the radial coordinate and the polytropic index as inputs to the neural network. To improve stability and accuracy, we introduce coordinate embedding via Random Fourier Features, residual blocks, and gating mechanisms. Experiments show that our neural networks outperform both traditional numerical methods, such as Monte Carlo simulations, and standard fully connected PINNs. We achieve accurate predictions for both trained and extrapolated polytropic indices. The code used to implement our method is publicly available, providing researchers with the resources to replicate and extend our work.

1. Introduction

Neural networks are one of the most powerful tools for solving complex mathematical models and engineering tasks across a variety of scientific domains. The Universal Approximation Theorem (UAT), first proven by Cybenko [1] and Hornik et al. [2], establishes that a neural network with a sufficient number of neurons can approximate any continuous function on a compact subset of $\mathbb{R}^n$ to any desired degree of accuracy. As a consequence of the UAT, neural networks are not only capable of approximating solutions to real-world problems, but can do so to any degree of precision, with an impressive ability to recognize and generate high-level concepts.
In theoretical physics, natural phenomena are mathematically described by a differential equation or a system of differential equations; hence, the UAT supports the use of neural networks as powerful function approximators, capable of learning solutions to such equations under appropriate conditions. This makes them valuable tools for modeling physical systems where traditional methods face limitations, thus enabling breakthroughs across multiple fields.
The concept of applying neural networks to solve differential equations was proposed long ago by Lagaris et al. [3] and Lee and Kang [4], but gained significant attention only in recent years, fueled by advances in deep learning and the development of specialized libraries such as DeepXDE [5], which facilitated the implementation of these techniques. This approach has demonstrated impressive results in solving ordinary differential equations (ODEs) and partial differential equations (PDEs). Neural networks designed specifically for this purpose are known as Physics-Informed Neural Networks (PINNs). The PINN method integrates the governing differential equation, or system of differential equations, directly into the neural network training process, ensuring that the model adheres to the underlying physical laws throughout the solution domain. By minimizing a loss function that includes both the residuals of the ODE/PDE and any boundary or initial conditions, PINNs can provide highly accurate and physics-consistent solutions to differential equations. PINNs can solve both forward and inverse problems. They have been successfully applied in fields such as fluid dynamics [6,7], bioengineering [8], materials science [9], electromagnetism [10], and geoscience [11]. Because the physical constraints are embedded directly in the loss function, no training or test datasets are required; the network learns the solution of the differential equation directly, refining its approximation at each epoch. Hence, PINNs have produced results competitive with classical numerical methods across a wide range of scientific and engineering problems.
Despite their empirical success, there are numerous challenges in the development of PINNs. Wang et al. [12,13,14] identified issues such as gradient flow pathologies that hinder convergence, training failures explained by neural tangent kernel properties, and violations of causality that lead to incorrect solutions. These studies show that, while promising, PINNs still face limitations that need to be addressed, opening a wide range of PINN research subdomains: initialization strategies, network architecture, loss function design, domain decomposition, and many more.
In the context of astrophysics, the Lane–Emden equation is an ODE used to model the structure of polytropic stars. Traditional numerical methods, while effective, often face challenges in terms of stability and long-term accuracy. In particular, the equation has a singularity at t = 0, introduced by division by zero, which can degrade long-term accuracy. The PINN method takes a completely different approach that, in some respects, offers advantages.
This research presents advances in the application of PINNs to the Lane–Emden equation by extending beyond standard fully connected network (FCN) architectures toward more expressive and stable alternatives. Previous studies by Wang et al. [15] demonstrated, both experimentally and theoretically, that, for PINNs that employ multilayer perceptron (MLP) layers, increasing the number of hidden layers or neurons can lead to degraded performance. To address this limitation, the use of residual blocks, gating mechanisms, or loopback connections becomes essential to boost the ability of the network to capture complex data relationships while ensuring more stable and accurate training.
The key contributions of this paper are two new architectures, the Residual PINN and StellarNET, both capable of solving multiple instances of the Lane–Emden equation during training and subsequently performing inference on previously unseen equations. This paper is structured as follows. In Section 2, we present the theoretical background of the Lane–Emden equation and the fundamentals of PINNs. Section 3 introduces the proposed network architectures and describes the construction of the loss function. In Section 4, we evaluate the performance of our PINNs by comparing their outputs with known analytical solutions and standard numerical solvers. It is important to note that PINNs are trained in an unsupervised manner; therefore, standard supervised learning metrics such as accuracy, precision, or recall do not apply. Instead, model quality is assessed through satisfaction of the governing physics and, where available, by comparison against known analytical solutions or relative error with respect to other numerical methods. We benchmark our models against other state-of-the-art results and conduct convergence and stability analyses between our two models. We also extend the framework to a supplementary micro-electro-mechanical systems (MEMS) example [16]. In particular, we chose the simple yet representative case of a damped resonator, for which the analytical solution is available, to clearly demonstrate the capability of our PINN approach. This aligns with recent research efforts applying PINNs to MEMS-related problems [17,18]. Finally, Section 5 concludes the paper and outlines potential future work directions. The code is available at https://github.com/AndreiMohut/LaneEmden-PINN-Solver (accessed on 4 September 2025).

2. Theoretical Fundamentals

2.1. Lane–Emden Equation

The Lane–Emden equation arises in astrophysical contexts as a non-dimensional representation of the equilibrium conditions governing a self-gravitating, spherically symmetric fluid in hydrostatic equilibrium. The equation is expressed as follows:
$$\frac{1}{\xi^2} \frac{d}{d\xi}\left( \xi^2 \frac{d\theta}{d\xi} \right) + \theta^n = 0,$$
where ξ is the dimensionless radial coordinate, θ is the dimensionless density, and n is the polytropic index that relates the pressure and density of the fluid by $P \propto \rho^{1+1/n}$ [19]. The Lane–Emden equation provides a mathematical model for polytropic stellar structures, where the polytropic index n, which can take any value from 0 to +∞ (not necessarily a natural number), determines the properties of the stellar body.
For example, the case n = 0 corresponds to a constant-density sphere, while n = 1.5 models fully convective stars, and n = 3 approximates the behavior of white dwarfs. The initial conditions are typically defined at the center of the star as follows:
$$\theta(0) = 1, \qquad \frac{d\theta}{d\xi}(0) = 0,$$
ensuring the regularity of the solution at the center [20].
The equation has been a subject of interest since its introduction by Lane in 1870 and later generalized by Emden in 1907 [20,21]. It can be derived starting from the hydrostatic equilibrium formalism of a self-gravitating, spherically symmetric fluid or from the Poisson equation. The solution will depend on the polytropic index; hence, there are only three cases that admit analytical solutions expressible in simple transcendental functions:
$$\theta(\xi)\big|_{n=0} = 1 - \frac{\xi^2}{6},$$
$$\theta(\xi)\big|_{n=1} = \frac{\sin\xi}{\xi},$$
$$\theta(\xi)\big|_{n=5} = \frac{1}{\sqrt{1 + \xi^2/3}}.$$
For other values of n, analytical solutions can still exist that do not possess a closed form [22]. For example, in the case of n = 2, the solution can be expressed as a power series around ξ = 0:
$$\theta(\xi) = \sum_{m=0}^{\infty} a_{2m}\,\xi^{2m},$$
with the following recursion relation:
$$a_{2(m+1)} = -\frac{1}{(2m+2)(2m+3)} \sum_{k=0}^{m} a_{2k}\, a_{2(m-k)}.$$
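For illustration, the recursion can be evaluated numerically. The short sketch below (in Python with NumPy; the Lane–Emden series code is not part of the released repository, so the function names and truncation order here are our own choices) computes the coefficients starting from $a_0 = 1$, which follows from θ(0) = 1, and evaluates the truncated series.

```python
import numpy as np

def lane_emden_series_coeffs(num_terms: int = 12) -> np.ndarray:
    """Coefficients a_{2m} of the n = 2 Lane-Emden power series around xi = 0."""
    a = np.zeros(num_terms)
    a[0] = 1.0                                              # a_0 = 1 follows from theta(0) = 1
    for m in range(num_terms - 1):
        conv = sum(a[k] * a[m - k] for k in range(m + 1))   # Cauchy product giving the theta^2 coefficient
        a[m + 1] = -conv / ((2 * m + 2) * (2 * m + 3))
    return a

def theta_n2(xi: np.ndarray, num_terms: int = 12) -> np.ndarray:
    """Evaluate the truncated series theta(xi) = sum_m a_{2m} xi^(2m)."""
    a = lane_emden_series_coeffs(num_terms)
    powers = 2 * np.arange(num_terms)
    return np.sum(a[None, :] * xi[:, None] ** powers[None, :], axis=1)

xi = np.linspace(0.0, 1.0, 5)
print(theta_n2(xi))                                         # starts at 1.0 and decreases with xi
```

A truncation after roughly a dozen terms is sufficient near ξ = 0; in Section 4 this series is used only on [0, 1] as the analytical baseline for n = 2.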
Various analytical and numerical methods have been used to solve the Lane–Emden equation. Classical Runge–Kutta methods have been successfully applied to estimate its solutions [23]. Many analytical and semi-analytical approaches have also been explored, which can be very effective in handling nonlinear terms in ODEs or PDEs. The Adomian decomposition method, a semi-analytical technique, was used to investigate the Lane–Emden equation by Shawagfeh [24] and was later improved by Wazwaz [25]. Other significant contributions include the series expansion approach proposed by Ramos [26], the accelerated power series method by Nouh [27], and the Variational Iteration Method (VIM), explored by Dehghan and Shakeri [28]. The Homotopy Perturbation Method (HPM), used in several Lane–Emden-type problems [29], combines homotopy concepts with classical perturbation techniques to produce rapidly convergent series solutions. However, its performance depends strongly on the initial guess and may deteriorate for highly nonlinear cases or over long integration intervals. Similarly, VIM iteratively corrects trial functions via variational theory, but its accuracy depends on the choice of Lagrange multipliers and the handling of higher-order terms, which can limit robustness across different polytropic indices. Point-solution techniques, including collocation-based and polynomial fitting methods [30], provide accurate approximations on selected grids but face difficulties near t = 0 due to the singular term (2/t)θ′(t), and they do not generalize naturally to unseen values of n. Mukherjee [31] also employed the differential transform method to solve the Lane–Emden equation efficiently. In contrast, the proposed PINN approach enables stable treatment near t = 0 and, importantly, allows a single trained model to interpolate and extrapolate across a family of polytropic indices.
The variety of semi-analytical methods used for this equation reflects the continued interest in, and ongoing advancement of, solution techniques for such differential equations. However, recent advances in hardware acceleration have shifted the focus towards efficient numerical methods as a viable and faster alternative for solving differential equations. In recent years, new computational techniques such as Monte Carlo methods and neural networks have been applied to differential equations, greatly advancing the field of computational mathematics. Monte Carlo methods, well known for their stochastic nature and ability to handle high-dimensional problems, have been employed to model Lane–Emden-type equations by El-Essawy et al. [32].
Neural networks have also proven their power in solving the Lane–Emden equation, offering high accuracy for both ordinary and fractional versions. Nouh et al. [33] used artificial neural networks to model fractional polytropic gas spheres, achieving results comparable to traditional methods. Mall and Chakraverty [34] applied a Chebyshev neural network, a highly computationally efficient method. Recently, Baty [35] achieved state-of-the-art results using a simple fully connected PINN, which outperformed other numerical methods despite using a reduced number of layers and neurons. Some of these results will be used as state-of-the-art numerical benchmarks in later sections.

2.2. Physics-Informed Neural Networks (PINNs)

Physics-Informed Neural Networks (PINNs) represent one of the newest and most powerful tools for simulating and approximating the solutions of differential equations. They form a modern research area at the intersection of deep learning and computational mathematics, able to address natural phenomena described by such mathematical models. A PINN is a neural network whose loss function is built to respect the governing differential equation along with any other constraints: boundary, initial, or limit conditions, depending on the type of differential equation. A general schematic of the PINN architecture and its corresponding loss function components is shown in Figure 1.
Consider a general differential equation of the following form [36]:
$$\mathcal{D}[u(x,t); \lambda] = f(x,t), \qquad x \in \Omega,\; t \in [0, +\infty),$$
where $\mathcal{D}$ is the differential operator, $f(x,t)$ represents any forcing function, and $u(x,t)$ is the unknown solution that we wish to approximate. The domain $\Omega$ represents the spatial region of interest. In order for the problem to be well posed, the equation is accompanied by boundary conditions, expressed as follows:
$$\mathcal{B}_k[u(x,t)] = g_k(x,t), \qquad x \in \Gamma_k \subset \partial\Omega,$$
where $\mathcal{B}_k$ represents the boundary operator and $g_k(x,t)$ is the boundary value imposed on the boundary segment $\Gamma_k$. Similarly, since the problem is time-dependent, we also enforce initial conditions:
$$u(x,0) = u_0(x), \qquad \frac{\partial u(x,t)}{\partial t}\bigg|_{t=0} = u_0'(x),$$
which define the state of the system at the start of the process.
To solve such a problem, PINNs employ a neural network $NN(x, t; \omega)$, where $\omega$ denotes the learnable parameters of the network. As a result of the universal approximation theorem, the network can approximate the solution $u(x,t)$, aiming to minimize the residuals of both the differential equation and the constraints. The goal is to ensure that the network output $NN(x, t; \omega) \approx u(x,t)$ adheres to both the physics governing the system and the required conditions at the boundaries and initial time.
The overall training objective of PINNs is to minimize a composite loss function, consisting of a physics loss and a boundary/initial condition loss. The total loss function can be written as follows:
$$\mathcal{L}(\omega) = \mathcal{L}_p(\omega) + \mathcal{L}_b(\omega) + \mathcal{L}_i(\omega),$$
where $\mathcal{L}_p(\omega)$ corresponds to the physics loss, ensuring that the differential equation holds across the domain, $\mathcal{L}_b(\omega)$ enforces the boundary conditions, and $\mathcal{L}_i(\omega)$ enforces the initial conditions. The physics loss is formulated as follows:
$$\mathcal{L}_p(\omega) = \frac{1}{N_p} \sum_{i=1}^{N_p} \big\| \mathcal{D}[NN(x_i, t_i; \omega)] - f(x_i, t_i) \big\|^2,$$
where $N_p$ represents the number of collocation points in the domain $\Omega$, and $\mathcal{D}$ denotes the differential operator that governs the system.
For boundary and initial conditions, the losses are expressed as follows:
$$\mathcal{L}_b(\omega) = \sum_k \frac{1}{N_b} \sum_{j=1}^{N_b} \big\| \mathcal{B}_k[NN(x_j, t_j; \omega)] - g_k(x_j) \big\|^2,$$
$$\mathcal{L}_i(\omega) = \big\| NN(x, 0; \omega) - u_0(x) \big\|^2 + \left\| \frac{\partial NN(x, 0; \omega)}{\partial t} - u_0'(x) \right\|^2,$$
where $N_b$ represents the number of points sampled on the boundary $\partial\Omega$ or at the initial time t = 0. These losses ensure that the solution satisfies the imposed conditions, such as Dirichlet or Neumann boundary conditions, and the initial conditions.
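In practice, the derivatives entering these residuals are obtained by automatic differentiation. The following minimal sketch (in PyTorch, which the accompanying repository also uses; the small network and the toy operator $\mathcal{D}[u] = u'' + u$ with f = 0 are illustrative choices, not taken from the paper) shows how a physics loss of the form above is assembled:

```python
import torch
import torch.nn as nn

# Illustrative surrogate NN(t; omega): a small fully connected network with tanh activations.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))

# Collocation points must require gradients so autograd can differentiate the output w.r.t. them.
t = torch.linspace(0.0, 1.0, 101).reshape(-1, 1).requires_grad_(True)
u = net(t)

# First and second derivatives of the network output via automatic differentiation.
du_dt = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
d2u_dt2 = torch.autograd.grad(du_dt, t, torch.ones_like(du_dt), create_graph=True)[0]

# Physics loss L_p for the toy operator D[u] = u'' + u with zero forcing (f = 0).
residual = d2u_dt2 + u
loss_physics = torch.mean(residual ** 2)
print(loss_physics.item())
```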

3. Proposed Method

In this section, we describe the proposed method for solving the Lane–Emden equation using PINNs. The approach reformulates the Lane–Emden equation as a two-dimensional problem with input variables t (the radial coordinate) and n (the polytropic index), deviating from the classical use of ξ as the independent variable. This reformulation enables the network to approximate the solution across multiple polytropic indices within a unified framework, offering better generalization capabilities for different values of n. The new Lane–Emden equation will be given by the following:
$$\frac{d^2\theta}{dt^2} + \frac{2}{t}\frac{d\theta}{dt} + \theta^n = 0,$$
where t is the radial coordinate, θ is the dimensionless density, and n is the polytropic index. The same initial conditions apply.
To solve this equation across different polytropic indices, we treat the problem as a function of both t and n, making it a two-dimensional problem. The PINN takes t and n as inputs and learns to predict θ ( t , n ) , the solution, by minimizing a loss function that captures the equation’s residual and the initial conditions.

3.1. Networks and Architectures

In this paper, we conduct experiments using two different PINNs. The first, referred to as the Residual PINN, consists of fully connected dense layers and residual blocks, a design that, despite its simplicity, yields highly accurate solutions. The Residual PINN was chosen as the initial experiment to demonstrate how effectively a baseline solution can be constructed compared to other existing computational methods, including those employing PINNs. The second architecture, called StellarNET, incorporates residual blocks, gating mechanisms, and adaptive skip connections. It is inspired by PirateNETs [15], which have demonstrated exceptional accuracy on various ODE/PDE benchmarks.

3.1.1. Residual PINN

Let $x_0 = (t, n) \in \mathbb{R}^2$ denote the input vector, where t is the radial coordinate and n is the polytropic index. The input is first mapped to a higher-dimensional space via an affine transformation, followed by the Swish activation:
$$x_1 = \sigma(W_0 x_0 + b_0),$$
$$\sigma(x) = x \cdot \mathrm{sigmoid}(x), \qquad W_0 \in \mathbb{R}^{h \times 2}.$$
Next, the transformed input is passed through a sequence of L residual blocks. Each residual block consists of a single hidden layer with the same hidden dimensionality h, and the forward pass through the l-th block is defined as follows:
$$x_{l+1} = x_l + \sigma(W_l x_l + b_l), \qquad W_l \in \mathbb{R}^{h \times h}, \quad 1 \le l \le L.$$
After the residual stack, the output passes through two fully connected layers with Swish activation in between:
$$z_1 = \sigma(W_{L+1} x_{L+1} + b_{L+1}), \qquad W_{L+1} \in \mathbb{R}^{h \times h},$$
$$z_2 = W_{L+2} z_1 + b_{L+2}, \qquad W_{L+2} \in \mathbb{R}^{(h/2) \times h}.$$
Finally, a last linear layer reduces the dimensionality to produce the scalar output:
$$\theta(t,n) = W_{\mathrm{out}} z_2 + b_{\mathrm{out}}, \qquad W_{\mathrm{out}} \in \mathbb{R}^{1 \times (h/2)}.$$
All weights $W_{(\cdot)}$ are initialized using Xavier initialization. The hidden size h controls the network's width, while the number of residual blocks L defines its depth.
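A compact PyTorch sketch of this architecture is given below; the class and variable names are ours and the Xavier variant (normal) is an assumption, but the layer structure follows the equations above.

```python
import torch
import torch.nn as nn

class ResidualPINN(nn.Module):
    """Residual PINN: input (t, n) -> theta(t, n), following the layer structure above."""
    def __init__(self, hidden: int = 128, num_blocks: int = 2):
        super().__init__()
        self.input_layer = nn.Linear(2, hidden)
        self.blocks = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(num_blocks)])
        self.post1 = nn.Linear(hidden, hidden)
        self.post2 = nn.Linear(hidden, hidden // 2)
        self.out = nn.Linear(hidden // 2, 1)
        self.act = nn.SiLU()                          # Swish activation
        for m in self.modules():                      # Xavier initialization (normal variant assumed)
            if isinstance(m, nn.Linear):
                nn.init.xavier_normal_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, t: torch.Tensor, n: torch.Tensor) -> torch.Tensor:
        x = self.act(self.input_layer(torch.cat([t, n], dim=-1)))
        for block in self.blocks:
            x = x + self.act(block(x))                # residual connection around each hidden layer
        z = self.post2(self.act(self.post1(x)))       # two FC layers with Swish in between
        return self.out(z)                            # final linear layer -> scalar theta(t, n)

model = ResidualPINN()
t, n = torch.rand(8, 1), torch.rand(8, 1) * 5
print(model(t, n).shape)                              # torch.Size([8, 1])
```

With the defaults h = 128 and L = 2, this matches the configuration selected later in Section 4.1.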

3.1.2. StellarNET

The architecture of StellarNET extends the concept of residual learning by incorporating gating mechanisms, coordinate embeddings, and adaptive skip connections. Inspired by PirateNETs [15], this design utilizes Random Fourier Features (RFFs) to embed the input space into a higher-dimensional representation, enabling the network to approximate high-frequency solution dynamics more effectively [37].
The input vector ( t , n ) , where t is the radial coordinate and n is the polytropic index, is first transformed using an RFF embedding:
$$\Phi(x) = \begin{bmatrix} \cos(Bx) \\ \sin(Bx) \end{bmatrix},$$
where B is a matrix with entries sampled from a Gaussian distribution $\mathcal{N}(0, s^2)$, with s being a user-specified hyperparameter.
The embedded coordinates Φ ( x ) are passed through two dense layers with trainable weights to produce intermediate feature maps:
$$U = \sigma\big(W_{U_2}\,\sigma(W_{U_1}\Phi(x) + b_{U_1}) + b_{U_2}\big),$$
$$V = \sigma\big(W_{V_2}\,\sigma(W_{V_1}\Phi(x) + b_{V_1}) + b_{V_2}\big),$$
where $\sigma$ represents the sigmoid activation function, $W_{U_1}, W_{U_2}, W_{V_1}, W_{V_2}$ are the weights, and $b_{U_1}, b_{U_2}, b_{V_1}, b_{V_2}$ are the biases of the dense layers.
Let $x^{(l)}$ denote the input of the l-th block for $1 \le l \le L$. The forward pass in each StellarNET block is defined as follows:
$$f^{(l)} = \mathrm{softmax}(U),$$
$$g^{(l)} = \max(V, 0),$$
$$z_1^{(l)} = f^{(l)} \odot g^{(l)} + \big(1 - f^{(l)}\big) \odot x^{(l)},$$
$$h^{(l)} = \beta \cdot \sigma\big(W_{h_2}\,\sigma(W_{h_1} z_1^{(l)} + b_{h_1}) + b_{h_2}\big),$$
$$x^{(l+1)} = h^{(l)} + (1 - \beta)\, x^{(l)},$$
where $\odot$ denotes element-wise multiplication, $\beta$ is a learnable scaling parameter, and the transformation $h^{(l)}$ consists of two dense layers with trainable weights ($W_{h_1}, W_{h_2}$) and biases ($b_{h_1}, b_{h_2}$).
The final output of StellarNET after L residual blocks is given by the following:
$$\theta(t,n) = W_{\mathrm{out}}\, x^{(L)} + b_{\mathrm{out}},$$
where $W_{\mathrm{out}}$ and $b_{\mathrm{out}}$ are the weights and biases of the final output layer.
It is important to note that the residual gate mechanism used in StellarNET was introduced prior to PirateNET by Savarese [38]. In this formulation, a residual function F ( x ) and the input x are combined using a learnable parameter, as follows:
$$y = \mathrm{ReLU}(\alpha)\cdot\big(F(x) + x\big) + \big(1 - \mathrm{ReLU}(\alpha)\big)\cdot x.$$
The use of RFF embeddings, combined with gating mechanisms and adaptive skip connections, allows StellarNET to handle high-dimensional solution spaces effectively. Each transformation in StellarNET consists of dense layers initialized with Xavier initialization, ensuring stable training and efficient gradient flow. As we will observe in later sections, these architectural features enable StellarNET to achieve remarkably low errors across the entire solution domain.
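A sketch of a StellarNET-style block in PyTorch is shown below. The RFF matrix B is kept fixed (non-trainable), U and V are computed once from Φ(x) and reused in every block, and the initial value of β as well as the linear lift of Φ(x) to the block width are our assumptions; weight initialization is omitted for brevity.

```python
import torch
import torch.nn as nn

class FourierEmbedding(nn.Module):
    """Random Fourier Features: x -> [cos(Bx); sin(Bx)], with B ~ N(0, s^2) kept fixed."""
    def __init__(self, in_dim: int = 2, num_features: int = 64, s: float = 1.0):
        super().__init__()
        self.register_buffer("B", torch.randn(num_features, in_dim) * s)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        proj = x @ self.B.T
        return torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1)   # size 2 * num_features

class StellarNET(nn.Module):
    def __init__(self, hidden: int = 256, num_blocks: int = 2, num_features: int = 64, s: float = 1.0):
        super().__init__()
        self.embed = FourierEmbedding(2, num_features, s)
        emb_dim = 2 * num_features
        self.lift = nn.Linear(emb_dim, hidden)        # assumed projection of Phi(x) to the block width
        self.u_net = nn.Sequential(nn.Linear(emb_dim, hidden), nn.Sigmoid(),
                                   nn.Linear(hidden, hidden), nn.Sigmoid())
        self.v_net = nn.Sequential(nn.Linear(emb_dim, hidden), nn.Sigmoid(),
                                   nn.Linear(hidden, hidden), nn.Sigmoid())
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.Sigmoid(),
                          nn.Linear(hidden, hidden), nn.Sigmoid())
            for _ in range(num_blocks)
        ])
        self.beta = nn.Parameter(torch.tensor(0.1))   # learnable scaling; initial value is an assumption
        self.out = nn.Linear(hidden, 1)

    def forward(self, t: torch.Tensor, n: torch.Tensor) -> torch.Tensor:
        phi = self.embed(torch.cat([t, n], dim=-1))
        f = torch.softmax(self.u_net(phi), dim=-1)    # gate f = softmax(U)
        g = torch.relu(self.v_net(phi))               # feature map g = max(V, 0)
        x = self.lift(phi)
        for block in self.blocks:
            z1 = f * g + (1.0 - f) * x                # gated mixing of embedded features and state
            x = self.beta * block(z1) + (1.0 - self.beta) * x
        return self.out(x)

model = StellarNET()
print(model(torch.rand(4, 1), torch.rand(4, 1) * 5).shape)   # torch.Size([4, 1])
```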

3.2. Loss Function and Training

To train the neural networks, the two-dimensional input domain is discretized over a set of values for the radial coordinate t and the polytropic index n. The coordinate t is sampled over a finite interval with a uniform step size, resulting in the set $\{t_i\}_{i=1}^{m}$. The polytropic index n is sampled from a continuous range of physically relevant values, forming the set $\{n_j\}_{j=1}^{p}$. Together, these points define a grid of collocation pairs $(t_i, n_j)$ where the network's predictions are evaluated.
The network is trained by minimizing a loss function that captures the residual of the Lane–Emden equation at these collocation points, as well as the associated initial conditions. The residual of the equation at a point $(t_i, n_j)$ is given by the following:
$$R(t_i, n_j) = \frac{\partial^2 \theta}{\partial t^2}(t_i, n_j) + \frac{2}{t_i}\,\frac{\partial \theta}{\partial t}(t_i, n_j) + \theta(t_i, n_j)^{n_j}.$$
The first- and second-order derivatives with respect to t are computed via automatic differentiation. The total loss function L consists of three components:
$$\mathcal{L} = \lambda_1 \mathcal{L}_{\mathrm{residual}} + \lambda_2 \mathcal{L}_{\theta(0,n)} + \lambda_3 \mathcal{L}_{\partial_t \theta(0,n)},$$
where
$$\mathcal{L}_{\mathrm{residual}} = \frac{1}{mp} \sum_{i=1}^{m} \sum_{j=1}^{p} R(t_i, n_j)^2,$$
$$\mathcal{L}_{\theta(0,n)} = \frac{1}{p} \sum_{j=1}^{p} \big(\theta(0, n_j) - 1\big)^2,$$
$$\mathcal{L}_{\partial_t \theta(0,n)} = \frac{1}{p} \sum_{j=1}^{p} \left( \frac{\partial \theta}{\partial t}\bigg|_{t=0,\, n=n_j} \right)^2.$$
Here, $\mathcal{L}_{\mathrm{residual}}$ enforces the Lane–Emden equation, while the other two terms enforce the initial conditions $\theta(0, n) = 1$ and $\partial_t \theta(0, n) = 0$. The weights $\lambda_1$, $\lambda_2$, and $\lambda_3$ are hyperparameters that control the relative importance of the residual and initial-condition losses.
The loss function $\mathcal{L}$ is minimized using the Adam optimizer, which iteratively updates the model parameters. Both the Residual PINN and StellarNET architectures are trained under this framework, using the full set of collocation points as input during optimization.
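A sketch of this composite loss in PyTorch is given below. The collocation tensors are assumed to avoid t = 0 exactly (a small offset sidesteps the 2/t singularity; how the released code treats this point is not restated here), and the clamp on θ before exponentiation is our own guard for fractional indices.

```python
import torch

def lane_emden_loss(model, t, n, lambdas=(1.0, 1.0, 1.0)):
    """Composite PINN loss: Lane-Emden residual on the (t, n) grid plus two initial-condition terms."""
    t = t.detach().requires_grad_(True)
    theta = model(t, n)

    # First and second derivatives of theta with respect to t via automatic differentiation.
    dtheta = torch.autograd.grad(theta, t, torch.ones_like(theta), create_graph=True)[0]
    d2theta = torch.autograd.grad(dtheta, t, torch.ones_like(dtheta), create_graph=True)[0]

    # Residual R(t, n) = theta'' + (2/t) theta' + theta^n; the clamp keeps theta^n real for fractional n.
    residual = d2theta + (2.0 / t) * dtheta + torch.clamp(theta, min=0.0) ** n
    loss_res = torch.mean(residual ** 2)

    # Initial conditions theta(0, n) = 1 and d theta / dt (0, n) = 0, enforced at t = 0 for each index n.
    n0 = torch.unique(n).reshape(-1, 1)
    t0 = torch.zeros_like(n0).requires_grad_(True)
    theta0 = model(t0, n0)
    dtheta0 = torch.autograd.grad(theta0, t0, torch.ones_like(theta0), create_graph=True)[0]
    loss_ic_value = torch.mean((theta0 - 1.0) ** 2)
    loss_ic_slope = torch.mean(dtheta0 ** 2)

    l1, l2, l3 = lambdas
    return l1 * loss_res + l2 * loss_ic_value + l3 * loss_ic_slope
```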

4. Experiments and Results

In this section, we present the experimental setup and parameters used by both PINNs, along with a qualitative analysis and a quantitative comparison with other numerical methods. Both models are trained using the polytropic indices from n = 0 to n = 5 with a discretization step of Δn = 1, resulting in six different ODEs. The radial coordinate is discretized from t = 0 to t = 10 with a step of Δt = 0.01, resulting in a total of 1001 points for each distinct ODE case. The two-dimensional domain is fixed for both PINN models and across the simulations.
Training is performed using the Adam optimizer, with an initial learning rate of 0.001 . To improve stability and convergence during training, a learning rate scheduler is employed. Other hyperparameters of the networks are fine-tuned in order to find the best performance.
Training is performed on a CPU-only workstation (Intel Core i9-13900, 32 GB RAM, no discrete GPU). The batch size is always set to the full collocation domain. Depending on the architecture and number of epochs, runs range from a few minutes (width/depth studies) up to 10 h of wall-clock time for the largest StellarNET experiments.
To reconcile different training regimes, we note that ablation and width/depth sensitivity studies are run for 2000–5000 epochs. For the main Lane–Emden benchmarks, Residual PINN is trained up to 90,000 epochs (with an intermediate qualitative result reported at 50,000 epochs), StellarNET is trained for 70,000 epochs, and the supplementary MEMS resonator case for 30,000 epochs.
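A corresponding full-batch training loop could look like the sketch below, reusing the model and loss sketches from earlier; the exponential learning-rate decay and its factor are placeholders, since the text only states that a scheduler is used.

```python
import torch

# Collocation grid: t in (0, 10] with step 0.01 and n in {0, ..., 5}; starting just above
# t = 0 is an assumption made here to keep the 2/t term in the residual finite.
t_vals = torch.arange(0.01, 10.0 + 1e-9, 0.01)
n_vals = torch.arange(0.0, 5.0 + 1e-9, 1.0)
T, N = torch.meshgrid(t_vals, n_vals, indexing="ij")
t_grid, n_grid = T.reshape(-1, 1), N.reshape(-1, 1)

model = ResidualPINN(hidden=128, num_blocks=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9999)  # placeholder scheduler

for epoch in range(50_000):                     # full-batch training over the whole collocation domain
    optimizer.zero_grad()
    loss = lane_emden_loss(model, t_grid, n_grid, lambdas=(1.0, 1.0, 1.0))
    loss.backward()
    optimizer.step()
    scheduler.step()
    if epoch % 1000 == 0:
        print(f"epoch {epoch:6d}  loss {loss.item():.3e}")
```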

4.1. Residual PINN

Before fixing the architectural parameters of our Residual PINN, we conducted a study to evaluate the impact of network width and depth on training performance. Specifically, we took a single residual block and varied the number of neurons by changing the hidden size parameter of the network from 16 to 512. In a parallel experiment, we fixed the hidden size to 128 neurons and varied the number of residual blocks from 1 to 6. We kept all other hyperparameters fixed (the loss function component weights were all set to unity). In both experiments, we plotted the loss on a logarithmic scale; all width simulations were trained for a total of 5000 epochs, while all depth simulations were trained for a total of 2000 epochs. The results of these studies are presented in Figure 2.
In the left panel, we observe that increasing the number of neurons in a residual block improves convergence but with some limitations. Models with 16 and 32 neurons show significantly higher losses, while 64 to 256 neurons consistently yield lower loss curves. The 512-neuron model converges very slowly, likely due to overparameterization or an insufficient number of training epochs. We can state that there is an optimal range (64–256 neurons) where the balance between model complexity and optimization efficiency is best.
In the right panel, we observe that by increasing the number of residual blocks, the convergence speed and stability do not improve significantly. Particularly, we can see some improvements moving from 1 to 2 blocks, or from 2 to 3 blocks, but using a higher number of blocks is not optimal. A small number of residual blocks is sufficient to stabilize training and solve the Lane–Emden equation.
Residual depth seems to play a more decisive role in early convergence and final performance than width alone. A practical architecture for this Residual PINN application would use 2–3 residual blocks with 64–256 hidden size, offering strong convergence.
Based on these observations, we fix our Residual PINN architecture to use two residual blocks with a hidden size of 128 neurons. We train this PINN for a total of 50,000 epochs using the two-dimensional domain constructed earlier. We present qualitative results in Figure 3, where the left panel contains the training results for collocation points in the domain and the right panel shows the extrapolation capability of our PINN at points outside the training set. The numerical Runge–Kutta (RK) solutions are also present in our plots.
From a qualitative point of view, the Residual PINN seems to correctly capture the behavior of the solutions from n = 0 to n = 5 , matching the numerical RK solutions. We have the first visual validation that our model can converge to solutions that match the underlying physics of the system, not just minimize the loss residuals.
In addition, the model's performance on extrapolation points is also impressive. Despite not being trained on fractional indices, the network is able to generalize well to unseen values n = 1.5, 2.5, 3.5, and 4.5, which lie outside the training set. The produced approximations closely follow the expected trends based on the numerical RK solutions.
The mean absolute errors against the RK method are 3.204 × 10^-3 for n = 1.5, 4.169 × 10^-4 for n = 2.5, 2.054 × 10^-4 for n = 3.5, and 2.246 × 10^-4 for n = 4.5. These errors were computed only on the interval [0, 3] since, for fractional indices, the numerical solution is not always mathematically defined (this behavior can also be observed in the right panel of Figure 3, where the RK points disappear when the curves reach the zero axis). This ability to extrapolate across families of differential equations is uncommon in the PINN literature, where models are typically trained to solve a fixed instance of an ODE or PDE rather than generalize across different problems.
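The RK reference curves used here can be reproduced with a standard explicit Runge–Kutta integrator. The sketch below assumes SciPy's solve_ivp, starts the integration just off the singular point t = 0 using the series values θ ≈ 1 - t²/6 and θ′ ≈ -t/3, and terminates at the first zero of θ, which is one way to handle the undefined θ^n for fractional n; these devices are ours, not a description of the released code.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lane_emden_rk(n: float, t_max: float = 10.0, eps: float = 1e-6):
    """Integrate theta'' + (2/t) theta' + theta^n = 0 with an explicit Runge-Kutta method."""
    def rhs(t, y):
        theta, dtheta = y
        return [dtheta, -2.0 / t * dtheta - np.sign(theta) * np.abs(theta) ** n]

    # Series start near t = 0: theta ~ 1 - t^2/6 and theta' ~ -t/3 avoid the singular 2/t term.
    y0 = [1.0 - eps ** 2 / 6.0, -eps / 3.0]

    def first_zero(t, y):
        return y[0]                               # event: theta crosses zero
    first_zero.terminal = True                    # stop there; theta^n is ill-defined beyond, for fractional n

    return solve_ivp(rhs, (eps, t_max), y0, method="RK45", events=first_zero,
                     dense_output=True, rtol=1e-9, atol=1e-10)

sol = lane_emden_rk(n=1.5)
print(sol.t[-1], sol.y[0, -1])                    # radius where theta first reaches zero, approximately
```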
To assess the accuracy of our model after extended training (up to 90,000 epochs), we compare its predictions on the analytically available cases with results from other numerical solvers available in the literature. These benchmark cases are standard for validating Lane–Emden solvers due to the availability of exact solutions and allow a direct comparison of numerical precision. The benchmark methods used for comparison include a Monte Carlo technique (MC-E) [32], an active-set algorithm neural network (AST-NN-E) [39], a Chebyshev neural network (Ch-NN-E) [34], the pattern search optimization technique (PS-E) [40], and a genetic algorithm-based approach (GA-E) [41].
The model was evaluated in the interval t = 0 to t = 1 , focusing on the three polytropic indices for which exact analytical solutions are known: n = 0 , n = 1 , and n = 5 (see Equations (3)–(5)). This benchmarking setup can be observed in Figure 4, where we can quantify how well the Residual PINN performs against traditional solvers in terms of absolute error.
The errors are significantly lower in comparison with other models, with most values lying in the interval [10^-7, 10^-6] and sometimes going as low as 10^-8. We proceed to evaluate our model's performance beyond this range and compare it against the PINN presented in [35]. In particular, we use the same strategic radius points where the previous model was tested against Monte Carlo simulations.
Table 1 summarizes the results of this comparison for n = 0 , n = 1 , and n = 5 . The Residual PINN consistently demonstrates superior performance with significantly lower errors than other numerical methods, particularly in the range t > 1 . This result is expected, given that our Residual PINN architecture includes two residual blocks, which enhance its learning capacity, in contrast to the baseline PINN that uses only two FC layers.
While the Residual PINN achieves highly accurate results in both interpolation and extrapolation scenarios, its architecture is relatively simple, relying only on stacked residual blocks with a uniform hidden width. This raises a natural question: could a more expressive internal structure within the residual blocks improve performance, generalization, or training stability? To explore this, we introduce StellarNET. While we do not necessarily expect StellarNET to outperform the previous PINN's results, its architectural innovations are worth investigating.

4.2. StellarNET

We now explore the performance of StellarNET, for which we have already introduced the mathematical model in a previous section. Before feeding our two-dimensional input to the network, it is first projected into a higher-dimensional space using RFF embedding. This transformation expands the dimensionality from 2 to 2 × num_fourier_features (concatenation of the sine and cosine components). In our case, we fixed num_fourier_features = 64, so the actual input to the neural network has size 128.
The RFF-encoded representation is then passed through a sequence of Stellar blocks, each incorporating dense layers, gating mechanisms, and adaptive skip connections. We fixed the number of blocks to 2, each operating at a fixed internal hidden size of 256. We tried to mirror the architecture setup of the previous PINN; specifically, since the Residual PINN relies on additional pre- and post-processing fully connected layers, we double the hidden size in Stellar blocks to maintain comparable modeling capacity.
In the first experiment, we analyze the training stability of each PINN by running five independent training trials for StellarNET and the Residual PINN, respectively. We track the evolution of the loss function over 5000 epochs, and in each trial, we initialize the network's trainable parameters with a different random seed. The Residual PINN parameters are the same ones used in the previous section, while for StellarNET, the hyperparameters controlling the loss terms are not set to unity: based on preliminary tuning, we fix $\lambda_1 = 10$ and $\lambda_2 = \lambda_3 = 5$.
The results can be observed in Figure 5. Both models show stable convergence with low variance and high reproducibility, and both reach a similar order of magnitude for the final loss, but StellarNET achieves a lower final loss despite its larger loss weights, indicating robustness. The Residual PINN exhibits a few small upward spikes in the loss. StellarNET is not free of local instabilities either, but demonstrates better overall smoothness. In the early training stages, the Residual PINN descends faster, a behavior explained by its loss control hyperparameters being set to unity; however, StellarNET eventually overtakes it and shows more stable long-term behavior.
We continue this study by training StellarNET for a total of 70,000 epochs, and we evaluate its performance in comparison to the Residual PINN baseline obtained earlier. We report the mean absolute error for both training and extrapolated polytropic indices in Figure 6.
StellarNET outperforms the Residual PINN on the training domain across almost all polytropic indices, achieving up to an order of magnitude improvement in MAE for n = 0, 1, 2, 5, the cases in which we were able to compare the neural network approximations with the true analytical functions. It is important to mention that the MAE was computed over the domain $t \in [0, 10]$ for n = 0, 1, 3, 4, 5, while for n = 2, the evaluation was restricted to the interval $t \in [0, 1]$, where the analytical baseline was generated using the series expansion based on the recursion relations (6) and (7).
For the extrapolation cases involving fractional polytropic indices, the MAE was computed over the domain $t \in [0, 3]$ in order to avoid numerical instabilities in the RK method. In the extrapolation cases, the Residual PINN may hold a small advantage over StellarNET; nevertheless, it is encouraging that StellarNET remains competitive in the extrapolation regime as well.

4.3. Supplementary Case Study: MEMS Linear Resonator

As a cross-domain validation of StellarNET, we consider the canonical linear damped resonator, a standard reduced-order model for MEMS devices,
$$\ddot{x}(t) + 2\zeta\omega_0\,\dot{x}(t) + \omega_0^2\, x(t) = 0, \qquad x(0) = 1, \quad \dot{x}(0) = 0,$$
where t denotes time in seconds, ω 0 is the natural frequency, ζ is the damping ratio, and x ( t ) is a dimensionless displacement.
We evaluate three complementary approaches:
  • Analytical reference: The closed-form solution of (37) is $x(t) = e^{-\zeta\omega_0 t}\left(\cos(\omega_d t) + \frac{\zeta}{\sqrt{1-\zeta^2}}\sin(\omega_d t)\right)$, with $\omega_d = \omega_0\sqrt{1-\zeta^2}$.
  • Variational method: We construct a weighted-residual trial expansion $x(\tau) \approx 1 + \tau^2 \sum_{k=0}^{M} a_k \cos(k\pi\tau)$ on normalized time $\tau = t/T$, enforcing $x(0) = 1$ and $x_\tau(0) = 0$. The coefficients $\{a_k\}$ are determined by Galerkin orthogonality of the ODE residual to the trial space (integrals by Gauss–Legendre quadrature, with a multi-element partition for stability). This corresponds to the compact variational treatment widely used in MEMS modeling.
  • PINN (StellarNET): We augment StellarNET with inputs $(\tau, \omega_0)$ and enforce the hard ansatz $x(\tau, \omega_0) = 1 + \tau^2 f_\theta(\tau, \omega_0)$ to satisfy the initial conditions (a minimal sketch follows this list). The physics loss is the mean squared residual of (37) under the chain rule $\frac{d}{dt} = \frac{1}{T}\frac{d}{d\tau}$. We train on eight values $\omega_0 \in [0.8, 1.5]$ with $\Delta\omega_0 = 0.1$ and test on five unseen values $\omega_0 \in \{0.85, 0.95, 1.05, 1.15, 1.25\}$, with fixed damping $\zeta = 0.2$.
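As referenced in the list above, a minimal sketch of the hard-constrained PINN ansatz is given below (in PyTorch). The horizon T, the plain MLP standing in for the StellarNET backbone, and all names are our assumptions; only the ansatz $x(\tau,\omega_0) = 1 + \tau^2 f_\theta(\tau,\omega_0)$ and the chain-rule rescaling follow the description above.

```python
import torch
import torch.nn as nn

T_FINAL, ZETA = 20.0, 0.2   # illustrative time horizon (assumption) and the fixed damping ratio

class HardICResonator(nn.Module):
    """Hard-constrained ansatz x(tau, w0) = 1 + tau^2 * f(tau, w0): x(0) = 1, x_tau(0) = 0 by construction."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        # A plain MLP stands in for the StellarNET backbone to keep the sketch short.
        self.f = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(),
                               nn.Linear(hidden, hidden), nn.Tanh(),
                               nn.Linear(hidden, 1))

    def forward(self, tau: torch.Tensor, w0: torch.Tensor) -> torch.Tensor:
        return 1.0 + tau ** 2 * self.f(torch.cat([tau, w0], dim=-1))

def resonator_loss(model: nn.Module, tau: torch.Tensor, w0: torch.Tensor) -> torch.Tensor:
    """Mean squared residual of x'' + 2*zeta*w0*x' + w0^2*x = 0, with d/dt = (1/T) d/dtau."""
    tau = tau.detach().requires_grad_(True)
    x = model(tau, w0)
    dx = torch.autograd.grad(x, tau, torch.ones_like(x), create_graph=True)[0]
    d2x = torch.autograd.grad(dx, tau, torch.ones_like(dx), create_graph=True)[0]
    x_t, x_tt = dx / T_FINAL, d2x / T_FINAL ** 2     # convert tau-derivatives to time derivatives
    residual = x_tt + 2.0 * ZETA * w0 * x_t + w0 ** 2 * x
    return torch.mean(residual ** 2)

# Collocation in normalized time and over the training frequency range.
tau = torch.rand(512, 1)
w0 = torch.rand(512, 1) * 0.7 + 0.8                  # omega_0 sampled in [0.8, 1.5]
model = HardICResonator()
print(resonator_loss(model, tau, w0).item())
```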
The same StellarNET architecture used for the Lane–Emden equation is now trained for a total of 30,000 epochs, and Figure 7 shows the results obtained using the analytical (solid), PINN (circles), and variational (dotted) approaches. The StellarNET PINN reproduces both the training and extrapolation families with high fidelity, whereas the variational method, although it captures the main damped oscillatory shape, exhibits larger phase and amplitude errors as t increases. Error metrics against the analytical solution are presented in Table 2.

5. Conclusions

In this work, we proposed two PINN architectures—Residual PINN and StellarNET—for solving the Lane–Emden equation using physics-informed training, positioning StellarNET as an architectural enhancement to standard fully connected PINNs. Both models were designed to handle a range of polytropic indices, enabling training across multiple ODE instances and inference on previously unseen cases. We formulated a shared two-dimensional PINN problem and evaluated the models’ performance in terms of accuracy, stability, and extrapolation capability.
The Residual PINN, though simple in structure, demonstrated strong generalization, achieving high precision in both the interpolation and extrapolation regimes. StellarNET, a more expressive architecture inspired by PirateNETs, uses RFF embeddings and gating mechanisms to further improve learning capacity and stability. It achieved lower mean absolute errors across most training scenarios, particularly for the cases with known analytical solutions.
While Residual PINN may exhibit slightly better extrapolation in some cases, StellarNET’s performance remains highly competitive and more robust in convergence. Both models outperform existing numerical benchmarks in standard cases, reinforcing the potential of physics-informed learning frameworks for astrophysical modeling.
To further demonstrate the versatility of our StellarNET approach, we included a supplementary ODE study for MEMS linear resonators. Using the same StellarNET architecture, a single network was trained on multiple natural frequencies and successfully generalized to unseen ones, with results compared against both analytical solutions and a compact variational baseline. This highlights the method’s ability to interpolate and extrapolate across parameterized families of ODEs.
Future research can aim at extending these models to more complex differential equations arising in astrophysical contexts, such as the Tolman–Oppenheimer–Volkoff equation. Future work could also address the limitations of the current study, such as extending the models to stiff regimes or more complex boundary geometries. Moreover, strategies like domain decomposition, adaptive mesh refinement, or transformer-based PINN architectures could improve scalability and precision. Beyond astrophysics, our results suggest that StellarNET can serve as a broadly applicable computational tool for the mathematical modeling of physical systems.

Author Contributions

Conceptualization, A.-I.M. and C.-A.P.; methodology, A.-I.M. and C.-A.P.; software, A.-I.M.; validation, A.-I.M.; formal analysis, A.-I.M.; investigation, A.-I.M.; resources, C.-A.P.; data curation, A.-I.M.; writing—original draft preparation, A.-I.M.; writing—review and editing, C.-A.P.; visualization, A.-I.M.; supervision, C.-A.P.; project administration, C.-A.P.; funding acquisition, C.-A.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding. The APC was funded by Politehnica University of Timișoara.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code supporting this study is openly available at GitHub, commit 86f422d https://github.com/AndreiMohut/LaneEmden-PINN-Solver/commit/86f422d13fbb73b32befcf268ca9e038218a1c16 (accessed on 4 September 2025). The repository includes a run script (train.py) and an example configuration file. Dependencies are specified in the README.md (Python 3.10.12, PyTorch 2.5.1, using torch.autograd for automatic differentiation).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  2. Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989, 2, 359–366. [Google Scholar] [CrossRef]
  3. Lagaris, I.; Likas, A.; Fotiadis, D. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000. [Google Scholar] [CrossRef]
  4. Lee, H.; Kang, I.S. Neural algorithm for solving differential equations. J. Comput. Phys. 1990, 91, 110–131. [Google Scholar] [CrossRef]
  5. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. SIAM Rev. 2021, 63, 208–228. [Google Scholar] [CrossRef]
  6. Raissi, M.; Yazdani, A.; Karniadakis, G. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 2020, 367, eaaw4741. [Google Scholar] [CrossRef]
  7. Mathews, A.; Francisquez, M.; Hughes, J.W.; Hatch, D.R.; Zhu, B.; Rogers, B.N. Uncovering turbulent plasma dynamics via deep learning from partial observations. Phys. Rev. E 2021, 104, 025205. [Google Scholar] [CrossRef]
  8. Costabal, F.S.; Yang, Y.; Perdikaris, P.; Hurtado, D.E.; Kuhl, E. Physics-Informed Neural Networks for Cardiac Activation Mapping. Front. Phys. 2020, 8, 42. [Google Scholar] [CrossRef]
  9. Fang, Z.; Zhan, J. Deep Physical Informed Neural Networks For Metamaterial Design. IEEE Access 2019, 8, 24506–24513. [Google Scholar] [CrossRef]
  10. Kovacs, A.; Exl, L.; Kornell, A.; Fischbacher, J.; Hovorka, M.; Gusenbauer, M.; Breth, L.; Özelt, H.; Yano, M.; Sakuma, N.; et al. Conditional physics informed neural networks. arXiv 2021, arXiv:2104.02741. [Google Scholar] [CrossRef]
  11. Smith, J.; Ross, Z.; Azizzadenesheli, K.; Muir, J. HypoSVI: Hypocenter inversion with Stein variational inference and Physics Informed Neural Networks. Geophys. J. Int. 2021, 228, 698–710. [Google Scholar] [CrossRef]
  12. Wang, S.; Teng, Y.; Perdikaris, P. Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J. Sci. Comput. 2021, 43, A3055–A3081. [Google Scholar] [CrossRef]
  13. Wang, S.; Yu, X.; Perdikaris, P. When and why PINNs fail to train: A neural tangent kernel perspective. J. Comput. Phys. 2022, 449, 110768. [Google Scholar] [CrossRef]
  14. Wang, S.; Sankaran, S.; Perdikaris, P. Respecting causality is all you need for training physics-informed neural networks. arXiv 2022, arXiv:2203.07404. [Google Scholar] [CrossRef]
  15. Wang, S.; Li, B.; Chen, Y.; Perdikaris, P. PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks. J. Mach. Learn. Res. 2024, 25, 1–51. [Google Scholar]
  16. Abdolvand, R.; Bahreyni, B.; Lee, J.; Nabki, F. Micromachined Resonators: A Review. Micromachines 2016, 7, 160. [Google Scholar] [CrossRef]
  17. Kim, K.; Lee, J. Stochastic Memristor Modeling Framework Based on Physics-Informed Neural Networks. Appl. Sci. 2024, 14, 9484. [Google Scholar] [CrossRef]
  18. Nguyen, B.H.; Torri, G.; Rochus, V. Physics-informed neural networks with data-driven in modeling and characterizing piezoelectric micro-bender. J. Micromech. Microeng. 2024, 34, 115004. [Google Scholar] [CrossRef]
  19. Chandrasekhar, S. An Introduction to the Study of Stellar Structure; Dover Publications, Inc.: New York, NY, USA, 1957. [Google Scholar]
  20. Lane, J.H. On the theoretical investigation of the internal constitution of stars. Am. J. Sci. 1870, 50, 57–74. [Google Scholar] [CrossRef]
  21. Emden, R. Gaskugeln: Anwendungen der Mechanischen Wärmetheorie auf Kosmologische und Meteorologische Probleme; Teubner: Leipzig, Germany, 1907. [Google Scholar]
  22. Pleyer, J. Zero Values of the TOV Equation. arXiv 2024, arXiv:2411.15264. Available online: http://arxiv.org/abs/2411.15264 (accessed on 4 September 2025). [CrossRef]
  23. Horedt, G. Polytropes: Applications in astrophysics and related fields. Astrophys. Space Sci. 1986, 126, 357–408. [Google Scholar] [CrossRef]
  24. Shawagfeh, N. Non-perturbative solution of Lane–Emden equation using Adomian decomposition method. Appl. Math. Comput. 1993, 77, 81–88. [Google Scholar]
  25. Wazwaz, A.M. A new algorithm for calculating Adomian polynomials for nonlinear operators. Appl. Math. Comput. 2001, 111, 33–51. [Google Scholar] [CrossRef]
  26. Ramos, J. Series solution of the Lane–Emden equation with a non-linear term by Adomian’s method. J. Comput. Appl. Math. 2008, 214, 223–228. [Google Scholar]
  27. Nouh, M.I. On the Lane–Emden equation and its solutions. Astrophys. Space Sci. 2004, 291, 47–56. [Google Scholar]
  28. Dehghan, M.; Shakeri, F. Solution of Lane–Emden type equations using the variational iteration method. Phys. Lett. A 2008, 372, 3921–3925. [Google Scholar]
  29. Wei, C.-F. Application of the homotopy perturbation method for solving fractional Lane–Emden type equation. Therm. Sci. 2019, 23, 2237–2244. [Google Scholar] [CrossRef]
  30. Ramos, J. Piecewise-adaptive decomposition methods. Chaos Solitons Fractals 2009, 40, 1623–1636. [Google Scholar] [CrossRef]
  31. Mukherjee, S. Solution of Lane–Emden type equations by Differential Transform Method. Appl. Math. Model. 2011, 35, 544–554. [Google Scholar]
  32. El-Essawy, S.H.; Nouh, M.I.; Soliman, A.A.; Abdel Rahman, H.I.; Abd-Elmougod, G.A. Monte Carlo Simulation of Lane–Emden Type Equations Arising in Astrophysics. Comput. Astrophys. J. 2021, 42, 56–67. [Google Scholar] [CrossRef]
  33. Nouh, M.I.; Azzam, Y.A.; Abdel-Salam, E. Modeling fractional polytropic gas spheres using artificial neural network. Neural Comput. Appl. 2021, 33, 7000–7015. [Google Scholar] [CrossRef]
  34. Mall, S.; Chakraverty, S. Chebyshev neural network based model for solving Lane–Emden type equations. Appl. Math. Comput. 2014, 247, 100–114. [Google Scholar] [CrossRef]
  35. Baty, H. Modelling Lane–Emden type equations using Physics-Informed Neural Networks. J. Comput. Phys. 2021, 452, 110–122. [Google Scholar] [CrossRef]
  36. Raissi, M.; Perdikaris, P.; Karniadakis, G. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  37. Tancik, M.; Srinivasan, P.P.; Mildenhall, B.; Fridovich-Keil, S.; Raghavan, N.; Singhal, U.; Ramamoorthi, R.; Barron, J.T.; Ng, R. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual, 6–12 December 2020; Curran Associates, Inc.: Nice, France, 2020. [Google Scholar]
  38. Savarese, P.H.P.; Mazza, L.O.; Figueiredo, D.R. Learning Identity Mappings with Residual Gates. arXiv 2016, arXiv:1611.01260. [Google Scholar] [CrossRef]
  39. Iftikhar, A.; Raja, M.A.Z.; Bilal, M.; Farooq, A. Neural network methods to solve the Lane–Emden type equations arising in thermodynamic studies of the spherical gas cloud model. Neural Comput. Appl. 2017, 28, 929–944. [Google Scholar] [CrossRef]
  40. Lewis, R.; Torczon, V. Pattern search methods for linearly constrained minimization. SIAM J. Optim. 2000, 10, 917–941. [Google Scholar] [CrossRef]
  41. Ahmad, I.; Raja, M.A.Z.; Bilal, M.; Ashraf, F. Bio-inspired computational heuristics to study Lane–Emden systems arising in astrophysics model. SpringerPlus 2016, 5, 1866. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Schematic of the Physics-Informed Neural Network (PINN) architecture. The network minimizes the residuals of the governing equation, boundary conditions, and initial conditions through the loss functions $\mathcal{L}_p(\omega)$, $\mathcal{L}_b(\omega)$, and $\mathcal{L}_i(\omega)$, respectively.
Figure 2. Training loss evolution (log scale) for Residual PINNs by varying width–depth. Left (width study): 5000 epochs; right (depth study): 2000 epochs. Left: increasing neurons in a single residual block improves convergence up to 256 neurons, with diminished returns beyond. Right: increasing residual block depth with fixed hidden size (128) shows optimal loss around 2–3 blocks.
Figure 3. Residual PINN predictions at different polytropic indices. Left: model predictions for points inside the training domain with polytropic index n { 0 , 1 , 2 , 3 , 4 , 5 } . Right: extrapolation results for fractional values of n { 1.5 , 2.5 , 3.5 , 4.5 } reveal the model’s generalization capabilities. In both cases, the results are compared to numerical solutions obtained via a Runge–Kutta (RK) solver. All plots closely follow the RK trends, even the extrapolation simulations, despite not training the Residual PINN in those points. Residual PINN shown after being trained for 50,000 epochs.
Figure 4. Comparison of absolute errors for n = 0 , n = 1 , and n = 5 between various methods and the Residual PINN model after 90,000 epochs.
Figure 5. Training loss curves for StellarNET and Residual PINN across five independent runs. Shaded regions denote the min–max range across seeds, with solid lines showing the mean trajectory; each run trained for 5000 epochs.
Figure 6. Mean absolute error (log scale) comparison between Residual PINN and StellarNET across both training and extrapolation regimes for various polytropic indices. Residual PINN was trained for 50,000 epochs, and StellarNET for 70,000 epochs.
Figure 7. StellarNET predictions at different training and extrapolation frequencies for a MEMS linear resonator $\ddot{x} + 2\zeta\omega_0\dot{x} + \omega_0^2 x = 0$ with ζ = 0.2. Left: training frequencies; right: unseen test frequencies. Solid = analytical reference, circles = PINN (StellarNET), dotted = variational method. The PINN was trained for 30,000 epochs. Different colors depict different frequencies.
Table 1. Error comparison for n = 0, n = 1, and n = 5. Single-run errors are reported for the Residual PINN trained for 90,000 epochs.
t | Residual PINN Error | FC PINN Error [35] | Monte Carlo Error
n = 0
0.00 | 4.2 × 10^-7 | 1.4 × 10^-6 | 0.0 × 10^0
0.30 | 1.0 × 10^-8 | 1.6 × 10^-5 | 1.0 × 10^-4
0.60 | 4.1 × 10^-7 | 8.0 × 10^-6 | 2.0 × 10^-4
0.90 | 1.3 × 10^-6 | 1.8 × 10^-5 | 3.0 × 10^-4
1.20 | 8.2 × 10^-7 | 1.1 × 10^-5 | 4.0 × 10^-4
1.50 | 4.8 × 10^-7 | 1.4 × 10^-5 | 5.0 × 10^-4
1.80 | 9.2 × 10^-7 | 1.7 × 10^-5 | 9.0 × 10^-4
2.10 | 1.2 × 10^-6 | 1.1 × 10^-5 | 7.0 × 10^-4
2.40 | 1.2 × 10^-6 | 1.3 × 10^-5 | 8.0 × 10^-4
n = 1
0.00 | 1.1 × 10^-6 | 3.1 × 10^-6 | 0.0 × 10^0
0.40 | 1.1 × 10^-6 | 3.3 × 10^-5 | 1.0 × 10^-4
0.80 | 4.5 × 10^-7 | 2.7 × 10^-5 | 3.0 × 10^-4
1.20 | 1.7 × 10^-6 | 1.9 × 10^-5 | 4.0 × 10^-4
1.60 | 5.0 × 10^-8 | 2.5 × 10^-5 | 4.0 × 10^-4
2.00 | 3.4 × 10^-6 | 1.1 × 10^-5 | 4.0 × 10^-4
2.40 | 3.2 × 10^-6 | 1.3 × 10^-5 | 3.0 × 10^-4
2.80 | 1.9 × 10^-6 | 3.3 × 10^-5 | 3.0 × 10^-4
n = 5
0.00 | 4.8 × 10^-7 | 2.1 × 10^-5 | 0.0 × 10^0
1.00 | 2.7 × 10^-6 | 4.7 × 10^-6 | 2.0 × 10^-4
2.00 | 4.7 × 10^-6 | 2.1 × 10^-5 | 1.0 × 10^-4
3.00 | 7.8 × 10^-6 | 2.2 × 10^-5 | 3.0 × 10^-4
4.00 | 8.3 × 10^-6 | 2.1 × 10^-5 | 3.0 × 10^-4
5.00 | 9.5 × 10^-6 | 2.4 × 10^-5 | 7.0 × 10^-4
6.00 | 9.9 × 10^-6 | 1.5 × 10^-5 | 1.0 × 10^-4
Table 2. Errors vs. analytical solution for the MEMS resonator ( ζ = 0.2 ). Values are mean absolute error (MAE) and root mean square error (RMSE). Single-run errors are reported, with no averaging over seeds.
ω0 | PINN MAE | PINN RMSE | Variational MAE | Variational RMSE
0.80 | 2.32 × 10^-3 | 2.70 × 10^-3 | 5.63 × 10^-2 | 6.56 × 10^-2
0.85 | 1.66 × 10^-3 | 2.11 × 10^-3 | 6.39 × 10^-2 | 7.37 × 10^-2
0.90 | 1.63 × 10^-3 | 1.96 × 10^-3 | 7.20 × 10^-2 | 8.20 × 10^-2
0.95 | 8.81 × 10^-4 | 1.10 × 10^-3 | 8.00 × 10^-2 | 8.90 × 10^-2
1.00 | 6.33 × 10^-4 | 7.81 × 10^-4 | 8.60 × 10^-2 | 9.43 × 10^-2
1.05 | 5.74 × 10^-4 | 7.56 × 10^-4 | 8.88 × 10^-2 | 9.82 × 10^-2
1.10 | 3.30 × 10^-4 | 4.07 × 10^-4 | 8.92 × 10^-2 | 1.02 × 10^-1
1.15 | 2.68 × 10^-4 | 4.45 × 10^-4 | 9.07 × 10^-2 | 1.05 × 10^-1
1.20 | 5.68 × 10^-4 | 7.63 × 10^-4 | 9.38 × 10^-2 | 1.09 × 10^-1
1.25 | 8.92 × 10^-4 | 1.30 × 10^-3 | 9.68 × 10^-2 | 1.13 × 10^-1
1.30 | 1.61 × 10^-3 | 2.63 × 10^-3 | 9.86 × 10^-2 | 1.16 × 10^-1
1.40 | 2.92 × 10^-3 | 4.44 × 10^-3 | 1.01 × 10^-1 | 1.20 × 10^-1
1.50 | 6.13 × 10^-3 | 8.14 × 10^-3 | 1.03 × 10^-1 | 1.25 × 10^-1
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
