Stable and Efficient Gaussian-Based Kolmogorov–Arnold Networks

Pasquale De Luca; Emanuel Di Nardo; Livia Marcellino; Angelo Ciaramella

doi:10.3390/math14030513

Department of Science and Technology, Parthenope University of Naples, Centro Direzionale C4, I-80143 Naples, Italy

* Authors to whom correspondence should be addressed.

Mathematics 2026, 14(3), 513; https://doi.org/10.3390/math14030513
(registering DOI)

This article belongs to the Special Issue Advances in High-Performance Computing, Optimization and Simulation

Abstract

Kolmogorov–Arnold Networks employ learnable univariate activation functions on edges rather than fixed node nonlinearities. Standard B-spline implementations require $O(3KW)$ parameters per layer ($K$ basis functions, $W$ connections). We introduce shared Gaussian radial basis functions with learnable centers $\mu_{k}^{(l)}$ and widths $\sigma_{k}^{(l)}$ maintained globally per layer, reducing parameter complexity to $O(KW + 2LK)$ for $L$ layers, a threefold reduction, while preserving Sobolev convergence rates $O(h^{s - \Omega})$. Width clamping at $\sigma_{\min} = 10^{-6}$ and tripartite regularization ensure numerical stability. On MNIST with architecture $[784, 128, 10]$ and $K = 5$, RBF-KAN achieves $87.8\%$ test accuracy versus $89.1\%$ for B-spline KAN, with a $1.4\times$ speedup and 33% memory reduction, though the generalization gap increases from $1.1\%$ to $2.7\%$ due to global Gaussian support. Physics-informed neural networks demonstrate substantial improvements on partial differential equations: elliptic problems exhibit a $45\times$ reduction in PDE residual, with maximum pointwise error decreasing from $1.32$ to $0.18$; parabolic problems achieve a $2.1\times$ accuracy gain; hyperbolic wave equations show a $19.3\times$ improvement in maximum error and a $6.25\times$ reduction in $L^{2}$ norm. The superior hyperbolic performance derives from the infinite differentiability of Gaussian bases, which enables accurate high-order derivatives without polynomial dissipation. Ablation studies confirm that coefficient regularization reduces mean error by 40%, while center diversity prevents basis collapse. An optimal basis count of $K \in [3, 5]$ balances expressiveness and overfitting. The architecture establishes Gaussian RBFs as efficient alternatives to B-splines for learnable activation networks, with advantages in scientific computing.
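As a rough illustration of the shared-basis idea and the parameter counts quoted above, the following minimal sketch evaluates one layer whose edges all reuse the same $K$ Gaussian basis functions, with widths clamped at $\sigma_{\min}$. It is not the authors' implementation. The function names, the Gaussian form $\exp(-(x-\mu)^2 / (2\sigma^2))$, the nested-list layout of the coefficients, and the reading of $W$ as the total number of connections across all layers are assumptions made for illustration only.

```python
import math

SIGMA_MIN = 1e-6  # width clamp value quoted in the abstract


def gaussian_basis(x, centers, widths, sigma_min=SIGMA_MIN):
    """Evaluate K shared Gaussian basis functions at a scalar input x.

    Assumed form: exp(-(x - mu)^2 / (2 sigma^2)); the abstract does not
    spell out the exact parameterization.
    """
    values = []
    for mu, sigma in zip(centers, widths):
        s = max(sigma, sigma_min)  # clamp so the division below stays finite
        values.append(math.exp(-((x - mu) ** 2) / (2.0 * s * s)))
    return values


def rbf_kan_layer(x, coeffs, centers, widths):
    """Forward pass y_j = sum_i sum_k coeffs[j][i][k] * phi_k(x_i).

    centers and widths (K values each) are shared by every edge of the
    layer; only coeffs is stored per edge (hypothetical layout).
    """
    phi = [gaussian_basis(xi, centers, widths) for xi in x]  # n_in rows of K values
    return [
        sum(c * p for c_ji, phi_i in zip(c_j, phi) for c, p in zip(c_ji, phi_i))
        for c_j in coeffs
    ]


def connections(layer_sizes):
    """Total number of edges W in a fully connected architecture."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def bspline_param_count(layer_sizes, K):
    """Count following the abstract's 3*K*W figure for B-spline KANs."""
    return 3 * K * connections(layer_sizes)


def shared_rbf_param_count(layer_sizes, K):
    """Count following K*W + 2*L*K: per-edge coefficients plus K shared
    centers and K shared widths for each of the L layers."""
    n_layers = len(layer_sizes) - 1
    return K * connections(layer_sizes) + 2 * n_layers * K


if __name__ == "__main__":
    sizes, K = [784, 128, 10], 5
    print(bspline_param_count(sizes, K))     # 1524480
    print(shared_rbf_param_count(sizes, K))  # 508180
```

Under these assumptions, the $[784, 128, 10]$ architecture with $K = 5$ gives a ratio very close to three between the two counts, in line with the threefold reduction stated above. These are counts derived from the asymptotic formulas only, not figures reported by the authors, and they need not match the measured memory reduction.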

Keywords:

Kolmogorov–Arnold networks; radial basis functions; universal approximation; Sobolev spaces; numerical stability; parameter efficiency; shared-basis architecture
