Fractional Derivative in LSTM Networks: Adaptive Neuron Shape Modeling with the Grünwald–Letnikov Method
Abstract
1. Introduction
2. Materials and Methods
2.1. Mathematical Model of the Proposed Fractional Order LSTM Cell
2.2. Forward Model
2.2.1. Gates
2.2.2. Candidate Cell State
2.2.3. Update of Memory Vectors
2.3. Backpropagation Through Time
2.4. Methodology for Convergence Analysis
3. Results and Discussion
3.1. Task Definition and Data
3.2. Experimental Setup
3.3. Overall Results and Learning Dynamics
3.4. Industrial Addendum: Compact RLM-Line Evaluation
4. Conclusions
- Stabilizes gradient flow and eliminates the vanishing-gradient bottleneck without architectural modifications.
- Introduces a continuous control dimension through the fractional order ν, enabling smooth adaptation of the neuron’s transfer function between linear and nonlinear regimes (see the sketch after this list).
- Improves convergence and generalization, achieving up to 1.6× faster training and higher accuracy in industrial anomaly-detection tasks.
- Bridges discrete and continuous learning representations, establishing a unified framework for modeling memory depth and temporal smoothness in recurrent systems.
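
The continuous control dimension above comes from the Grünwald–Letnikov (GL) operator, whose truncated discrete form interpolates a transfer function between its ν = 0 (identity) and ν = 1 (first-derivative) limits. The following minimal NumPy sketch is illustrative only, not the authors' implementation; the symbol ν, the step h, and the truncation length n_terms are assumptions made for the example:

```python
import numpy as np

def gl_coefficients(nu: float, n_terms: int) -> np.ndarray:
    """GL weights w_k = (-1)^k * C(nu, k), built with the standard recurrence."""
    w = np.empty(n_terms)
    w[0] = 1.0
    for k in range(1, n_terms):
        w[k] = w[k - 1] * (1.0 - (nu + 1.0) / k)
    return w

def gl_derivative(f, x: np.ndarray, nu: float, h: float = 1e-2,
                  n_terms: int = 64) -> np.ndarray:
    """Truncated GL derivative: D^nu f(x) ~ h**(-nu) * sum_k w_k * f(x - k*h)."""
    w = gl_coefficients(nu, n_terms)
    k = np.arange(n_terms)
    samples = f(x[:, None] - k[None, :] * h)  # f evaluated on the backward-shifted grid
    return (samples @ w) / h**nu

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
x = np.linspace(-4.0, 4.0, 5)
for nu in (0.0, 0.5, 0.9, 1.0):
    print(f"nu={nu:.2f}:", np.round(gl_derivative(sigmoid, x, nu), 4))
```

At ν = 0 the weights reduce to (1, 0, 0, ...), returning the sigmoid itself; at ν = 1 they reduce to the backward difference (1, -1, 0, ...)/h, approximating its slope. Intermediate values of ν trace the family of smoothly morphing neuron shapes referred to above.
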
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| FD–LSTM | Fractional Derivative Long Short-Term Memory network |
| LSTM | Long Short-Term Memory (recurrent neural network) |
| GL | Grünwald–Letnikov (fractional derivative operator) |
| BPTT | Backpropagation Through Time |
| RNN | Recurrent Neural Network |
| CNN | Convolutional Neural Network |
| SGD | Stochastic Gradient Descent |
| GPU | Graphics Processing Unit |
| CPU | Central Processing Unit |
References
- Gomolka, Z.; Dudek-Dyduch, E.; Kondratenko, Y. From Homogeneous Network to Neural Nets with Fractional Derivative Mechanism. In Proceedings of the International Conference on Artificial Intelligence and Soft Computing, Zakopane, Poland, 11–15 June 2017; Volume 5, pp. 52–63.
- Gomolka, Z. Backpropagation algorithm with fractional derivatives. ITM Web Conf. 2018, 21, 00004.
- Gomolka, Z. Neurons’ Transfer Function Modeling with the Use of Fractional Derivative. In Proceedings of the International Conference on Dependability and Complex Systems, Brunów, Poland, 2–6 July 2018; Zamojski, W., Mazurkiewicz, J., Sugier, J., Walkowiak, T., Kacprzyk, J., Eds.; pp. 218–227.
- Gomolka, Z. Fractional Backpropagation Algorithm—Convergence for the Fluent Shapes of the Neuron Transfer Function. In Proceedings of the International Conference on Neural Information Processing, Bangkok, Thailand, 18–22 November 2020; pp. 580–588.
- Kumar, M.; Mehta, U. Enhancing the performance of CNN models for pneumonia and skin cancer detection using novel fractional activation function. Appl. Soft Comput. 2025, 168, 112500.
- Liu, C.G.; Wang, J.L. Passivity of fractional-order coupled neural networks with multiple state/derivative couplings. Neurocomputing 2021, 455, 379–389.
- Mohanrasu, S.; Priyanka, T.; Gowrisankar, A.; Kashkynbayev, A.; Udhayakumar, K.; Rakkiyappan, R. Fractional derivative of Hermite fractal splines on the fractional-order delayed neural networks synchronization. Commun. Nonlinear Sci. Numer. Simul. 2025, 140, 108399.
- Wei, J.L.; Wu, G.C.; Liu, B.Q.; Zhao, Z. New semi-analytical solutions of the time-fractional Fokker–Planck equation by the neural network method. Optik 2022, 259, 168896.
- Solís-Pérez, J.; Gómez-Aguilar, J.; Atangana, A. A fractional mathematical model of breast cancer competition model. Chaos Solitons Fractals 2019, 127, 38–54.
- Ganji, R.; Jafari, H.; Moshokoa, S.; Nkomo, N. A mathematical model and numerical solution for brain tumor derived using fractional operator. Results Phys. 2021, 28, 104671.
- Vignesh, D.; Banerjee, S. Dynamical analysis of a fractional discrete-time vocal system. Nonlinear Dyn. 2023, 111, 4501–4515.
- Al-Qurashi, M.; Asif, Q.U.A.; Chu, Y.M.; Rashid, S.; Elagan, S. Complexity analysis and discrete fractional difference implementation of the Hindmarsh–Rose neuron system. Results Phys. 2023, 51, 106627.
- Alsharidi, A.K.; Rashid, S.; Elagan, S.K. Short-memory discrete fractional difference equation wind turbine model and its inferential control of a chaotic permanent magnet synchronous transformer in time-scale analysis. AIMS Math. 2023, 8, 19097–19120.
- Brzeziński, D.W.; Ostalczyk, P. About accuracy increase of fractional order derivative and integral computations by applying the Grünwald–Letnikov formula. Commun. Nonlinear Sci. Numer. Simul. 2016, 40, 151–162.
- MacDonald, C.L.; Bhattacharya, N.; Sprouse, B.P.; Silva, G.A. Efficient computation of the Grünwald–Letnikov fractional diffusion derivative using adaptive time step memory. J. Comput. Phys. 2015, 297, 221–236.
- Türkmen, M.R. Outlier-Robust Convergence of Integer- and Fractional-Order Difference Operators in Fuzzy-Paranormed Spaces: Diagnostics and Engineering Applications. Fractal Fract. 2025, 9, 667.
- Öğünmez, H.; Türkmen, M.R. Statistical Convergence for Grünwald–Letnikov Fractional Differences: Stability, Approximation, and Diagnostics in Fuzzy Normed Spaces. Axioms 2025, 14, 725.
- Yao, Z.; Yang, Z.; Gao, J. Unconditional stability analysis of the Grünwald–Letnikov method for fractional-order delay differential equations. Chaos Solitons Fractals 2023, 177, 114193.
- Zuñiga Aguilar, C.; Gómez-Aguilar, J.; Alvarado-Martínez, V.; Romero-Ugalde, H. Fractional order neural networks for system identification. Chaos Solitons Fractals 2020, 130, 109444.
- Panda, S.K.; Kalla, K.S.; Nagy, A.; Priyanka, L. Numerical simulations and complex valued fractional order neural networks via (ε − μ)-uniformly contractive mappings. Chaos Solitons Fractals 2023, 173, 113738.
- Sivalingam, S.M.; Kumar, P.; Govindaraj, V. A neural networks-based numerical method for the generalized Caputo-type fractional differential equations. Math. Comput. Simul. 2023, 213, 302–323.
- Anwar, N.; Raja, M.A.Z.; Kiani, A.K.; Ahmad, I.; Shoaib, M. Autoregressive exogenous neural structures for synthetic datasets of olive disease control model with fractional Grünwald–Letnikov solver. Comput. Biol. Med. 2025, 187, 109707.
- El Akhal, H.; Ben Yahya, A.; Moussa, N.; El Belrhiti El Alaoui, A. A novel approach for image-based olive leaf diseases classification using a deep hybrid model. Ecol. Inform. 2023, 77, 102276.
- Jones, A.M.; Itti, L.; Sheth, B.R. Expert-level sleep staging using an electrocardiography-only feed-forward neural network. Comput. Biol. Med. 2024, 176, 108545.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Gomolka, Z.; Zeslawska, E.; Olbrot, L. Using Hybrid LSTM Neural Networks to Detect Anomalies in the Fiber Tube Manufacturing Process. Appl. Sci. 2025, 15, 1383.
- Pittino, F.; Puggl, M.; Moldaschl, T.; Hirschl, C. Automatic Anomaly Detection on In-Production Manufacturing Machines Using Statistical Learning Methods. Sensors 2020, 20, 2344.
- Abdelli, K.; Cho, J.Y.; Azendorf, F.; Griesser, H.; Tropschug, C.; Pachnicke, S. Machine Learning-based Anomaly Detection in Optical Fiber Monitoring. arXiv 2022, arXiv:2204.07059.
- Abdallah, M.; Joung, B.G.; Lee, W.J.; Mousoulis, C.; Raghunathan, N.; Shakouri, A.; Sutherland, J.W.; Bagchi, S. Anomaly Detection and Inter-Sensor Transfer Learning on Smart Manufacturing Datasets. Sensors 2023, 23, 486.
- Guo, W.; Jiang, P. Weakly supervised anomaly detection with privacy preservation under a Bi-Level Federated learning framework. Expert Syst. Appl. 2024, 254, 124450.
- Iqbal Basheer, M.Y.; Mohd Ali, A.; Abdul Hamid, N.H.; Mohd Ariffin, M.A.; Osman, R.; Nordin, S.; Gu, X. Autonomous anomaly detection for streaming data. Knowl.-Based Syst. 2024, 284, 111235.
- Kang, B.; Zhong, Y.; Sun, Z.; Deng, L.; Wang, M.; Zhang, J. MSTAD: A masked subspace-like transformer for multi-class anomaly detection. Knowl.-Based Syst. 2024, 283, 111186.
- Lyu, S.; Mo, D.; Wong, W. REB: Reducing biases in representation for industrial anomaly detection. Knowl.-Based Syst. 2024, 290, 111563.
- Shen, L.; Wei, Y.; Wang, Y.; Li, H. AFMF: Time series anomaly detection framework with modified forecasting. Knowl.-Based Syst. 2024, 296, 111912.
- Williams, R.J.; Zipser, D. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks. Neural Comput. 1989, 1, 270–280.
- Ribeiro, E.; Mancho, R.A. Incremental construction of LSTM recurrent neural network. Res. Comput. Sci. 2002, 1, 171–184.
- Staudemeyer, R.C.; Morris, E.R. Understanding LSTM—A tutorial into Long Short-Term Memory Recurrent Neural Networks. arXiv 2019, arXiv:1909.09586.
- Podlubny, I. Fractional Differential Equations: An Introduction to Fractional Derivatives, Fractional Differential Equations, to Methods of Their Solution and Some of Their Applications; Mathematics in Science and Engineering; Academic Press: London, UK, 1999.
- Ortigueira, M.D.; Tenreiro Machado, J. What is a fractional derivative? J. Comput. Phys. 2015, 293, 4–13.
- Wei, J.L.; Wu, G.C.; Liu, B.Q.; Nieto, J.J. An optimal neural network design for fractional deep learning of logistic growth. Neural Comput. Appl. 2023, 35, 10837–10846.
- Area, I.; Nieto, J. Power series solution of the fractional logistic equation. Phys. A Stat. Mech. Its Appl. 2021, 573, 125947.
- Fan, Q.; Wu, G.C.; Fu, H. A Note on Function Space and Boundedness of the General Fractional Integral in Continuous Time Random Walk. J. Nonlinear Math. Phys. 2022, 29, 95–102.
- Gai, M.; Cui, S.; Liang, S.; Liu, X. Frequency distributed model of Caputo derivatives and robust stability of a class of multi-variable fractional-order neural networks with uncertainties. Neurocomputing 2016, 202, 91–97.
- Sabir, Z.; Ali, M.R. Analysis of perturbation factors and fractional order derivatives for the novel singular model using the fractional Meyer wavelet neural networks. Chaos Solitons Fractals X 2023, 11, 100100.
- Werbos, P. Backpropagation through time: What it does and how to do it. Proc. IEEE 1990, 78, 1550–1560.
- Huang, Z.; Haider, Q.; Sabir, Z.; Arshad, M.; Siddiqui, B.K.; Alam, M.M. A neural network computational structure for the fractional order breast cancer model. Sci. Rep. 2023, 13, 22756.
- Chu, Y.M.; Alzahrani, T.; Rashid, S.; Rashidah, W.; ur Rehman, S.; Alkhatib, M. An advanced approach for the electrical responses of discrete fractional-order biophysical neural network models and their dynamical responses. Sci. Rep. 2023, 13, 18180.
- Panda, S.K.; Abdeljawad, T.; Nagy, A.M. On uniform stability and numerical simulations of complex valued neural networks involving generalized Caputo fractional order. Sci. Rep. 2024, 14, 4073.
- Yang, B.; Lei, Y.; Li, N.; Li, X.; Si, X.; Chen, C. Balance recovery and collaborative adaptation approach for federated fault diagnosis of inconsistent machine groups. Knowl.-Based Syst. 2025, 317, 113480.





| Category | Representative Works | Method/Approach | Focus/Model | Key Contribution | Limitations |
|---|---|---|---|---|---|
| Fractional neural foundations | [1,2,3,4,5,6,7,8] | Fractional derivatives (Caputo, GL) applied to neural structures | Theoretical formulation of fractional-order neurons and backpropagation | Demonstrated improved adaptability and convergence in learning | Limited validation on recurrent or large-scale models |
| Fractional dynamic systems and applications | [9,10,11,12,13] | Fractional differential equations; chaotic and oscillatory models | Modeling long-term memory, stability, and nonlinear transitions | Showed fractional differentiation captures complex temporal dependencies | No integration with neural or learning frameworks |
| Mathematical foundations of GL operators | [14,15,16,17,18] | Grünwald–Letnikov discrete formulations; adaptive-memory solvers | Numerical accuracy, convergence, and stability of discrete fractional operators | Provided proofs of robustness and efficiency for GL discretization | Lack of neural application or learning-based context |
| Fractional neural architectures | [19,20,21] | Fractional operators embedded in FONN, complex-valued, and Caputo-type networks | Neural architectures with fractional gradients or activations | Improved gradient smoothness and dynamic response | Applied only to feed-forward or static networks |
| Applied fractional models | [22,23,24] | Hybrid CNN and autoregressive GL solvers; biomedical signal analysis | Ecological prediction, image-based classification, biomedical signal interpretation | Demonstrated cross-domain benefits of fractional frameworks | Lacking unified recurrent backpropagation or consistency in training |
| Work | Fractional Operator | Forward Pass | Backward Pass |
|---|---|---|---|
| Zuñiga Aguilar et al. [19] | Caputo | Fractional activation only | Classical |
| Sivalingam et al. [21] | Caputo–Fabrizio | Fractional cell update | Classical |
| Mohanrasu et al. [7] | Atangana–Baleanu | Partial fractional gates | Classical |
| Wei et al. [8] | Fractional-order memory units | Fractional state update | Classical |
| This work | Grünwald–Letnikov (GL) | Full fractional gates | Full fractional backpropagation |
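
To make the contrast in the last row concrete, the sketch below shows what a forward step with a GL-type cell update can look like. It is one plausible reading under stated assumptions (classical gate algebra, a short-memory GL recursion on the cell state), not the exact equations of Section 2.2; the function and parameter names are hypothetical. At ν = 1 it collapses to the classical update c_t = f ⊙ c_{t-1} + i ⊙ c̃.

```python
import numpy as np

def gl_coefficients(nu: float, n_terms: int) -> np.ndarray:
    # Same recurrence as in the previous sketch: w_k = (-1)^k * C(nu, k).
    w = np.empty(n_terms)
    w[0] = 1.0
    for k in range(1, n_terms):
        w[k] = w[k - 1] * (1.0 - (nu + 1.0) / k)
    return w

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

def fd_lstm_step(x_t, h_prev, c_hist, params, nu):
    """One hypothetical FD-LSTM forward step (sketch, not the paper's code).
    c_hist: previous cell states, newest first, length N (the GL memory window)."""
    Wf, Wi, Wo, Wc, bf, bi, bo, bc = params
    z = np.concatenate([x_t, h_prev])
    f = sigmoid(Wf @ z + bf)        # forget gate
    i = sigmoid(Wi @ z + bi)        # input gate
    o = sigmoid(Wo @ z + bo)        # output gate
    c_tilde = np.tanh(Wc @ z + bc)  # candidate cell state
    # Treat the gated drive as the GL fractional difference of c, then solve for c_t:
    #   c_t = i*c_tilde + (f - 1)*c_{t-1} - sum_{k=1..N} w_k * c_{t-k}
    w = gl_coefficients(nu, len(c_hist) + 1)[1:]           # weights w_1 .. w_N
    memory = sum(wk * ck for wk, ck in zip(w, c_hist))
    c_t = i * c_tilde + (f - 1.0) * c_hist[0] - memory
    h_t = o * np.tanh(c_t)
    return h_t, c_t
```

Because w_1 = -ν and all higher weights vanish at ν = 1, the subtracted memory term then contributes exactly +c_{t-1} and the classical LSTM recursion is recovered; the fully fractional backward pass is what distinguishes the proposed method and is developed in Section 2.3 of the article.
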
| Order ν | Test Dataset (10%): Accuracy | Test Dataset (10%): Goodness-of-Fit | Full Dataset: Accuracy | Full Dataset: Goodness-of-Fit |
|---|---|---|---|---|
| 0.90 | 0.9000 | 0.8940 | 0.9580 | 0.9526 |
| 0.95 | 0.9185 | 0.9145 | 0.9599 | 0.9543 |
| 1.00 | 0.9148 | 0.9051 | 0.9595 | 0.9550 |
| 1.05 | 0.8925 | 0.8863 | 0.9517 | 0.9462 |
| 1.10 | 0.8935 | 0.8867 | 0.9443 | 0.9387 |
| Model | Early Loss Drop | Epoch of Stabilization | Accuracy | F1 |
|---|---|---|---|---|
| Classical LSTM | Moderate | ∼120 | 0.914 | 0.905 |
| Bi–LSTM | Fast | ∼100 | 0.921 | 0.912 |
| GRU | Moderate | ∼110 | 0.909 | 0.900 |
| Fractional LSTM (Caputo) | Slow–Moderate | ∼130 | 0.910 | 0.908 |
| FD–LSTM (GL) | Fastest | ∼80 | 0.9185 | 0.915 |
| Model | Mechanism | Accuracy | F1 | Full-Dataset Accuracy |
|---|---|---|---|---|
| Classical LSTM | tanh/sigmoid | 0.914 | 0.905 | 0.959 |
| Bi–LSTM | bidirectional | 0.921 | 0.912 | 0.960 |
| Fractional LSTM (Caputo) | Caputo gradient | 0.910 | 0.908 | 0.957 |
| FD–LSTM (GL) | GL fractional operator | 0.9185 | 0.915 | 0.9599 |
| Model | Order ν | Coefficients N | s/Epoch (Relative) | Overhead |
|---|---|---|---|---|
| Classical LSTM | 1.0 | – | 1.00 | – |
| FD–LSTM | 0.90 | 20 | 1.11 | +11% |
| FD–LSTM | 0.95 | 20 | 1.15 | +15% |
| FD–LSTM | 1.05 | 25 | 1.21 | +21% |
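
The roughly linear growth of overhead with N reflects the short-memory principle: each retained coefficient adds one extra multiply-accumulate per cell update, while the GL weight magnitudes decay approximately as k^-(ν+1), so windows of N ≈ 20–25 already capture most of the operator. A quick check, with illustrative values and the same recurrence as in the sketches above:

```python
import numpy as np

# Tail decay of the GL weights justifies truncating the memory window near N = 20.
nu, N = 0.95, 20
w = np.empty(N + 1)
w[0] = 1.0
for k in range(1, N + 1):
    w[k] = w[k - 1] * (1.0 - (nu + 1.0) / k)
print({k: float(np.round(abs(w[k]), 6)) for k in (1, 2, 5, 10, 20)})
```
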
| Model | Test Accuracy |
|---|---|
| Classical LSTM | 0.9333 |
| FD–LSTM (GL) | 0.9667 |
| FD–LSTM (GL) | 0.9630 |
| FD–LSTM (GL) | 0.9370 |
| FD–LSTM (GL) | 0.9296 |
| FD–LSTM (GL) | 0.9370 |
| FD–LSTM (GL) | 0.9630 |
| FD–LSTM (GL) | 0.9519 |
| Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| FD–LSTM (GL) | 0.9667 | 0.96 | 0.96 | 0.93 |
| Classical LSTM | 0.9333 | >0.94 | >0.96 | >0.91 |
| Random Forest | ∼0.85 | ∼0.85 | ∼0.83 | ∼0.84 |
| SVM | ∼0.88 | ∼0.88 | ∼0.85 | ∼0.86 |
| RNN (vanilla) | ∼0.87 | ∼0.87 | ∼0.86 | ∼0.86 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

