Article
Peer-Review Record

A Training Algorithm for Locally Recurrent Neural Networks Based on the Explicit Gradient of the Loss Function

Algorithms 2025, 18(2), 104; https://doi.org/10.3390/a18020104
by Sara Carcangiu and Augusto Montisci *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 6 January 2025 / Revised: 5 February 2025 / Accepted: 12 February 2025 / Published: 14 February 2025
(This article belongs to the Special Issue Algorithms in Data Classification (2nd Edition))

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In the paper, a novel algorithm for the training of LRNNs is introduced. The algorithm reduces computational complexity while ensuring network stability during the training phase. A unique feature of this method is its ability to represent the error gradient explicitly. The core of the algorithm relies on the interpretation of the Fibonacci sequence as the output of a second-order IIR filter, leveraging Binet's formula to calculate sequence terms directly. This approach enables the explicit calculation of the loss function gradient, expressed in terms of network stability parameters.
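To make the core idea concrete, the following is a minimal sketch (our illustration, not code from the paper) of the two views of the Fibonacci sequence: generated recursively as the impulse response of a second-order IIR filter, and computed directly, term by term, with Binet's formula.

```python
# Minimal sketch (ours, not the paper's code): the Fibonacci sequence as the
# impulse response of the second-order IIR filter y[n] = y[n-1] + y[n-2] + x[n],
# and the same terms obtained directly from Binet's closed-form formula.
import math

def fibonacci_iir(n_terms):
    """Generate Fibonacci terms recursively, as an IIR filter driven by a unit impulse."""
    y = [0.0] * n_terms
    for n in range(n_terms):
        x = 1.0 if n == 0 else 0.0                    # unit impulse input
        prev1 = y[n - 1] if n >= 1 else 0.0
        prev2 = y[n - 2] if n >= 2 else 0.0
        y[n] = x + prev1 + prev2
    return y

def fibonacci_binet(n):
    """Evaluate the n-th term directly (no recursion) via Binet's formula."""
    phi = (1 + math.sqrt(5)) / 2                      # roots of z**2 = z + 1
    psi = (1 - math.sqrt(5)) / 2
    return (phi ** (n + 1) - psi ** (n + 1)) / math.sqrt(5)

seq = fibonacci_iir(10)                               # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
assert all(abs(seq[n] - fibonacci_binet(n)) < 1e-9 for n in range(10))
```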

It seems that ref. [17] is identical to ref. [21].

 

Author Response

The authors would like to thank the Reviewers for their very useful and helpful comments. All comments have been considered, and the paper has been revised accordingly.

Further to the revisions made in the manuscript, the authors believe that the whole paper has been improved.

In the following, the Reviewers’ comments are reported in black, and the authors’ answers in red.

The revised *.docx version of the paper includes modifications highlighted in red.

Reviewer 1 report form

In the paper, a novel algorithm for the training of LRNNs is introduced. The algorithm reduces computational complexity while ensuring network stability during the training phase. A unique feature of this method is its ability to represent the error gradient explicitly. The core of the algorithm relies on the interpretation of the Fibonacci sequence as the output of a second-order IIR filter, leveraging Binet's formula to calculate sequence terms directly. This approach enables the explicit calculation of the loss function gradient, expressed in terms of network stability parameters.

It seems that ref. [17] is identical to ref. [21].

The duplicate reference has been eliminated in the new version of the manuscript.

 

Reviewer 2 Report

Comments and Suggestions for Authors

The paper presents “A Training Algorithm for Locally Recurrent Neural Networks Based on the Explicit Gradient of the Loss Function”. My comments are as follows:

1. The introduction provides valuable context regarding Locally Recurrent Neural Networks (LRNNs) and the proposed training algorithm. However, it does not clearly delineate the specific limitations or gaps in current methods that the new algorithm aims to address.

2. The derivation and use of the Binet formula for the IIR filter are mathematically dense but lack intermediate explanatory steps, making it difficult for readers unfamiliar with this field to follow.

3. The paper discusses stability control but does not include a detailed quantitative evaluation or benchmarks against other stability-control methods.

4. All figures presenting predictions are included in the manuscript; however, they lack detailed explanations within the text. Enhanced descriptions of these figures, including their relevance to the study's objectives and how they illustrate the effectiveness of the proposed algorithm, would significantly improve clarity and comprehension.

5. The discussion section acknowledges the algorithm's potential but does not discuss its limitations or areas where it might not perform well.

Author Response

The authors would like to thank the Reviewers for their very useful and helpful comments. All comments have been considered, and the paper has been revised accordingly.

Further to the revisions made in the manuscript, the authors believe that the whole paper has been improved.

In the following, the Reviewers’ comments are reported in black, and the authors’ answers in red.

The revised *.docx version of the paper includes modifications highlighted in red.

Reviewer 2 report form

The paper presents “A Training Algorithm for Locally Recurrent Neural Networks Based on the Explicit Gradient of the Loss Function”. My comments are as follows:

1. The introduction provides valuable context regarding Locally Recurrent Neural Networks (LRNNs) and the proposed training algorithm. However, it does not clearly delineate the specific limitations or gaps in current methods that the new algorithm aims to address.

Thank you for the comment. We have carefully revised the manuscript to explicitly emphasize the critical limitations in previous methods that our proposed algorithm seeks to overcome. Specifically, we have expanded Section 2 to provide a more comprehensive explanation of the fundamental principles underlying the Causal Backpropagation Through Time (CBTT) algorithm, as introduced by Campolucci et al. (1996), and to briefly describe the issues related to stability (Campolucci et al. 2000). During training, the appropriate memory depth to be assigned to the equivalent FIR filter is unknown a priori, as it depends on the updated feedback coefficient. Consequently, an initial estimate is assigned to the memory depth of the FIR filter, which must subsequently be validated after the feedback coefficient has been updated. To the best of our knowledge, no general criterion exists for determining this parameter, which means that multiple iterations may be required to find an appropriate value. Regarding the stability of the network following parameter updates, stability is determined by the poles of the transfer function of the infinite impulse response (IIR) filter, which, in turn, are influenced by the feedback coefficients. Since the parameters adjusted during network training are the feedback coefficients, the training process lacks direct control over the poles and, consequently, the overall stability of the system.

The method proposed in this paper addresses both issues. First, it provides an exact and explicit formulation of the impulse response of the IIR filter. Second, it ensures that training is conducted by directly updating the parameters that govern the stability of the network, thereby offering a more robust and reliable approach to stability control.
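As a rough illustration of the two issues described above (a sketch under our own assumptions, not the authors' implementation), consider a single second-order IIR feedback section: the memory depth needed by an equivalent truncated FIR filter depends on the current feedback coefficients, while stability depends on the poles of the transfer function.

```python
# Illustrative sketch (ours, not the paper's code) of the two issues,
# on a single feedback section y[n] = a1*y[n-1] + a2*y[n-2] + x[n].
import numpy as np

def feedback_impulse_response(a1, a2, length):
    """Impulse response of the second-order feedback, computed recursively."""
    h = np.zeros(length)
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        h[n] = x + (a1 * h[n - 1] if n >= 1 else 0.0) + (a2 * h[n - 2] if n >= 2 else 0.0)
    return h

def is_stable(a1, a2):
    """Stability holds when the poles (roots of z**2 - a1*z - a2) lie inside the unit circle."""
    return bool(np.all(np.abs(np.roots([1.0, -a1, -a2])) < 1.0))

# Issue 1: the FIR memory depth D at which the tail of h becomes negligible
# depends on the (updated) coefficients, so it cannot be fixed a priori.
# Issue 2: training moves a1, a2, hence the poles, without controlling them directly.
a1, a2 = 0.9, -0.5
h = feedback_impulse_response(a1, a2, 200)
D = int(np.argmax(np.abs(h) < 1e-6))                  # first index with a negligible tail
print(f"stable: {is_stable(a1, a2)}, required FIR depth: {D}")
```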

2. The derivation and use of the Binet formula for the IIR filter are mathematically dense but lack intermediate explanatory steps, making it difficult for readers unfamiliar with this field to follow.

The derivation of Binet's formula has been supplemented with intermediate steps, and some explanatory comments have been added in Subsections 2.1 and 2.2.

3. The paper discusses stability control but does not include a detailed quantitative evaluation or benchmarks against other stability-control methods.

The stability of the dynamically adapted network is first motivated in general from a theoretical point of view, and the Willamowski-Rössler example is then presented as validation. The chaotic series was chosen because, by its nature, its dynamics change continuously, so the feedback coefficients do not converge to constant values but must change continuously as well. Nonetheless, the output never diverges.

4. All figures presenting predictions are included in the manuscript; however, they lack detailed explanations within the text. Enhanced descriptions of these figures, including their relevance to the study's objectives and how they illustrate the effectiveness of the proposed algorithm, would significantly improve clarity and comprehension.

We have revised the results section to include more detailed descriptions of all figures, providing clear explanations to enhance the reader's understanding of their significance. Additionally, we have introduced a second test case to further evaluate the algorithm’s performance, demonstrating its robustness and applicability across different scientific and engineering contexts.

5. The discussion section acknowledges the algorithm's potential but does not discuss its limitations or areas where it might not perform well.

In none of the cases examined, including those not reported in the paper, did stability problems emerge. Moreover, the stability of the method is demonstrated from a theoretical point of view, so it should have general validity. The examples were chosen to be particularly challenging, both from the point of view of training and of stability, so that they could validate the results of the theory, although experimental validation necessarily remains limited to the examples considered. Since the aim of the work was to present the method, we preferred to limit the analysis to a few significant examples, postponing a study dedicated to the limits of the method to future work.

Reviewer 3 Report

Comments and Suggestions for Authors

The research article is well written and eligible for publication in this journal. However, before the manuscript is published, I would like to make a few concise technical recommendations to improve its quality.
1. Explain how the proposed LRNN training algorithm varies considerably from existing approaches like BPTT and other gradient-based techniques.
2. Include a performance comparison of the proposed method to existing algorithms to provide more solid evidence of its practical advantages.
3. Please offer a clear mathematical reason or experimental validation for the proposed method's stability guarantees while utilizing roots.
4. Discuss the algorithm's sensitivity to critical parameters, such as the learning rate and feedback coefficients, to help users choose the best settings.
5. Calculate the computational complexity of the proposed algorithm in comparison to existing training methods, noting any trade-offs.
6. Improve the quality and clarity of figures, such as the phase space trajectory and error evolution, by including extensive captions and emphasizing important patterns.
7. Expand on the proposed method's real-world applications, particularly beyond the chemical reaction case study.

Author Response

The authors would like to thank the Reviewers for their very useful and helpful comments. All comments have been considered, and the paper has been revised accordingly.

Further to the revisions made in the manuscript, the authors believe that the whole paper has been improved.

In the following, the Reviewers’ comments are reported in black, and the authors’ answers in red.

The revised *.docx version of the paper includes modifications highlighted in red.

Reviewer 3 report form

The research article is well written and eligible for publication in this journal. However, before the manuscript is published, I would like to make a few concise technical recommendations to improve its quality.

1. Explain how the proposed LRNN training algorithm varies considerably from existing approaches like BPTT and other gradient-based techniques.

Thank you for the comment. We have carefully revised the manuscript to explicitly emphasize the critical limitations in previous methods that our proposed algorithm seeks to overcome. Specifically, we have expanded Section 2 to provide a more comprehensive explanation of the fundamental principles underlying the Causal Backpropagation Through Time (CBTT) algorithm, as introduced by Campolucci et al. (1996), and to briefly describe the issues related to stability (Campolucci et al. 2000). During training, the appropriate memory depth to be assigned to the equivalent FIR filter is unknown a priori, as it depends on the updated feedback coefficient. Consequently, an initial estimate is assigned to the memory depth of the FIR filter, which must subsequently be validated after the feedback coefficient has been updated. To the best of our knowledge, no general criterion exists for determining this parameter, which means that multiple iterations may be required to find an appropriate value. Regarding the stability of the network following parameter updates, stability is determined by the poles of the transfer function of the infinite impulse response (IIR) filter, which, in turn, are influenced by the feedback coefficients. Since the parameters adjusted during network training are the feedback coefficients, the training process lacks direct control over the poles and, consequently, the overall stability of the system. The method proposed in this paper addresses both issues. First, it provides an exact and explicit formulation of the impulse response of the IIR filter. Second, it ensures that training is conducted by directly updating the parameters that govern the stability of the network, thereby offering a more robust and reliable approach to stability control.

2. Include a performance comparison of the proposed method to existing algorithms to provide more solid evidence of its practical advantages.

A second example has been added in the results section, in which the performance of the proposed method is compared with that of a different neural system. A comparative analysis of all the methods present in the literature would require a dedicated study, which is the subject of future work, as the present paper is focused on the description of the theoretical aspects.

3. Please offer a clear mathematical reason or experimental validation for the proposed method's stability guarantees while utilizing roots.

The whole of Section 2 has been rewritten to better clarify this point. Binet's formula allows one to write the impulse response of the feedback as a linear combination of exponential functions. Since the training algorithm consists of changing the bases of these exponentials, stability is guaranteed provided that their modulus is less than 1.
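A hedged sketch of this argument follows (our illustration; the pole parameterization and the projection step are assumptions, not the paper's code): with the pole pair as the trainable quantities, the impulse response follows in closed form from Binet's formula, and projecting the poles back inside the unit circle after each update keeps every exponential decaying.

```python
# Hedged sketch (ours): stability by construction when training acts on the poles.
import numpy as np

def binet_impulse_response(p1, p2, length):
    """Closed-form impulse response h[n] = (p1**(n+1) - p2**(n+1)) / (p1 - p2), distinct poles."""
    n = np.arange(length)
    return np.real((p1 ** (n + 1) - p2 ** (n + 1)) / (p1 - p2))

def clip_modulus(p, rho_max=0.99):
    """Project a pole back inside the unit circle after a parameter update."""
    m = abs(p)
    return p if m < rho_max else p * (rho_max / m)

# A hypothetical update pushes a pole outside the unit circle; the projection
# restores |p| < 1, so every exponential in h[n] decays and the filter stays stable.
p = clip_modulus(1.05 + 0.2j)                         # |1.05 + 0.2j| > 1 before projection
p1, p2 = p, np.conj(p)                                # conjugate pair => real-valued response
h = binet_impulse_response(p1, p2, 50)
print(abs(p1), float(np.max(np.abs(h[-5:]))))         # modulus < 1; bounded, decaying tail
```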

4. Discuss the algorithm's sensitivity to critical parameters, such as the learning rate and feedback coefficients, to help users choose the best settings.

Within the applications on which the method was tested, no critical issues related to the learning rate or the topology of the neural network emerged. As for the learning rate, as in all cases where first-order optimization methods are applied, there is no theoretically proven rule; only heuristics exist, and trial and error has always proven to be the best-performing approach. As for the topology, it was observed that during training the parts of the structure that are not useful are excluded naturally, simply by the zeroing of the corresponding coefficients. Nonetheless, at the current stage of the study the only information available is heuristic in nature, so it was considered appropriate not to report it until the theoretical reasons have been identified.

5. Calculate the computational complexity of the proposed algorithm in comparison to existing training methods, noting any trade-offs.

We started to conduct this analysis, which turned out to be so complex that we decided to postpone the study to a dedicated work. The main difficulty was to define the network categories with which to make the comparison, because within the same network category there are several variants. Furthermore, there are few works in which the analysis of computational complexity is treated from a theoretical point of view, so the comparison must often be made on experimental results, which are always experimenter-dependent. While waiting to study the issue in greater depth with the attention it deserves, we have modified the description in Section 2 so as to make the operations that must be carried out as explicit as possible, thus giving the reader the tools to make a rough estimate of the complexity.

6. Improve the quality and clarity of figures, such as the phase space trajectory and error evolution, by including extensive captions and emphasizing important patterns.

We have revised the results section to include more detailed descriptions of all figures, providing clear explanations to enhance the reader's understanding of their significance.

7. Expand on the proposed method's real-world applications, particularly beyond the chemical reaction case study.

We thank the reviewer for the suggestion. To further evaluate the algorithm's performance, we have introduced in the results section a second test case, involving the forecasting of energy demand within a medium-voltage distribution network, demonstrating the method's robustness and applicability across different scientific and engineering contexts.

 

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have satisfactorily responded to all my questions and made the necessary changes to the manuscript.
