An Advanced Pruning Method in the Architecture of Extreme Learning Machines Using L1-Regularization and Bootstrapping
Round 1
Reviewer 1 Report
REVIEW ELECTRONICS-789850
Title: An Advanced Pruning Method in the Architecture of Extreme Learning Machines using L1-regularization and Bootstrapping
The paper presents a methodology called BR-RLM for pruning the architecture of an ELM, using regression and re-sampling techniques to select the neurons most relevant to the output of the model. The LARS (Least Angle Regression) method is used to compute the regression coefficients and shrink the subset of candidate regressors to be included in the final model. The Bolasso method is also used to select the most relevant neurons in the hidden layer of the ELM.
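To make the reviewed idea concrete, the following is a minimal sketch (not the authors' implementation) of Bolasso-style neuron selection for an ELM: the hidden-layer activations are treated as regressors, a LARS/Lasso fit is run on bootstrap replications of the data, and only the neurons whose coefficients are non-zero in most replications are kept. It assumes scikit-learn's LassoLars for the L1 step; helper names such as elm_hidden_layer and bolasso_select, as well as the sizes and thresholds, are illustrative only.

```python
# Minimal sketch of Bolasso-style pruning of ELM hidden neurons.
# Assumes scikit-learn; names, sizes and thresholds are illustrative only.
import numpy as np
from sklearn.linear_model import LassoLars

def elm_hidden_layer(X, n_neurons, rng):
    """Random-weight hidden layer with a sigmoid activation (standard ELM)."""
    W = rng.normal(size=(X.shape[1], n_neurons))
    b = rng.normal(size=n_neurons)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def bolasso_select(H, y, n_bootstraps=32, alpha=1e-3, keep_ratio=0.9, rng=None):
    """Keep neurons whose Lasso/LARS coefficient is non-zero in most bootstrap replications."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, m = H.shape
    votes = np.zeros(m)
    for _ in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)                    # resample rows with replacement
        coef = LassoLars(alpha=alpha).fit(H[idx], y[idx]).coef_
        votes += (np.abs(coef) > 0)                         # vote for selected neurons
    return np.where(votes >= keep_ratio * n_bootstraps)[0]  # consensus neurons

# Usage: build the hidden layer, select the consensus neurons, then solve the
# output weights by least squares on the pruned activations only.
rng = np.random.default_rng(42)
X, y = rng.normal(size=(200, 10)), rng.normal(size=200)
H = elm_hidden_layer(X, n_neurons=100, rng=rng)
kept = bolasso_select(H, y, rng=rng)
beta, *_ = np.linalg.lstsq(H[:, kept], y, rcond=None)
```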
COMMENTS
The authors present an interesting approach to the problem of pruning unnecessary neurons in the hidden layer of an ELM to avoid overfitting.
The paper is well written and structured, effectively covers the state of the art in the area and the proposed methodology, and sufficiently presents the pros and cons of the authors’ work. I enjoyed reading it.
I have two minor comments:
- “PLS” in line 201 needs to be defined and,
- The authors could consider adding a couple of lines on weight constraining (or bounding the weights) as an alternative approach to addressing the overfitting problem in their future work.
Author Response
Dear reviewer. Thank you for the valuable compliments and relevant comments. Below are our responses to each of your points. All modifications are included and highlighted in red in the text.
I have two minor comments:
- “PLS” in line 201 needs to be defined and,
The abbreviation PLS is now defined in the text as Partial Least Squares.
- The authors could consider adding a couple of lines on weight constraining (or bounding the weights) as an alternative approach to addressing the overfitting problem in their future work.
The suggestion regarding weight constraining as an alternative approach to the overfitting problem has been included as future work in the last paragraph of the conclusion.
Reviewer 2 Report
The subject of the paper is interesting and important. The proposed method should be more clearly described, especially the reasons why the Bootstrapping method should be used. What is the advantage of Bootstrapping over other methods?
Formula (11) is wrong. Please correct it.
In Table 6, I think that for the PUL and CDT data sets, the number of selected neurons for the ELM method should be 1000 rather than 200. This may be a slip of the pen.
I don't think the conclusion is adequate. The authors need to summarize what the paper's contributions and innovations are, rather than simply restating the comparative results.
Author Response
Dear reviewer. Thank you for the helpful comments. Below are the answers to each of your comments and suggestions. Note that the inclusions and changes are highlighted in red in the text.
The subject of the paper is interesting and important. The proposed method should be more clearly described, especially the reasons why the Bootstrapping method should be used. What is the advantage of Bootstrapping over other methods?
Dear reviewer. The advantages of using Bootstrapping were included in Section 3.
Formula (11) is wrong. Please correct it.
Dear reviewer. Thanks for the comment. The formula was duly corrected.
In Table 6, I think that for the PUL and CDT data sets, the number of selected neurons for the ELM method should be 1000 rather than 200. This may be a slip of the pen.
Dear reviewer. Thank you for checking the table. It was indeed a typo, which has been fixed.
I don't think the conclusion is adequate. The authors need to summarize what the paper's contributions and innovations are, rather than simply restating the comparative results.
Dear reviewer. Thanks for the review. The conclusion was extended according to the guidelines raised in your comment.
Reviewer 3 Report
A pruning method is proposed called BR-RLM in this paper. It is based on regularization and resampling techniques to select the most representative neurons for the model response. Pattern classification tests and benchmark regression tests of complex real-world problems are performed by comparing the proposed approach to other pruning models for ELMs.
The paper is in general in good format, but the quality of the figures needs to be improved; for example, the text in Figure 1, Figure 2, Figure 9 and Figure 10 is not clear enough to read.
Author Response
A pruning method is proposed called BR-RLM in this paper. It is based on regularization and resampling techniques to select the most representative neurons for the model response. Pattern classification tests and benchmark regression tests of complex real-world problems are performed by comparing the proposed approach to other pruning models for ELMs.
The paper is in general in good format, but the quality of the figures needs to be improved; for example, the text in Figure 1, Figure 2, Figure 9 and Figure 10 is not clear enough to read.
Dear reviewer.
We improved the rendering of the figures. Thank you for the tips. If they are still not legible enough, we can improve them further during the editing process.
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
The paper discusses a novel method for regularising the neural networks used in extreme learning machines (ELM), in order to prevent overfitting.
The paper is written well, though there are a few details that can be corrected. I feel a bit puzzled by the abrupt initial jump into the central theme; I would appreciate a more gentle analysis of the problem, but this is certainly a consequence of my not being a central researcher in the treated topics.
I like the balance between figures and text, and all sections are present as they should be. I recommend re-reading by a native English speaker to reduce typos.
Finally, some motivating cases are exhibited, consisting of the validation of the method against a large set of benchmarks. These are all quite nice and convincing.
I recommend publication.
Reviewer 2 Report
REVIEW ELECTRONICS-665922
Title:
Pruning method in the architecture of artificial neural networks using the feature extraction technique based on L1 regularization and resampling
According to the abstract:
… One of the problems that an ELM may face is due to a large number of non-relevant neurons in the hidden layer, making the expert model a specific dataset (overfitting). With a large number of neurons in the hidden layer, unnecessary information to the model can influence the performance of neural network pattern classification and regression problems. To solve this problem, a pruning method is proposed, based on regularization and resampling techniques, to select the most representative neurons for the model response called BR-ELM … (my underlining)
In short, this article discusses methods for regularizing extreme learning machine (ELM) neural networks, aiming at preventing the overfitting problem.
COMMENTS
This article is basically a sort of review of pruning methods for Extreme Learning Machines (ELM is a particular learning algorithm for single-hidden-layer feedforward networks with low computational complexity). However, when the number of neurons in the single hidden layer is high, overfitting becomes a real but undesirable situation.

The authors claim, as seen in the abstract and in many places scattered throughout the article, that they propose a pruning method called BR-ELM. Reading through the article, I cannot verify that claim. Instead, what I see is basically a review article on pruning and regularization methods specifically for ELM. Such a review could be of interest to the scientific community, even restricted to ELM alone, but it requires a total redesign and rewriting of the article. Since regularization methods are the main concept of this special issue, I recommend that the authors devote at least a paragraph to two efficient regularization methodologies: Dropout and weight-constrained networks.

In its present form, I fail to see any author contribution or novelty, since all methods and algorithms presented are published by other authors (as is quite rightly cited by the authors of the present article). In lines 41-42, the article reads: “Bach [9] proposes the model that uses the LARS in replications by re-sampling and calls the method Bolasso. This paper will demonstrate the use of Bolasso to select the most relevant neurons in the hidden layer of ELM.” (the underlining is mine). Further down, in lines 49-51, it reads: “The main contribution of this paper is to act dynamically on artificial neural networks to select the most significant neurons using techniques that combine resampling and regularization techniques to select neurons capable of obtaining more assertive outputs.” (again the underlining is mine). So, the way I read it is this: the paper demonstrates the use of Bolasso (Bach [9]) to select the most significant neurons, using techniques that combine resampling and regularization, to select neurons capable of obtaining more assertive outputs in ELMs. This is not a contribution but a useful practical demonstration of the Bolasso method in a review paper.

Section 2, Related Work, is rather “Background Work”. Related work refers to research by other researchers, similar or close to the one presented here by the authors. This is not the case here. What the authors present here is a review of pruning and regularization methods for ELMs. Your Related Work section is in fact within your introduction section and, in any case, needs much more elaboration because it is presented in an extremely shortened form.

The work of Xu and Wang [24] is presented in lines 178-183. However, I fail to see how it relates or compares to your work. You have to provide a comparison in some way.

Several portions of the article are very well written, but there are also several portions which are either badly written or even unreadable and need rewriting and careful reading. For example, lines 271-275: “Considered n independent and identically distributed observations (xi, yi) ∈ Rd × R, i = 1,...,n, provided by the matrices X ∈ Nn×d and Y ∈ Rn, us admit m bootstrap replications of n data points [9] apud [43]; that is, for k = 1,...,m, we suppose an auxiliary sample (xik, yik) ∈ Rd × R, i = 1,...,n, given by the matrices Xk ∈ Rnxp and Yk ∈ Rn. The n pairs (xik, yik), i = 1,...,n, are regularly sampled at arbitrary by a replacement of the original n pairs in (X, Y), subtracting the evaluations not to be biased.” contains several typos and makes no clear sense!! The same thing happens in several other places in the article (e.g. line 354).

The terms neuron and node are interchanged in several places within the text.

Lines 175-177: please rephrase as “Finally, in [23] Campos Souza et al. …”. Also, please describe exactly what they have done with regard to pruning an ELM model.

You refer to L1 with inconsistent notation in different places; please use a uniform notation.

You present two algorithms, Algorithm 1 and Algorithm 2, which are clearly not yours. So, my question remains: what is the novelty or your contribution in this work?

In several places within the article, as well as in the conclusions, you refer to “the proposed methods”; this implies that you refer to methods proposed by the authors in the context of the work presented in this article. But this is not the case.

Finally, the title should be slightly changed to “Pruning methods in the architecture of artificial neural networks using the feature extraction technique based on L1 regularization and resampling”.
IN CONCLUSION
This article, in its present form, cannot be accepted as a research paper because there is no originality or new research contribution. However, it could make a useful and interesting REVIEW paper on pruning and regularization methods specifically for ELM and I would like to encourage the authors to resubmit their work as such, after taking into consideration the above comments.
Reviewer 3 Report
This paper presents a sound application of a new method for pruning nodes of an ELM network. The work is well presented and well written (except for minor comments; see below). The paper presents a good review of work being done in this field, and a good bibliography is provided.
The conclusions are correct and well stated.
Notes:
- In the abstract, the use of ELM as an abbreviation should be introduced in the first line: "Extreme learning machines (ELM) are efficient....."
- Line 27: "ïncuring in a situation" should be deleted.
- Line 46 should end with a "." and line 47 does not make sense and should be rewritten or discarded.
- Line 53: "dynamic and assertive." -> "dynamically and assertively."
- Line 186: "ït is developed" -> "is developed"
- Line 272: "üs" -> "we"
- Line 334: "disregarded" -> "discarded"