Evolutionary Neural Architecture Search (NAS) Using Chromosome Non-Disjunction for Korean Grammaticality Tasks
Round 1
Reviewer 1 Report
Please see the attachment.
Comments for author File: Comments.pdf
Author Response
Please see the attached file for the answers to the comments.
Author Response File: Author Response.doc
Reviewer 2 Report
Reading the new version of the paper, I noticed that all of the remarks expressed in my previous review were ignored. Therefore, in the following you can find my major and minor remarks:
The authors present a Neural Architecture Search (NAS) approach for linguistic modeling of Korean linguistic phenomena.
NAS belongs to so-called Automated Machine Learning (AutoML), a computational paradigm for searching for deep learning architecture topologies automatically. In particular, the authors propose a NAS technique based on an Evolutionary Algorithm (EA).
Given an initial network topology, the authors demonstrate, by applying NAS on one dataset, that the accuracy loss of the final network is close to zero percent. The proposed method is quite interesting, and the preliminary results, which need to be further validated with other datasets and against similar approaches, are encouraging. English is good.
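To make the EA-based NAS idea referred to above concrete, here is a minimal sketch of the general paradigm (this is an illustration only, not the authors' implementation; the toy fitness function and all names are hypothetical — in a real NAS run, fitness would be validation accuracy of the trained network):

```python
import random

def fitness(topology):
    # Stand-in objective for illustration: a "topology" is a list of
    # hidden-layer sizes; we prefer networks near a target total capacity
    # of 64 units, penalizing extra layers. Real NAS would train the
    # network and return its validation accuracy instead.
    return -abs(sum(topology) - 64) - len(topology)

def mutate(topology):
    # Structural mutation: add a layer, remove a layer, or resize one.
    t = list(topology)
    op = random.choice(["grow", "shrink", "resize"])
    if op == "grow":
        t.insert(random.randrange(len(t) + 1), random.choice([8, 16, 32]))
    elif op == "shrink" and len(t) > 1:
        t.pop(random.randrange(len(t)))
    else:
        i = random.randrange(len(t))
        t[i] = max(1, t[i] + random.choice([-8, 8]))
    return t

def evolve(pop_size=10, generations=20, seed=0):
    random.seed(seed)
    # Start from minimal single-layer topologies.
    population = [[random.choice([8, 16, 32])] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: keep the better half, refill with mutated survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the better half of each generation survives unchanged, the best fitness found never decreases across generations; this constructive/destructive mutation of the layer list is the general shape of the search, independent of the specific genetic operators a given method uses.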
In my opinion, the technical part of the paper needs improvement because many details are not clear.
Major remarks:
- It is not clear what the output of the network is and how the accuracy score is computed. Please explain this.
- It is not clear how a network topology is mapped to a chromosome representation. Please explain this.
- The datasets used and the source code should be made available to the community.
Minor remarks:
- Please provide a bibliographic reference for CoDeepNEAT on page 3.
- Please check the punctuation throughout the manuscript.
Author Response
Please see the attached file for the answers to the comments.
Author Response File: Author Response.doc
Reviewer 3 Report
The authors present a neural architecture search strategy to answer a Korean grammaticality task. The neural architecture search strategy is a Variable Chromosome Genetic Algorithm (VCGA), and it is based on a previous paper by Park and Shin, which was published in Applied Sciences:
Park, K.; Shin, D.; Chi, S. "Variable Chromosome Genetic Algorithm for Structure Learning in Neural Networks to Imitate Human Brain". Applied Sciences 2019, 9, 3176.
The novelty of this paper lies in the application of VCGA to a grammaticality task; it shows how feed-forward neural networks can do well on tasks that are traditionally addressed using recurrent neural networks. In conclusion, the authors claim that their research results show how feed-forward neural networks can outperform recurrent neural networks, but they must remind the reader that they are considering short sentences, not arbitrarily long sentences. Clearly, in general, feed-forward networks may not work as well.
Here are some important points the authors should consider before resubmitting:
- Pag. 2, line 51. The authors introduce the concept of "syntactic linearization". This is an important and complex concept that must be explained in greater detail; it deserves a more extensive explanation.
- Pag. 2, line 59. The example given must be explained more extensively. The example is very hard to understand, especially if we consider that most readers are not linguists and do not know Korean.
- Pag. 3, line 110. "however, it is not appropriate to generate topology of the neural network due to its goal is within HPO [31,32]". The authors should explain why Bayesian Optimization may not generate a new topology.
- The authors must explain in greater detail the “Genetic Operator” and the “DNN Generator” and how these modules work. At the moment only a brief sketch is given. Considering that any paper must be self-consistent and the findings reproducible, the authors must explain how the neural networks are decoded in the chromosome population and how the population is manipulated by the DNN generator.
- Pag. 3, line 116. The acronym GA is not defined previously. Is the “GA generator” the same as the “Genetic Operator” in the corresponding figure?
- Pag. 3, line 116. The acronym DNN is not defined. In the results only three-layer networks are discussed, which normally are not considered to be deep neural networks. Thus, I suggest that the authors refer to neural networks (NN) instead of Deep NN.
- Pag. 4, figure 1. What is the GAML?
- Pag. 4, line 155. At the end of the equation, there are three dots. This is most likely an error.
- Pag. 4, line 162. "We have created four-word level sentences in Korean that contain 7 syntactic categories (Noun Phrase, Verb Phrase, Prepositional Phrase, Adjective Phrase, Adverbs, Complementizer Phrase, Auxiliary phrases), which results in 2401 combinations." This must be explained better, and so must Table 1. Specifically, I do not understand why certain words are crossed out and how the order is assigned. Why do you consider four-word sentences when the table shows five words? What does the first slot represent?
- Pag. 6, lines 179-180. “...and one hidden layer.” -> “...and one output layer.”
- Pag. 6, lines 190-191. “It added a link to the output layer from the hidden layer every evolution step and converged after generation 6.” This sentence is not clear, I guess because the paper does not explain the genetic operator.
- Pag. 7, Figure 5 the hidden layer has output 1. Should this be 5?
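As an aside on the "2401 combinations" figure quoted in the remark on line 162: it follows directly from choosing one of 7 syntactic categories independently for each of 4 word slots, i.e. 7^4 = 2401. A minimal check (category abbreviations are my own, not taken from the manuscript):

```python
from itertools import product

# The 7 syntactic categories from the quoted sentence, abbreviated:
# Noun Phrase, Verb Phrase, Prepositional Phrase, Adjective Phrase,
# Adverbs, Complementizer Phrase, Auxiliary phrases.
categories = ["NP", "VP", "PP", "AP", "Adv", "CP", "Aux"]

# Every ordered assignment of one category to each of the 4 word slots:
sentences = list(product(categories, repeat=4))
print(len(sentences))  # 7**4 = 2401
```

This confirms the arithmetic, but note it does not resolve the reviewer's question about the fifth word shown in Table 1.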
Here are minor linguistic considerations the authors should address before the paper is published:
- Pag. 1, line 29. The sentence "... to language modeling for Korean linguistic phenomena." is vague and imprecise; the authors are considering a grammaticality task for the Korean language.
- Pag. 1, line 30. “...designing neural architecture.” -> ”...automatic neural network design.”
- Pag. 1, line 36. “open-source toolkit” -> “open-source toolkits” and “has” -> ”have”.
- Pag. 2, line 78. "Design" -> "designing".
- Pag. 2, line 85. ”...challenges to AutoML” -> ”...approaches to AutoML”. Challenge is not the most appropriate word.
- Pag. 3, line 110. "... is within HPO [31,32]." -> "... being within HPO [31,32]".
- Pag. 3, line 116. Maybe you should change: "Locus" -> "focus".
- Pag. 4, line 158. “Numlayer” and “Numavg” must be correctly formatted.
- Pag. 4, line 162. “Preposition phrase” -> “Prepositional Phrase”.
- Pag. 4, line 162. “...results to 2401 combinations.” -> “...results in 2401 combinations.”
Author Response
Please see the attached file for the answers to the comments.
Author Response File: Author Response.doc
Round 2
Reviewer 2 Report
Dear authors, most of my concerns have been clarified.
In my opinion, it is still not clear what the output of the network is. From Figure 5, the network has 1 output; can you better explain what that output is?
Author Response
The file with answers to the reviewer is attached.
Author Response File: Author Response.doc
Reviewer 3 Report
Clearly, there must have been a problem with the upload of version 2 of the manuscript, because although the authors claim to have answered my remarks number 6, 7, 8, 9, and 10, as well as the minor linguistic considerations, the second version of the manuscript does not report these changes. Thus, I invite the authors to solve this problem.
Specifically, for remark 3: I do understand that Bayesian Optimization is aimed at parameter optimization and not at topology generation. This is clear to most people who understand a little about Bayesian Optimization. But, in my opinion, the manuscript should be self-explanatory to any reader of Applsci, including readers who may not know anything about Bayesian Optimization. Thus, the authors should add a sentence to the text that explains what they explained to me.
It is important to me that the authors change the manuscript in response to remark 4. Simply presenting a possible NN topology in a figure does not explain how the topologies are coded as chromosome information and how mutation and non-disjunction work (the genetic operator). The main problem is that this operation could be implemented in several different ways that could give different results. Consequently, the paper is not reproducible with the information it gives; the reader would be forced to read the authors' previous papers. I have the same problem with the DNN generator. The authors must add a concise explanation of what is happening and how things are coded in practice.
The explanation given in response to remark 12 must be in the manuscript; otherwise, the figure makes no sense.
In conclusion, I repeat once more: the authors claim that their research results show how feed-forward neural networks can outperform recurrent neural networks on a language task, but they must remind the reader that they are considering short four-word sentences, not arbitrarily long sentences. Clearly, in general, feed-forward networks may not work as well. The authors must add this consideration in their response.
Author Response
The answers to the reviewers are attached as follows.
Author Response File: Author Response.doc
Round 3
Reviewer 3 Report
My major concerns are now addressed.
The authors should proofread the manuscript; at least one spelling error is still present.
Pag. 4 line 156. "muates" -> "mutates"
The paper can be accepted after these spelling errors are corrected.
Author Response
Dear reviewer,
We corrected a spelling error as you commented.
>> Pag. 4 line 156. "muates" -> "mutates"
Author Response File: Author Response.docx
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
Thank you for the opportunity to review this paper.
The primary point to note is that no introduction is given to the variable chromosome genetic algorithm beyond the remark (line 117-8) that 'It does not need minimum architecture since it uses a new genetic operation to make the destructive method as well as a constructive method.' This statement is not very clear, so it might be worth reworking this a little and considering whether it is possible to give a more intuitive explanation of the salient features of the VCGA. It is evident that detailed discussion is outside the scope of this letter, but perhaps it is possible to concisely express the essential take-home points, as this is key to comprehension of the paper.
This is a well-described experiment but clearly limited in its scope due to the reasonably small size of the dataset. Would it be possible to identify potential applications for this work as it stands or in the future? Additionally, could you comment on application to other languages that exhibit scrambling or argument ellipsis (Latin, with its tendency toward hyperbaton, comes to mind)? The significance of the work would be greatly clarified if it were possible to point to applications.
Occasionally the text can be awkwardly phrased (example: 'Since the addition of input means increases the number of the entire data seven times, we postpone this to the next research.')
Finally: 'the NAS can successfully find the neural architecture for the Korean language data' - do you mean 'the' neural architecture, or 'a' neural architecture considered adequate given the input data? i.e. would it always converge to the same structure?
Reviewer 2 Report
The authors apply the Variable Chromosome Genetic Algorithm (VCGA) with chromosome non-disjunction [18] to search for a network architecture for Korean. There is no technical novelty; applying the method to a Korean dataset is the only contribution. Thus, I cannot accept your paper, even as a letter.
Reviewer 3 Report
The authors present a Neural Architecture Search (NAS) approach for linguistic modeling of Korean linguistic phenomena.
NAS belongs to so-called Automated Machine Learning (AutoML), a computational paradigm for searching for deep learning architecture topologies automatically. In particular, the authors propose a NAS technique based on an Evolutionary Algorithm (EA).
Given an initial network topology, the authors demonstrate, by applying NAS on one dataset, that the accuracy loss of the final network is close to zero percent. The proposed method is quite interesting, and the preliminary results, which need to be further validated with other datasets and against similar approaches, are encouraging. English is good.
In my opinion, the technical part of the paper needs improvement because many details are not clear.
Major remarks:
- It is not clear what the output of the network is and how the accuracy score is computed. Please explain this.
- It is not clear how a network topology is mapped to a chromosome representation. Please explain this.
- The datasets used and the source code should be made available to the community.
Minor remarks:
- Please provide a bibliographic reference for CoDeepNEAT on page 3.
- Please check the punctuation throughout the manuscript.
- On line 79, please replace "has developed" with "is developed".