IoT Dataset Validation Using Machine Learning Techniques for Traffic Anomaly Detection
Round 1
Reviewer 1 Report
It is vital to detect anomalous workloads in IoT networks. This paper presents an optimized Random Forest classifier that detects traffic anomalies using techniques such as SMOTE and RFE, together with the ERDE metric. I have some concerns about the proposed work.
1) The proposed features of this work do not seem novel for machine learning applications. For example, ERDE has been widely used for early anomaly detection in network IDSs, as the authors themselves describe in Line 422. This paper merely adopts ERDE for early anomaly detection in IoT network applications. Likewise, the material on SMOTE, RFE, and the MQTT-IoT modeling is not sufficiently novel.
2) More experiments comparing the approach with other related works are required to support the authors' points. In addition, classifiers built with different machine learning algorithms should be implemented to demonstrate the effectiveness and high precision of the proposed Random Forest classifier.
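For readers unfamiliar with the ERDE metric discussed in point 1, the following is a minimal sketch that assumes the commonly cited definition: a fixed cost for false positives and false negatives, and a true-positive cost scaled by a sigmoid latency penalty. The cost values `c_fp`, `c_fn`, `c_tp` and the deadline `o` are illustrative assumptions, not values taken from the manuscript.

```python
import math


def latency_cost(k, o):
    """Sigmoid penalty lc_o(k) = 1 - 1 / (1 + e^(k - o)).

    k: number of observations seen before the decision was made.
    o: deadline after which a correct detection is considered late.
    """
    x = k - o
    if x > 700:  # avoid overflow in math.exp for very late decisions
        return 1.0
    return 1.0 - 1.0 / (1.0 + math.exp(x))


def erde(decision, truth, k, o, c_fp=0.1, c_fn=1.0, c_tp=1.0):
    """Early-risk detection error for one decision (cost values are assumptions)."""
    if decision and not truth:
        return c_fp                       # false positive
    if not decision and truth:
        return c_fn                       # false negative
    if decision and truth:
        return latency_cost(k, o) * c_tp  # true positive, penalised if late
    return 0.0                            # true negative


# A correct alert raised exactly at the deadline costs half of c_tp.
print(erde(decision=True, truth=True, k=5, o=5))  # 0.5
```

The overall score is usually reported as the mean of this per-decision error over all evaluated flows, so lower values indicate earlier and more accurate detection.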
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
First of all, congratulations on submitting the paper. Comments that could improve the paper are given below:
1) The end of the abstract could present the obtained results, such as accuracy and other important measures (with values). This would help readers see at first glance what was achieved and how it relates to what they are looking for. Also, IoT should be written out in full at least once, for readers who are less familiar with the abbreviation.
2) The keywords need to be re-checked: sometimes a comma is used to separate them and sometimes a semicolon. Capitalizing "machine learning" or other terms makes no sense.
3) Lines 52-57 would be better placed somewhere else (for example, before the experiments), and in my opinion the argument for using Random Forest is too weak. In this field there are many deep learning solutions that have proven effective.
4) In my opinion, 1.1, 1.2, and 1.3 could simply be part of the Introduction section; the subsections are not needed.
5) I have to commend the authors for a really thorough analysis of the literature, well done.
6) I don't think it is enough to state that details about the dataset are given in a reference (line 231). As it stands, it is hard to understand the data from this paper alone, so at least a short description should be given: the number of items, the classes, the degree of class imbalance, etc.
7) I do not see the point of putting the abbreviations in bold (lines 243-244).
8) There is no information about the dataset before and after SMOTE. Also, how was the number of neighbors in the SMOTE method (one of its parameters) selected?
9) The Kappa and accuracy measures are used, so at least the main formulas could be given, showing what these measures express and how they are calculated.
10) In lines 464 and 475 I don't understand what (a) and (b) refer to, because the figures contain no such labels, just a single picture. Something is wrong.
11) I understood that the authors used Random Forest mostly with default settings, so it would still be interesting to see how effective other algorithms are under the same conditions. From personal experience, I can guarantee that other tree-based methods would give almost the same accuracy. Reaching 100% is impossible, but getting worse results with methods such as K-NN is very easy. In my opinion, the authors need to explain very clearly why Random Forest is the best choice in this situation, because there is no comparative analysis with other methods.
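To illustrate the formulas requested in point 9, here is a minimal sketch of accuracy and Cohen's Kappa computed from a confusion matrix. It assumes the standard definitions (accuracy = observed agreement p_o; Kappa = (p_o - p_e) / (1 - p_e), with p_e the agreement expected by chance); the example matrix is made up for illustration and does not come from the manuscript.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in confusion)
    size = len(confusion)

    # Observed agreement: fraction of samples on the diagonal (this is accuracy).
    p_o = sum(confusion[i][i] for i in range(size)) / n

    # Chance agreement: sum over classes of (row total * column total) / n^2.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    if p_e == 1.0:
        # Kappa is undefined when chance agreement is already perfect.
        return p_o, float("nan")
    return p_o, (p_o - p_e) / (1.0 - p_e)


# Hypothetical 2-class example: rows are true classes, columns are predictions.
example = [[45, 5],
           [10, 40]]
acc, kappa = accuracy_and_kappa(example)
print(round(acc, 2), round(kappa, 2))  # 0.85 0.7
```

In this made-up example the accuracy is 0.85 while Kappa is only 0.7, which shows why Kappa is informative alongside accuracy: it discounts the agreement a classifier would reach by chance, an effect that becomes larger on unbalanced data.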
Good luck with submitting the manuscript.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The authors have made lots of modifications during the revision. Most of the concerns have been addressed. The writing of the paper still needs to be carefully checked.
E.g., in Table 6, 1.1226e-03 and 8.3510e-3 should use the same format.
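One way to obtain a uniform notation is to format every table value with the same format specification. The sketch below uses Python's built-in string formatting; the helper name `sci` and the choice of four decimal digits are assumptions made for illustration.

```python
def sci(value, digits=4):
    """Format a number in scientific notation with a fixed number of decimals."""
    return f"{value:.{digits}e}"


# The two values quoted from Table 6, rendered in one consistent style.
print(sci(1.1226e-03))  # 1.1226e-03
print(sci(8.3510e-3))   # 8.3510e-03
```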
Author Response
Response to Reviewer 1 Comments
Point 1: The authors have made lots of modifications during the revision. Most of the concerns have been addressed. The writing of the paper still needs to be carefully checked.
E.g., in Table 6, 1.1226e-03 and 8.3510e-3 should use the same format.
Response 1: We have checked the formatting of all tables and figures and made it as consistent as possible. See lines 539, 544, 550, 562, 576, 583, 586, 603 and 612.
Reviewer 2 Report
Thank you for taking the suggestion into account. There are some minor mistakes left, for example, the references are not formatted in the same style. But I hope it will be fixed. Good luck.
Author Response
Response to Reviewer 2 Comments
Point 1: Thank you for taking the suggestion into account. There are some minor mistakes left, for example, the references are not formatted in the same style. But I hope it will be fixed. Good luck.
Response 1: We have checked all the references and made their format as consistent as possible. Thank you.