Article
Peer-Review Record

A Phishing-Attack-Detection Model Using Natural Language Processing and Deep Learning

Appl. Sci. 2023, 13(9), 5275; https://doi.org/10.3390/app13095275
by Eduardo Benavides-Astudillo 1,2,*, Walter Fuertes 2, Sandra Sanchez-Gordon 1, Daniel Nuñez-Agurto 2 and Germán Rodríguez-Galán 2
Submission received: 5 March 2023 / Revised: 10 April 2023 / Accepted: 10 April 2023 / Published: 23 April 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

This paper presents a phishing detection model that uses the Keras Embedding Layer to take advantage of the web page content’s semantic and syntactic features. In addition, to assess which DL algorithm works best, the authors evaluate four DL algorithms in experiments.

 

The comments are as follows:

 

1.    The authors point out that words carry rich semantic and syntactic relationships and that previous work has not studied them sufficiently. However, the experimental section does not show how much the proposed model improves on previous work.

 

2.    The description of the experiments is rather rough, and some metrics are neither defined nor supported by references; for example, Test score and Train score are not described in L308-309.

 

3.    The format of references should be improved, e.g., L430, 455-456.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors cite Kitchenham for the SLR but omit many elements of that method, including the design of research questions, the time period of the study, and the inclusion and exclusion criteria. If the authors wish to follow this method, they should follow it in full.

The authors devote most of the pages to a methodology that has already been designed and applied by many authors. It should be cut down to the essentials. Further, many pages are devoted to explaining well-known algorithms (LSTM, BiLSTM, GRU, BiGRU). Please add some new information about them instead.

I did not find any novelty or contribution in the work. The authors used freely available datasets and applied existing algorithms (for which code is readily available). Further, no configuration parameters of these algorithms are mentioned; it seems that they were run with default values. Tuning must be performed to find the best values. Give more space to the results and discussion section to attract readers and citations.

The manuscript can be improved further to be suitable for this journal.
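The tuning requested above can be sketched as an exhaustive grid search. This is only an illustration: the search space `GRID`, its parameter names, and the function `dummy_score` are hypothetical placeholders, not values or results from the manuscript. In practice `dummy_score` would be replaced by a routine that trains one model and returns its validation score.

```python
from itertools import product


def grid_search(train_and_score, grid):
    """Try every combination in `grid`; return (best_params, best_score, trials)."""
    names = list(grid)
    trials = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((params, train_and_score(**params)))
    best_params, best_score = max(trials, key=lambda trial: trial[1])
    return best_params, best_score, trials


# Hypothetical search space for a recurrent model (assumed, not from the paper).
GRID = {
    "units": [32, 64, 128],
    "dropout": [0.2, 0.5],
    "learning_rate": [1e-3, 1e-4],
}


def dummy_score(units, dropout, learning_rate):
    """Placeholder for 'train a model and return its validation score'.

    The formula is arbitrary and only makes the example deterministic.
    """
    return 1.0 - abs(units - 64) / 128 - abs(dropout - 0.2) - abs(learning_rate - 1e-3)


best_params, best_score, trials = grid_search(dummy_score, GRID)
print(len(trials), best_params)
```

Reporting the grid and the selected configuration alongside the results would let readers see that the models were not simply run with default values.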

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors experimented with four deep learning architectures under three different vector lengths to detect phishing web pages by analyzing their contents. A large dataset was used in the experiment. The dataset was subjected to a set of parsing and preprocessing techniques, including stopword removal, tokenization, etc. The results show that GRU and BiGRU are the best models.

 

The manuscript is well written and structured. However, I have some concerns about the originality, methodology and the evaluation process adopted in the manuscript:

 

1. Originality:

 

The originality of the work is questionable. In the related work section, the authors claim that few works are dedicated to detecting phishing attacks by analyzing web page contents with NLP. They justify this claim through a systematic review following the Kitchenham process, in which only three related papers were found and discussed. However, searching only Web of Science and Scopus is not recommended, since those sources are slow to index recently published papers. The Kitchenham process suggests exploring about seven sources. In any case, applying the same search query to Google Scholar returns several interesting original works and reviews; here are some of them:

 

Safi, A., & Singh, S. (2023). A Systematic Literature Review on Phishing Website Detection Techniques. Journal of King Saud University-Computer and Information Sciences.

 

Zieni, R., Massari, L., & Calzarossa, M. C. (2023). Phishing or not phishing? A survey on the detection of phishing websites. IEEE Access.

 

Somesha, M., Pais, A. R., Rao, R. S., & Rathour, V. S. (2020). Efficient deep learning techniques for the detection of phishing websites. Sādhanā, 45, 1-18.

 

Thirumaran, M., Karthikeyan, R. P., & Rathaamani, V. (2023). Phishing Website Detection Using Natural Language Processing and Deep Learning Algorithm. Advances in Science and Technology, 124. Trans Tech Publications Ltd.

 

Villanueva, A., Atibagos, C., De Guzman, J., Cruz, J. C. D., Rosales, M., & Francisco, R. (2022, August). Application of Natural Language Processing for Phishing Detection Using Machine and Deep Learning Models. In 2022 International Conference on ICT for Smart Society (ICISS) (pp. 01-06). IEEE.

 

Liu, D. J., Geng, G. G., & Zhang, X. C. (2022). Multi-scale semantic deep fusion models for phishing website detection. Expert Systems with Applications, 209, 118305.

 

Alshingiti, Z., Alaqel, R., Al-Muhtadi, J., Haq, Q. E. U., Saleem, K., & Faheem, M. H. (2023). A Deep Learning-Based Phishing Detection System Using CNN, LSTM, and LSTM-CNN. Electronics, 12(1), 232.

 

Ariyadasa, S., Fernando, S., & Fernando, S. (2022). Combining Long-Term Recurrent Convolutional and Graph Convolutional Networks to Detect Phishing Sites Using URL and HTML. IEEE Access, 10, 82355-82375.

 

Please check these works, introduce them in the related work section, and compare your results with theirs.

 

 

2. Methodology:

 

a. The word parsing process, which forms the most important part of the experiment, requires further explanation. Specifically, the use of WordNet and averaged_perceptron_tagger requires further explanation; providing an example would be helpful. Moreover, an experiment without these preprocessing steps would help justify the need for them.

 

b. The use of the Keras embedding layer requires further explanation, since it forms the cornerstone of the contribution.
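As an illustration of the kind of example asked for in points a and b, the sketch below walks one sentence through tag filtering, lemmatization, integer encoding, and an embedding lookup. It is a simplified stand-in under stated assumptions, not the authors' pipeline: the tagged input is written by hand in Penn Treebank tags, `TOY_LEMMAS` is a hypothetical substitute for a WordNet lemmatizer, the choice of which tags to keep is assumed, and the lookup table is random rather than trained.

```python
import random

# WordNet part-of-speech codes ('n', 'v', 'a', 'r') keyed by the first letter
# of a Penn Treebank tag (noun, verb, adjective, adverb).
PENN_TO_WORDNET = {"N": "n", "V": "v", "J": "a", "R": "r"}

# Hypothetical stand-in for a WordNet lemmatizer: (word, pos) -> lemma.
TOY_LEMMAS = {
    ("accounts", "n"): "account",
    ("verified", "v"): "verify",
    ("clicking", "v"): "click",
}


def lemmatize(tagged_tokens):
    """Keep tokens whose tag maps to a WordNet POS and lemmatize them."""
    lemmas = []
    for word, tag in tagged_tokens:
        pos = PENN_TO_WORDNET.get(tag[:1])
        if pos is None:  # pronouns, modals, prepositions, etc. are dropped
            continue
        lemmas.append(TOY_LEMMAS.get((word.lower(), pos), word.lower()))
    return lemmas


def build_vocab(documents):
    """Assign integer ids from 1 upward; 0 is reserved for padding."""
    vocab = {}
    for doc in documents:
        for token in doc:
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def encode(doc, vocab, length):
    """Integer-encode a document, then truncate or right-pad with 0."""
    ids = [vocab[token] for token in doc if token in vocab][:length]
    return ids + [0] * (length - len(ids))


def make_embedding(vocab_size, dim, seed=0):
    """Random lookup table with one row per id (row 0 is the padding row)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size + 1)]


def embed(ids, table):
    """An embedding layer is a row lookup: integer id -> dense vector."""
    return [table[i] for i in ids]


tagged = [("Your", "PRP$"), ("accounts", "NNS"), ("must", "MD"),
          ("be", "VB"), ("verified", "VBN"), ("by", "IN"),
          ("clicking", "VBG"), ("here", "RB")]
lemmas = lemmatize(tagged)
vocab = build_vocab([lemmas])
ids = encode(lemmas, vocab, length=8)
table = make_embedding(len(vocab), dim=4)
vectors = embed(ids, table)
print(lemmas, ids)
```

The point of the sketch is that the input to an embedding layer is a fixed-length sequence of integer ids, not one-hot or TF-IDF vectors, and that the layer's output is one dense row per id whose values are learned during training.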

 

3. Validation process:

 

a. You have to provide a clear definition of the metrics used for evaluation, since they are less common (test score, train score, micro avg, macro avg).

b. Accuracy and AUC are not appropriate metrics for evaluating severely imbalanced datasets such as the one used in the manuscript. This in fact contradicts the statement on page 7, lines 262-263: "While the dataset is unbalanced, we used performance measurements for unbalanced data in the evaluation process [25]".

c. k-fold cross-validation is one of the common techniques used to avoid bias resulting from randomly splitting the data.

d. I suggest reporting more decimal places in the evaluation results so that the comparison is more precise.
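To make points a and b concrete, the following sketch computes accuracy, per-class F1, macro-averaged F1, and micro-averaged F1 from scratch. The labels are toy data chosen for illustration (95 legitimate pages and 5 phishing pages, scored against a classifier that always answers "legitimate"); they are not taken from the manuscript, and the function names are assumptions.

```python
def per_class_counts(y_true, y_pred, label):
    """Return (tp, fp, fn) treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def prf(tp, fp, fn):
    """Precision, recall and F1 from counts; 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def report(y_true, y_pred):
    """Accuracy plus per-class, macro-averaged and micro-averaged F1."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {c: per_class_counts(y_true, y_pred, c) for c in labels}
    f1_by_class = {c: prf(*counts[c])[2] for c in labels}
    # Macro average: unweighted mean of the per-class scores.
    macro_f1 = sum(f1_by_class.values()) / len(labels)
    # Micro average: pool tp/fp/fn over all classes, then compute once.
    micro_f1 = prf(*(sum(col) for col in zip(*counts.values())))[2]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"accuracy": accuracy, "f1_by_class": f1_by_class,
            "macro_f1": macro_f1, "micro_f1": micro_f1}


# Toy labels: 0 = legitimate, 1 = phishing.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
result = report(y_true, y_pred)
print(result)
```

On this toy input the accuracy and the micro-averaged F1 are both 0.95 even though no phishing page is detected, while the phishing-class F1 is 0 and the macro-averaged F1 falls below 0.5. This is why per-class and macro-averaged scores are more informative than accuracy on imbalanced data.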

Minor revision:

1. Page 2, line 55: "to detecting" -> "to detect"

2. References 22 and 23, page 16, lines 455 and 456: please add the URL.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have revised the manuscript according to the suggestions, but the innovation of the paper needs to be improved before it can be accepted.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors could read and cite one of the latest systematic reviews of phishing attacks in the related literature section so that they can compare their study with others. All other points have been addressed by the authors.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The new submission is a significant improvement of the original manuscript. The majority of our concerns were responded properly. However, some of them are not well-treated and require further explanation and improvements:

1. In the related work section, you should explore more databases, such as Springer and ACM, to strictly follow the Kitchenham process as stated at the beginning of the section.

2. In the Word parsing section, I suggest adding pseudocode showing how WordNet is used, since there is no further explanation of what happens after the synsets are printed!! The same applies to the averaged perceptron tagger: the authors should explicitly state which tags are considered and which are ignored. Moreover, it is not clear what encoding algorithm or system the authors are using (one-hot, TF-IDF, or others). Overall, the preprocessing system should be clear enough that readers can replicate the process for future comparison.

3. The authors should add the F1-score to Table 5 and remove Table 6, since the F1-score is one of the most powerful metrics for unbalanced datasets, whereas AUC and the average of precisions are deceiving metrics.

4. In fact, K-fold cross-validation is necessary in this study to remove any results given by chance through random sampling!!
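The cross-validation asked for in point 4 can be sketched as a stratified k-fold split, which keeps the class ratio roughly constant across folds and therefore suits unbalanced data. This is a minimal pure-Python illustration with assumed names and toy labels (90 legitimate, 10 phishing), not the authors' procedure.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold keeps the class ratios."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


# Toy labels: 0 = legitimate, 1 = phishing.
labels = [0] * 90 + [1] * 10
splits = list(stratified_k_fold(labels, k=5))
for train_idx, test_idx in splits:
    positives = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), positives)
```

Each sample is used for testing exactly once. Training one model per fold and reporting the mean and standard deviation of the per-fold F1-score would show whether a result depends on one particular random split.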

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
