A Systematic Review of Deep Learning Techniques for Phishing Email Detection
Abstract
1. Introduction
- Conduct a systematic literature review following the PRISMA approach to keep the process transparent and minimize bias.
- Conduct an in-depth qualitative analysis of 33 selected papers to categorize and present the deep learning approaches used to detect phishing emails.
- Identify the strengths and limitations of previous research and suggest potential areas for future investigation.
2. Methodology
2.1. Objectives
- To assess the empirical evidence regarding the efficacy of DL algorithms in detecting phishing emails.
- To summarize the advantages and drawbacks of current implementations of DL algorithms in detecting phishing emails.
- To identify the discrepancies and potential improvements in current phishing email detection research.
2.2. Research Questions
- How are deep learning algorithms applied, and which techniques are most effective at detecting and defending against sophisticated social engineering attacks via email, such as phishing and spear phishing?
- How does the integration of deep learning algorithms in cybersecurity applications influence the precision and effectiveness of detecting phishing email threats compared with conventional ML approaches?
- What kinds of datasets, optimization methods, and evaluation methods/metrics are used to train the models and measure the outcomes?
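The third question refers to evaluation metrics. As a minimal sketch, assuming a binary setting with phishing as the positive class, accuracy and F1 can be derived from confusion-matrix counts as shown here. The class and function names are illustrative assumptions, and the counts in the usage lines are made up for demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with phishing as the positive class."""
    tp: int  # phishing emails flagged as phishing
    fp: int  # legitimate (ham) emails flagged as phishing
    tn: int  # ham emails passed as ham
    fn: int  # phishing emails passed as ham


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(c: ConfusionCounts) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from the counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from any reviewed study.
    print(evaluate(ConfusionCounts(tp=90, fp=5, tn=895, fn=10)))
```

Because F1 ignores true negatives, it can sit well below accuracy when ham emails dominate the test set, which is one reason to report both.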
2.3. Search Strategy
2.4. Criteria for Study Selection
2.4.1. Inclusion Criteria
- Published within the last 5 years; some earlier literature that influenced the development of DL for phishing detection is also included.
- Published in English.
- Relevant to the topic and/or answers the research questions.
2.4.2. Exclusion Criteria
- Studies that are not related to the research questions.
- Studies of traditional machine learning techniques or signature-based solutions for phishing email detection.
- Survey, review, or meta-analysis papers.
- Items that are not research articles (books, chapters, editorials, workshop summaries) and duplicate publications of the same study. For duplicate publications on the same topic, the latest publication is selected.
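The mechanical part of these criteria can be expressed as a simple screening filter. The sketch below is only an illustration: the `Record` fields, the set of excluded publication kinds, and the default review year are assumptions, and relevance to the research questions and duplicate handling still require manual judgment.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical bibliographic record used only for this illustration."""
    title: str
    year: int
    language: str
    kind: str                  # e.g. "article", "conference", "review", "book"
    uses_deep_learning: bool   # False for traditional ML or signature-based work
    influential: bool = False  # older work that shaped DL-based phishing detection


# Publication kinds excluded by the criteria above (assumed labels).
EXCLUDED_KINDS = {
    "survey", "review", "meta-analysis",
    "book", "chapter", "editorial", "workshop summary",
}


def passes_screening(r: Record, review_year: int = 2024, window: int = 5) -> bool:
    """Apply the date, language, publication-kind, and technique criteria."""
    recent = review_year - r.year <= window
    if not (recent or r.influential):
        return False
    if r.language.lower() != "english":
        return False
    if r.kind.lower() in EXCLUDED_KINDS:
        return False
    return r.uses_deep_learning


if __name__ == "__main__":
    candidate = Record("Example DL study", 2022, "English", "article", True)
    print(passes_screening(candidate))
```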
2.5. Quality Instrument
2.6. Paper Selection and Syntheses
3. Literature Review and Discussion
3.1. Classification by Email Body
| Type of Features | Authors | DL Models |
|---|---|---|
| Email body | Chataut et al. [23] | LLM |
| Email body | Bagui et al. [24] | LSTM, CNN, and word embedding DL |
| Email body | Giri et al. [25] | GloVe+CNN, and BERT+FCN |
| Email body | McGinley et al. [27] | CNN |
| Email body | Zannat et al. [26] | CNN, Bangla-BERT, Bi-LSTM |
| Email body | Ramprasath et al. [28] | RNN with LSTM cells |
| Email body | Valecha et al. [29] | Bi-LSTM |
| Email body | Paradkar [30] | LSTM, Bi-LSTM, CNN |
| Email body | Paliath et al. [31] | NN |
| Email body | Divakarla et al. [32] | LSTM, Bi-LSTM, CNN+RNN |
| Email body | Gholampour et al. [33] | ALBERT, ROBERTA, BERT, DEBERTA, DEBERT, SQ, and YOSO |
| Email body | Bountakas et al. [34] | BERT |
| Email body | Qachfar et al. [35] | BERT |
| Email body | Sachan et al. [36] | CNN+BiLSTM+GRU |
| Email body | Alhogail et al. [37] | GCN |
| Email body | Nicholas et al. [38] | CNN |
| Email body | AbdulNabi et al. [39] | BERT |
| Email body | Hina et al. [40] | LSTM-GRU |
| Header, subject, and email body | He et al. [41] | LSTM, Bi-LSTM |
| Header, subject, and email body | Aassal et al. [11] | Multiple models from AutoSklearn and TPOT |
| Header, subject, and email body | Alotaibi et al. [42] | CNN |
| Header, subject, and email body | Fang et al. [43] | RCNN (with Bi-LSTM) |
| Header, subject, and email body | Kaddoura et al. [44] | FFNN (MLP), BERT |
| Header, subject, and email body | Salloum et al. [45] | MLP |
| Header, subject, and email body | Jáñez-Martino et al. [46] | BERT |
| Header, subject, and email body | Doshi et al. [47] | ANN, CNN, RNN |
| Header, subject, and email body | Krishnamoorthy et al. [48] | DNN+BiLSTM |
| Header, subject, and email body | Borra et al. [49] | DLCNN |
| Header, subject, email body, and URL | Saka et al. [50] | BERT |
| Header, subject, email body, and URL | Magdy et al. [51] | ANN |
| Header, subject, email body, and URL | Bountakas et al. [52] | DT+KNN+MLP |
| Header, subject, email body, URL, and attachment | Muralidharan et al. [10] | BERT, CNN |
| Email structure, body, and URL | Lee et al. [7] | CNN-LSTM, BERT |

| Authors | DL Models | Results | Main Contribution | Weakness/Limitation |
|---|---|---|---|---|
| Chataut et al. [23] | LLM | Accuracy = 97.46%, F1 = 0.9668 | Demonstrates the potential of LLMs for phishing identification. | Cannot detect malicious links or attachments inside the email body. Not real-time processing. Requires high computational resources. Dataset is small and the model may overfit. |
| Bagui et al. [24] | LSTM, CNN, and Word Embedding DL | Word embedding DL: Accuracy = 98.89%, F1 = NA; DT, NB, SVM, CNN, LSTM: Accuracy < 97.50%, F1 = NA | Shows that the context of the email is important in detecting phishing emails. | Cannot detect malicious links or attachments inside the email body. |
| Giri et al. [25] | GloVe+CNN, and BERT+FCN | GloVe+CNN: Accuracy = 98%, F1 = 0.9749; BERT+FCN: Accuracy = 96%, F1 = 0.9576 | Compares combinations of word embedding techniques and DL architectures. | Cannot detect malicious links or attachments inside the email body. BERT model is limited to a maximum of 512 tokens (input length). |
| McGinley et al. [27] | CNN | Accuracy = 98.139%, F1 = 0.9819 | Ablation study to find the best-performing settings of a CNN architecture for phishing email text classification. | Cannot detect malicious links or attachments inside the email body. Dataset is small and the model may overfit. |
| Zannat et al. [26] | CNN, Bangla-BERT, Bi-LSTM | Bi-LSTM: Accuracy = 97%, F1 = 0.8889; CNN: Accuracy = 96.8%, F1 = 0.8745; Bangla-BERT: Accuracy = 96.4%, F1 = 0.8635; NB, KNN, DT, SVM, AdaBoost, RF: Accuracy < 93.6%, F1 = NA | New labeled dataset for Bangla emails. | Cannot detect malicious links or attachments inside the email body. No dropout layer to prevent overfitting. |
| Ramprasath et al. [28] | RNN with LSTM cells | RNN: Accuracy = 99.1%, F1 = 0.958; SVM: Accuracy = 98.2%, F1 = 0.932; CkNN: Accuracy = 98.1%, F1 = 0.928 | NA | Cannot detect malicious links or attachments inside the email body. No dropout layer to prevent overfitting. |
| Valecha et al. [29] | Bi-LSTM | Accuracy = 95.97%, F1 = 0.9569 | Phishing detection based on gain and loss persuasion cues in the text context. | Cannot detect malicious links or attachments inside the email body. Manual coding of persuasion cue labels and manual hyperparameter tuning. |
| Paradkar [30] | LSTM, Bi-LSTM, CNN | CNN: Accuracy = 98.05%, F1 = 0.9826; LSTM: Accuracy = 97.32%, F1 = 0.9786; Bi-LSTM: Accuracy = 98.04%, F1 = 0.9825; NB, LR, SVM, DT: Accuracy < 73.23%, F1 = NA | NA | Cannot detect malicious links or attachments inside the email body. |
| Paliath et al. [31] | NN | NN: Accuracy = 99.44%, F1 = 0.9915; SVM, NB, RS, RF, RT: Accuracy < 99.21%, F1 < 0.9878 | NA | Cannot detect malicious links or attachments inside the email body. Dataset is small and the model may overfit. Not scalable and may be limited in real-world applications. |
| Divakarla et al. [32] | LSTM, Bi-LSTM, CNN+RNN | LSTM: Accuracy = 98.8%, F1 = 0.987; Bi-LSTM: Accuracy = 95.4%, F1 = 0.95; CNN+RNN: Accuracy = 97.9%, F1 = 0.956 | NA | Cannot detect malicious links or attachments inside the email body. No dropout layer to prevent overfitting. |
| Gholampour et al. [33] | ALBERT, ROBERTA, BERT, DEBERTA, DEBERT, SQ, and YOSO | BERT and its variants: Accuracy = 98–99%, F1 = 0.92–0.97 | Developed a new adversarial ham/phish dataset. Proposed an ensemble method with KNN as a shield model to assign the correct label before feeding to DL models. | Cannot detect malicious links or attachments inside the email body. Requires high computational resources. Maximum input length of BERT and ALBERT is 512 tokens. Dataset is small and the model may overfit. |
| Bountakas et al. [34] | BERT | Balanced dataset: Word2Vec+RF: Accuracy = 98.95%, F1 = 0.9897; other combinations: Accuracy < 97%, F1 = 0.9744. Imbalanced dataset: Word2Vec+LR: Accuracy = 98.62%, F1 = 0.9241; other combinations: Accuracy < 98.42%, F1 = 0.8996 | Compares combinations of NLP techniques and ML models. | Cannot detect malicious links or attachments inside the email body. BERT model is limited to a maximum of 512 tokens. |
| Qachfar et al. [35] | BERT | BERT: F1 = 0.991 to 0.998; RF, DT, SVM, SGD, KNN, GNB, LR, LSTM, CNN: F1 = 0.72 to 0.99 | Proposes a method to reduce the impact of imbalanced data by adding synthetic training data. | Cannot detect malicious links or attachments inside the email body. BERT model is limited to a maximum of 512 tokens. |
| Sachan et al. [36] | CNN+BiLSTM+GRU | CNN+BiLSTM+GRU: Accuracy = 97.32%, F1 = 0.9545; NB, RF, KNN, SVM: Accuracy < 92.4%, F1 < 0.915 | Shows that stacked DL models performed better than ML and single DL models. | Cannot detect malicious links or attachments inside the email body. The three-block stacking model is complex and may need high computing resources. |
| Alhogail et al. [37] | GCN | Accuracy = 98.2%, F1 = 0.9855 | Proposes NLP+GCN. | Cannot detect malicious links or attachments inside the email body. No dropout layer to prevent overfitting. |
| Nicholas et al. [38] | CNN | Accuracy = 98.75%, F1 = NA | Uses Sand Cat Swarm Optimization (SCSO) to tune the weights in the CNN. | Cannot detect malicious links or attachments inside the email body. SCSO is computationally more expensive than other optimization techniques. |
| AbdulNabi et al. [39] | BERT | BERT: Accuracy = 97.30%, F1 = 0.9696; BiLSTM: Accuracy = 96.43%, F1 = 0.96; KNN and NB: Accuracy < 94%, F1 < 0.94 | NA | Cannot detect malicious links or attachments inside the email body. Maximum input sequence length is 300. |
| Hina et al. [40] | LSTM-GRU | LSTM+GRU: Accuracy = 95%, F1 = 0.95; LR, SVM, SGD, NB, RF: Accuracy < 92%, F1 < 0.90 | Shows that stacked DL models performed better than ML in multiclass classification. | Cannot detect malicious links or attachments inside the email body. |
| He et al. [41] | LSTM, Bi-LSTM | LSTM-XGB: Accuracy = 98.35%, F1 = 0.9824; L-SVM, L-GNB, L-DTC: Accuracy < 97%, F1 < 0.96 | Double-layer detection mechanism for both phishing and insider threats. | Cannot detect image links embedded in phishing emails. Dataset is small and the model may overfit. |
| Aassal et al. [11] | Multiple models from AutoSklearn and TPOT | With header: LR, SVM, Auto-Sklearn: Accuracy = 99.95%, F1 = 0.9995; DL: Accuracy = 99.85%, F1 = 0.9985. Without header: Auto-Sklearn: Accuracy = 99.09%, F1 = 0.9909; DL: Accuracy = 97.86%, F1 = 0.9789 | Proposed a new phishing research benchmarking framework (PhishBench). | Cannot detect malicious links or attachments inside the email body. |
| Alotaibi et al. [42] | CNN | Accuracy = 99.42%, F1 = 0.9917 | NA | Cannot detect malicious links or attachments inside the email body. No dropout layer to prevent overfitting. |
| Fang et al. [43] | RCNN (with Bi-LSTM) | RCNN (with Bi-LSTM): Accuracy = 99.84%, F1 = 0.9933; LSTM: Accuracy = 97.38%, F1 = 0.8783; CNN: Accuracy = 96.58%, F1 = 0.849 | Embedding at both character level and word level. | Cannot detect malicious links or attachments inside the email body. Maximum input sequence length is 300. |
| Kaddoura et al. [44] | FFNN (MLP), BERT | FFNN: Accuracy = NA, F1 = 0.9922 | NA | Cannot detect malicious links or attachments inside the email body. Purely feedforward. Limited ability to capture local patterns and analyze context. |
| Salloum et al. [45] | MLP | MLP: Accuracy = 94.63%, F1 = 0.9478; KNN, DT, LR, SVM, RF, NB, XGBoost: Accuracy < 93.7%, F1 = 0.9376 | New Arabic-English parallel corpus. | Cannot detect malicious links or attachments inside the email body. Purely feedforward. Limited ability to capture local patterns and analyze context. |
| Jáñez-Martino et al. [46] | BERT | TF-IDF+LR: Accuracy = 94.6%, F1 = 0.953; BERT+LR: Accuracy = 94.2%, F1 = 0.939 | Proposes the SPEMC-15K-E and SPEMC-15K-S datasets and multiclass classification; uses OCR to read text in images within the HTML email body. | Cannot detect malicious links or attachments inside the email body. Limited ability to capture local patterns and analyze context. |
| Doshi et al. [47] | ANN, CNN, RNN | Dual-layer CNN: Accuracy = 99.40%, F1 = 0.992; dual-layer RNN: Accuracy = 99.10%, F1 = 0.995; dual-layer ANN: Accuracy = 99.51%, F1 = 0.989; traditional ML models: Accuracy < 98.5%, F1 = 0.978 | Dual-layer approach to overcome class imbalance. | Cannot detect malicious links or attachments inside the email body. Splitting data by class label to train separate models may lead to overfitting on the majority class. |
| Krishnamoorthy et al. [48] | DNN+BiLSTM | DNN-BiLSTM: Accuracy = 98.69%, F1 = 0.9869; LR, RF, RNN, CNN, LSTM: Accuracy < 96.39%, F1 = NA | AES encryption in the preliminary stage. | Cannot detect malicious links or attachments inside the email body. |
| Borra et al. [49] | DLCNN | DLCNN: Accuracy = 98.43%, F1 = 0.9707; LR, SVM, NB, AdaBoost: Accuracy < 89%, F1 = 0.8066 | Multiclass classification combining PCA for feature extraction, PSO for feature selection, and DLCNN for classification. | Cannot detect malicious links or attachments inside the email body. PSO+DLCNN may be computationally expensive. No dropout layer to prevent overfitting. |
| Saka et al. [50] | BERT | BERT+DBSCAN: Accuracy = 99.2%, F1 = NA; BERT+Agglomerative: Accuracy = 98.7%, F1 = NA; BERT+K-Means: Accuracy = 98.0%, F1 = NA | Compares combinations of BERT and unsupervised clustering algorithms. | Cannot detect malicious attachments inside the email body. Manual labeling. Needs feature extraction and selection processes. BERT model is limited to a maximum of 512 tokens. |
| Magdy et al. [51] | ANN | Accuracy = 99.94%, F1 = 0.9935 | Presents multiclass classification with fast training time (max 78.6 milliseconds). | Cannot detect malicious attachments inside the email body. Purely feedforward. Limited ability to capture local patterns and analyze context. |
| Bountakas et al. [52] | DT+KNN+MLP | KNN-DT+ArgMax: Accuracy = 99.43%, F1 = 0.9942; KNN-DT+MLP: Accuracy = 99.07%, F1 = 0.9907; LR, GNB, KNN, DT, RF, MLP: Accuracy < 98.6%, F1 < 0.9856 | Presents ensemble learning to train hybrid features with fast training time (31 milliseconds). | Cannot detect malicious attachments inside the email body. Limited adaptability to unseen features. |
| Muralidharan et al. [10] | BERT, CNN | Accuracy = 99.2%, F1 = 0.941 | Ensemble learning to analyze all email segments, including attachments. | Inference takes time to process and may need high computing resources. |
| Lee et al. [7] | CNN-LSTM, BERT | RF-BERT+RF-CNN+LSTM: AUPRC = 0.9997, F1 = NA; RF-Word2Vec+LSTM-CNN+LSTM: AUPRC = 0.9851, F1 = NA | Proposes a modular architecture to analyze all components of an email except attachments. | Cannot detect malicious attachments inside the email body. BERT model is limited to a maximum of 512 tokens (input length). Requires high computational resources. |
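Several rows above list the 512-token input limit of BERT as a weakness. One common workaround, not taken from any of the reviewed studies, is to split a long email into overlapping windows, score each window, and aggregate the scores. The sketch below shows only the windowing step and a max aggregation; the function names, the default overlap of 64 tokens, and the max rule are assumptions chosen for illustration.

```python
from typing import List, Sequence


def sliding_windows(
    tokens: Sequence[int], max_len: int = 512, overlap: int = 64
) -> List[List[int]]:
    """Split a token-id sequence into overlapping windows of at most max_len.

    BERT's 512-token limit includes the [CLS] and [SEP] special tokens, so a
    caller that adds them afterwards would pass max_len=510.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    if len(tokens) == 0:
        return []

    step = max_len - overlap
    windows: List[List[int]] = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the email
        start += step
    return windows


def aggregate_max(window_scores: Sequence[float]) -> float:
    """Assumed rule: an email is as suspicious as its most suspicious window."""
    if not window_scores:
        raise ValueError("at least one window score is required")
    return max(window_scores)


if __name__ == "__main__":
    token_ids = list(range(1000))  # stand-in for a tokenized long email
    chunks = sliding_windows(token_ids)
    print([len(chunk) for chunk in chunks])
```

Taking the maximum favors recall, since a single suspicious passage is enough to flag the email; averaging the window scores would be a more conservative alternative.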
3.2. Classification by Header, Subject, and Email Body
3.3. Classification by Header, Subject, Email Body, and URL
3.4. Classification by Header, Subject, Email Body, URL, and Attachment
3.5. Classification by Email Structure, Body, and URL
4. Limitations
4.1. Dataset
4.2. Future Engineering
4.3. Flexible and Robust System
4.4. Sophisticated Phishing Techniques
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ANN | Artificial Neural Network |
| BERT | Bidirectional Encoder Representations from Transformers |
| Bi-LSTM | Bidirectional Long Short-Term Memory |
| CNN | Convolutional Neural Network |
| FFNN | Feedforward Neural Network |
| GCN | Graph Convolutional Network |
| GRU | Gated Recurrent Unit |
| LLM | Large Language Model |
| LSTM | Long Short-Term Memory |
| MLP | Multilayer Perceptron |
| NN | Neural Network |
| RCNN | Recurrent Convolutional Neural Network |
| RNN | Recurrent Neural Network |
Appendix A
| Datasets | Size | Authors | 
|---|---|---|
| Phishing email detection dataset | Total 828 (ham 504, phishing 324) | Chataut et al. [23] | 
| Private dataset | Total 18,366 (ham 14,950, phish 3416) | Bagui et al. [24] | 
| Ling-Spam, Enron-Spam, Enron, SpamAssassin, and Nazario | Total 22,965 (ham 15,502, phish 7463) | Giri et al. [25] | 
| Enron-Spam, Enron, and Nazario | Total 3804 (ham 1870, phish 1934) | McGinley and Monroy [27] | 
| Private dataset (EES2020) | Total 291,702 (ham 224,137, phish 67,565) | Lee et al. [7] | 
| Private dataset | Total 5572 (ham 4572, phish 1000) | Zannat et al. [26] | 
| Enron and monkey.org | Total 4000 (ham 2000, phish 2000) | He et al. [41] | 
| Kaggle | Total 5572 | Ramprasath et al. [28] | 
| Wikileaks, Enron, SpamAssassin, and Nazario | Total 22,000 (ham 10,500, phish 10,500) | Aassal et al. [11] |
| PhishingCorpus and SpamAssassin | Total 6428 (ham 4150, phish 2278) | Alotaibi et al. [42] | 
| IWSPA-AP, Wikileaks, Enron, SpamAssassin, and Nazario | Total 8780 (ham 7781, phish 999) | Fang et al. [43] | 
| Millersmile and Enron corpus | Total 38,084 (ham 19,661, phish 18,423) | Valecha et al. [29] | 
| Enron corpus | Total 20,000 (ham 11,664, phish 8336) | Paradkar [30] | 
| Nazario and SpamAssassin | Total 1256 (ham 842, phish 414) | Paliath et al. [31] |
| Kaggle and PhishTank | Total 5171 (ham 3672, phish 1499) | Divakarla and Chandrasekaran [32] | 
| Enron | Total 32,638 (ham 16,094, phish 16,544) | Kaddoura et al. [44] | 
| IWSPA 2.0 and generated dataset | Total 7286 (ham 5692, phish 1594) | Gholampour and Verma [33] | 
| Nazario and Enron | Total 15,407 (ham 14,000, phish 1407) | Bountakas et al. [34] | 
| Nazario and Enron | Total 4472 (ham 2193, phish 2279) | Saka et al. [50] | 
| SpamAssassin, Enron, Wikileaks, and Nazario | Total 21,000 (ham 10,500, phish 10,500) | Qachfar et al. [35] |
| IWSPA-AP and Arabic-translated | Total 84,033 (ham 47,692, phish 36,341) | Salloum et al. [45] |
| Enron corpora, Phished emails corpora, and Hate Speech & Offensive | Total 42,153 (ham 12,498, harassment 19,190, suspicious 5323, phish 5142) | Sachan et al. [36] |
| Radev | Total 8579 (ham 4894, phish 3685) | Alhogail and Alsabih [37] | 
| SPEMC-15K-E, and SPEMC-15K-S | Spam 15,000 each | Jáñez-Martino et al. [46] | 
| Nazario and SpamAssassin | Total 5554 (ham 2664, phish 4204, spam 1350) | Doshi et al. [47] |
| UCI ML | Total 23,386 (ham 14,011, phish 4864, spam 4511) | Magdy et al. [51] | 
| UCI ML | Total 9120 (ham 1200, phish 7920) | Nicholas and Nirmalrani [38] | 
| Enron, SpamAssassin, and Nazario | Total 35,511 (ham 32,051, phish 3460) | Bountakas and Xenakis [52] | 
| VirusTotal | Total 32,676 (ham 20,037, phish 9996) | Muralidharan and Nissim [10] | 
| Enron | Total 33,727 (ham 16,563, phish 17,188) | Krishnamoorthy et al. [48] | 
| UCI ML, CSDMC, and SpamAssassin | NA | Borra et al. [49] | 
| UCI ML and SpamFilter | Total 5000 (ham 3000, spam 2000) | AbdulNabi and Yaseen [39] | 
| Enron, CLAIR, and Hate Speech & Offensive | Total 32,427 (ham 9001, harassment 9138, suspicious 5287, phish 9001) | Hina et al. [40] | 
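Many of the datasets listed above are imbalanced, so accuracy alone can be misleading. As a rough illustration, the sketch below computes what a trivial classifier that always predicts the majority class would score on a dataset with the given ham and phishing counts. The function name and the choice of phishing as the positive class are assumptions; the counts in the usage lines are copied from the table.

```python
def majority_baseline(ham: int, phish: int) -> dict[str, float]:
    """Score a classifier that always predicts the larger class.

    Phishing is treated as the positive class when computing F1.
    """
    total = ham + phish
    if total == 0:
        raise ValueError("dataset must contain at least one email")

    if phish > ham:
        # Everything is predicted as phishing.
        tp, fp, fn = phish, ham, 0
    else:
        # Everything is predicted as ham, so no phishing email is caught.
        tp, fp, fn = 0, 0, phish

    accuracy = max(ham, phish) / total
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {"accuracy": accuracy, "phish_f1": f1}


if __name__ == "__main__":
    # Counts taken from the table above.
    print("Bountakas et al. [34]:", majority_baseline(ham=14_000, phish=1_407))
    print("Nicholas and Nirmalrani [38]:", majority_baseline(ham=1_200, phish=7_920))
```

On the Bountakas et al. [34] counts this baseline reaches an accuracy of about 0.909 with a phishing F1 of 0, so a reported accuracy is best read against the class ratio of the dataset on which it was measured.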
References
1. Anti-Phishing Working Group (APWG). Phishing Activity Trends Report: 4th Quarter 2023. 2023. Available online: https://www.apwg.org/trendsreports/ (accessed on 27 February 2024).
2. Federal Bureau of Investigation (FBI). 2022 Internet Crime Report. 2022. Available online: https://www.ic3.gov/Media/PDF/AnnualReport/2022_IC3Report.pdf (accessed on 27 February 2024).
3. Check Point Research. 2023 Cyber Security Report. 2023. Available online: https://resources.checkpoint.com/report/2023-check-point-cyber-security-report (accessed on 23 February 2024).
4. Verizon. Data Breach Investigations Report 2022. 2022. Available online: https://www.phishingbox.com/downloads/Verizon-Data-Breach-Investigations-Report-DBIR-2022.pdf (accessed on 23 February 2024).
5. Yamin, M.M.; Ullah, M.; Ullah, H.; Katt, B. Weaponized AI for cyber attacks. J. Inf. Secur. Appl. 2021, 57, 102722.
6. Kocher, G.; Kumar, G. Machine learning and deep learning methods for intrusion detection systems: Recent developments and challenges. Soft Comput. 2021, 25, 9731–9763.
7. Lee, J.; Tang, F.; Ye, P.; Abbasi, F.; Hay, P.; Divakaran, D.M. D-Fence: A flexible, efficient, and comprehensive phishing email detection system. In Proceedings of the 2021 IEEE European Symposium on Security and Privacy (EuroS&P), Vienna, Austria, 6–10 September 2021; pp. 578–597.
8. Apruzzese, G.; Colajanni, M.; Ferretti, L.; Guido, A.; Marchetti, M. On the effectiveness of machine and deep learning for cyber security. In Proceedings of the 2018 10th International Conference on Cyber Conflict (CyCon), Tallinn, Estonia, 29 May–1 June 2018; pp. 371–390.
9. Ahmad, R.; Alsmadi, I. Machine learning approaches to IoT security: A systematic literature review. Internet Things 2021, 14, 100365.
10. Muralidharan, T.; Nissim, N. Improving malicious email detection through novel designated deep-learning architectures utilizing entire email. Neural Netw. 2023, 157, 257–279.
11. El Aassal, A.; Baki, S.; Das, A.; Verma, R.M. An in-depth benchmarking and evaluation of phishing detection research for security needs. IEEE Access 2020, 8, 22170–22192.
12. Odeh, A.; Keshta, I.; Abdelfattah, E. Machine learning techniques for detection of website phishing: A review for promises and challenges. In Proceedings of the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC), Virtual, 27–30 January 2021; pp. 0813–0818.
13. Zaimi, R.; Hafidi, M.; Lamia, M. Survey paper: Taxonomy of website anti-phishing solutions. In Proceedings of the 2020 Seventh International Conference on Social Networks Analysis, Management and Security (SNAMS), Paris, France, 14–16 December 2020; pp. 1–8.
14. Zaimi, R.; Hafidi, M.; Lamia, M. A literature survey on anti-phishing in websites. In Proceedings of the 4th International Conference on Networking, Information Systems & Security, Kenitra, Morocco, 1–2 April 2021; pp. 1–7.
15. Tang, L.; Mahmoud, Q.H. A survey of machine learning-based solutions for phishing website detection. Mach. Learn. Knowl. Extr. 2021, 3, 672–694.
16. Aung, E.S.; Zan, C.T.; Yamana, H. A survey of URL-based phishing detection. In DEIM Forum; 2019; pp. G2–G3. Available online: https://db-event.jpn.org/deim2019/post/papers/201.pdf (accessed on 8 April 2024).
17. Benavides, E.; Fuertes, W.; Sanchez, S.; Sanchez, M. Classification of phishing attack solutions by employing deep learning techniques: A systematic literature review. In Developments and Advances in Defense and Security: Proceedings of MICRADS 2019; Springer: Berlin/Heidelberg, Germany, 2020; pp. 51–64.
18. Al-Yozbaky, R.S.; Alanezi, M. A Review of Different Content-Based Phishing Email Detection Methods. In Proceedings of the 2023 9th International Engineering Conference on Sustainable Technology and Development (IEC), Erbil, Iraq, 21–23 February 2023; pp. 20–25.
19. Salloum, S.; Gaber, T.; Vadera, S.; Shaalan, K. A systematic literature review on phishing email detection using natural language processing techniques. IEEE Access 2022, 10, 65703–65727.
20. Quang, D.N.; Selamat, A.; Krejcar, O. Recent research on phishing detection through machine learning algorithm. In Proceedings of the Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices: 34th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2021, Kuala Lumpur, Malaysia, 26–29 July 2021; Proceedings, Part I 34. Springer: Berlin/Heidelberg, Germany, 2021; pp. 495–508.
21. Kitchenham, B.; Brereton, O.P.; Budgen, D.; Turner, M.; Bailey, J.; Linkman, S. Systematic literature reviews in software engineering—A systematic literature review. Inf. Softw. Technol. 2009, 51, 7–15.
22. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372.
23. Chataut, R.; Gyawali, P.K.; Usman, Y. Can AI Keep You Safe? A Study of Large Language Models for Phishing Detection. In Proceedings of the 2024 IEEE 14th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2024; pp. 0548–0554.
24. Bagui, S.; Nandi, D.; Bagui, S.; White, R.J. Classifying phishing email using machine learning and deep learning. In Proceedings of the 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security), Oxford, UK, 3–4 June 2019; pp. 1–2.
25. Giri, S.; Banerjee, S.; Bag, K.; Maiti, D. Comparative Study of Content-Based Phishing Email Detection Using Global Vector (GloVe) and Bidirectional Encoder Representation from Transformer (BERT) Word Embedding Models. In Proceedings of the 2022 First International Conference on Electrical, Electronics, Information and Communication Technologies (ICEEICT), Trichy, India, 16–18 February 2022; pp. 01–06.
26. Zannat, R.; Mumu, A.A.; Khan, A.R.; Mubashshira, T.; Mahmud, S.R. A Deep Learning-Based Approach for Detecting Bangla Spam Emails. In Proceedings of the 2023 3rd International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), Tenerife, Spain, 19–21 July 2023; pp. 1–6.
27. McGinley, C.; Monroy, S.A.S. Convolutional neural network optimization for phishing email classification. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021; pp. 5609–5613.
28. Ramprasath, J.; Priyanka, S.; Manudev, R.; Gokul, M. Identification and mitigation of phishing email attacks using deep learning. In Proceedings of the 2023 3rd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 12–13 May 2023; pp. 466–470.
29. Valecha, R.; Mandaokar, P.; Rao, H.R. Phishing email detection using persuasion cues. IEEE Trans. Dependable Secur. Comput. 2021, 19, 747–756.
30. Paradkar, N.S. Phishing Email’s Detection Using Machine Learning and Deep Learning. In Proceedings of the 2023 3rd International Conference on Advances in Computing, Communication, Embedded and Secure Systems (ACCESS), Ernakulam, India, 18–20 May 2023; pp. 160–162.
31. Paliath, S.; Qbeitah, M.A.; Aldwairi, M. PhishOut: Effective phishing detection using selected features. In Proceedings of the 2020 27th International Conference on Telecommunications (ICT), Bali, Indonesia, 5–7 October 2020; pp. 1–5.
32. Divakarla, U.; Chandrasekaran, K. Predicting Phishing Emails and Websites to Fight Cybersecurity Threats Using Machine Learning Algorithms. In Proceedings of the 2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON), Bangalore, India, 29–31 December 2023; pp. 1–10.
33. Mehdi Gholampour, P.; Verma, R.M. Adversarial robustness of phishing email detection models. In Proceedings of the 9th ACM International Workshop on Security and Privacy Analytics, Charlotte, NC, USA, 26 April 2023; pp. 67–76.
34. Bountakas, P.; Koutroumpouchos, K.; Xenakis, C. A Comparison of Natural Language Processing and Machine Learning Methods for Phishing Email Detection. In Proceedings of the 16th International Conference on Availability, Reliability and Security, New York, NY, USA, 17–20 August 2021. ARES ’21.
35. Qachfar, F.Z.; Verma, R.M.; Mukherjee, A. Leveraging synthetic data and PU learning for phishing email detection. In Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, Baltimore, MD, USA, 25–27 April 2022; pp. 29–40.
36. Sachan, S.; Doulani, K.; Adhikari, M. Semantic Analysis and Classification of Emails through Informative Selection of Features and Ensemble AI Model. In Proceedings of the 2023 Fifteenth International Conference on Contemporary Computing, Noida, India, 3–5 August 2023; pp. 181–187.
37. Alhogail, A.; Alsabih, A. Applying machine learning and natural language processing to detect phishing email. Comput. Secur. 2021, 110, 102414.
38. Nicholas, N.N.; Nirmalrani, V. An enhanced mechanism for detection of spam emails by deep learning technique with bio-inspired algorithm. e-Prime-Adv. Electr. Eng. Electron. Energy 2024, 8, 100504.
39. AbdulNabi, I.; Yaseen, Q. Spam Email Detection Using Deep Learning Techniques. Procedia Comput. Sci. 2021, 184, 853–858.
40. Hina, M.; Ali, M.; Javed, A.R.; Ghabban, F.; Khan, L.A.; Jalil, Z. Sefaced: Semantic-based forensic analysis and classification of e-mail data using deep learning. IEEE Access 2021, 9, 98398–98411.
41. He, D.; Lv, X.; Xu, X.; Chan, S.; Choo, K.K.R. Double-layer Detection of Internal Threat in Enterprise Systems Based on Deep Learning. IEEE Trans. Inf. Forensics Secur. 2024, 19, 4741–4751.
42. Alotaibi, R.; Al-Turaiki, I.; Alakeel, F. Mitigating email phishing attacks using convolutional neural networks. In Proceedings of the 2020 3rd International Conference on Computer Applications & Information Security (ICCAIS), Riyadh, Saudi Arabia, 19–21 March 2020; pp. 1–6.
43. Fang, Y.; Zhang, C.; Huang, C.; Liu, L.; Yang, Y. Phishing email detection using improved RCNN model with multilevel vectors and attention mechanism. IEEE Access 2019, 7, 56329–56340.
44. Kaddoura, S.; Alfandi, O.; Dahmani, N. A spam email detection mechanism for English language text emails using deep learning approach. In Proceedings of the 2020 IEEE 29th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), Virtual, 4–6 November 2020; pp. 193–198.
45. Salloum, S.; Gaber, T.; Vadera, S.; Shaalan, K. A New English/Arabic Parallel Corpus for Phishing Emails. ACM Trans. Asian Low Resour. Lang. Inf. Process. 2023, 22, 1–17.
46. Jáñez-Martino, F.; Alaiz-Rodríguez, R.; González-Castro, V.; Fidalgo, E.; Alegre, E. Classifying spam emails using agglomerative hierarchical clustering and a topic-based approach. Appl. Soft Comput. 2023, 139, 110226.
47. Doshi, J.; Parmar, K.; Sanghavi, R.; Shekokar, N. A comprehensive dual-layer architecture for phishing and spam email detection. Comput. Secur. 2023, 133, 103378.
48. Krishnamoorthy, P.; Sathiyanarayanan, M.; Proença, H.P. A novel and secured email classification and emotion detection using hybrid deep neural network. Int. J. Cogn. Comput. Eng. 2024, 5, 44–57.
49. Borra, S.R.; Yukthika, M.; Bhargavi, M.; Samskruthi, M.; Saisri, P.V.; Akhila, Y.; Alekhya, S. OECNet: Optimal feature selection-based email classification network using unsupervised learning with deep CNN model. e-Prime-Adv. Electr. Eng. Electron. Energy 2024, 7, 100415.
50. Saka, T.; Vaniea, K.; Kökciyan, N. Context-Based Clustering to Mitigate Phishing Attacks. In Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security, Los Angeles, CA, USA, 11 November 2022; AISec’22. pp. 115–126.
51. Magdy, S.; Abouelseoud, Y.; Mikhail, M. Efficient spam and phishing emails filtering based on deep learning. Comput. Netw. 2022, 206, 108826.
52. Bountakas, P.; Xenakis, C. Helphed: Hybrid ensemble learning phishing email detection. J. Netw. Comput. Appl. 2023, 210, 103545.
53. Koshute, P.; Zook, J.; McCulloh, I. Recommending training set sizes for classification. arXiv 2021, arXiv:2102.09382.
54. Ying, X. An overview of overfitting and its solutions. J. Phys. Conf. Ser. 2019, 1168, 022022.


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kyaw, P.H.; Gutierrez, J.; Ghobakhlou, A. A Systematic Review of Deep Learning Techniques for Phishing Email Detection. Electronics 2024, 13, 3823. https://doi.org/10.3390/electronics13193823