MDPI - Publisher of Open Access Journals

19 pages, 1339 KiB

Open Access | Feature Paper | Article

Convolutional Graph Network-Based Feature Extraction to Detect Phishing Attacks

by Saif Safaa Shakir, Leyli Mohammad Khanli and Hojjat Emami

Future Internet 2025, 17(8), 331; https://doi.org/10.3390/fi17080331 - 25 Jul 2025

Viewed by 331

Phishing attacks pose significant risks to security, drawing considerable attention from both security professionals and customers. Despite extensive research, current phishing website detection mechanisms often fail to efficiently diagnose unknown attacks due to their poor performance in the feature selection stage. Many techniques suffer from overfitting when working with large datasets. To address this issue, we propose a feature selection strategy based on a convolutional graph network, which utilizes a dataset containing both labels and features, along with hyperparameters for a Support Vector Machine (SVM) and a graph neural network (GNN). Our technique consists of three main stages: (1) preprocessing the data by dividing them into testing and training sets, (2) constructing a graph from pairwise feature distances using the Manhattan distance and adding self-loops to nodes, and (3) implementing a GraphSAGE model with node embeddings and training the GNN by updating the node embeddings through message passing from neighbors, calculating the hinge loss, applying the softmax function, and updating weights via backpropagation. Additionally, we compute the neighborhood random walk (NRW) distance using a random walk with restart to create an adjacency matrix that captures the node relationships. The node features are ranked based on gradient significance to select the top k features, and the SVM is trained using the selected features, with the hyperparameters tuned through cross-validation. We evaluated our model on a test set, calculating the performance metrics and validating its effectiveness on the PhishGNN dataset. Our model achieved a precision of 90.78%, an F1-score of 93.79%, a recall of 97%, and an accuracy of 93.53%, outperforming the existing techniques. Full article

(This article belongs to the Section Cybersecurity)
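
The graph-construction and random-walk steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the distance `threshold`, the `restart` probability, and the function names are assumptions made for illustration, and the abstract does not say how edges are chosen from the Manhattan distances.

```python
from typing import List


def manhattan(a: List[float], b: List[float]) -> float:
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_adjacency(points: List[List[float]], threshold: float) -> List[List[float]]:
    """Connect nodes whose Manhattan distance is within `threshold` (assumed rule)
    and add a self-loop to every node."""
    n = len(points)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop
        for j in range(i + 1, n):
            if manhattan(points[i], points[j]) <= threshold:
                adj[i][j] = adj[j][i] = 1.0
    return adj


def random_walk_with_restart(adj: List[List[float]], seed: int,
                             restart: float = 0.15, iters: int = 100) -> List[float]:
    """Iterate p <- (1 - restart) * p * T + restart * e_seed,
    where T is the row-normalized adjacency matrix."""
    n = len(adj)
    # Self-loops guarantee every row sum is positive.
    trans = [[v / sum(row) for v in row] for row in adj]
    p = [0.0] * n
    p[seed] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - restart) * p[i] * trans[i][j]
        nxt[seed] += restart
        p = nxt
    return p


points = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
adj = build_adjacency(points, threshold=2.0)
scores = random_walk_with_restart(adj, seed=0)
```

In this toy input the third node is far from the other two, so it is unreachable from the seed and its visiting probability stays at zero. Visiting probabilities of this kind are one way to express node proximity in an adjacency-like matrix.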

28 pages, 1444 KiB

Open Access | Article

Enhancing Cryptocurrency Security: Leveraging Embeddings and Large Language Models for Creating Cryptocurrency Security Expert Systems

by Ahmed A. Abdallah, Heba K. Aslan, Mohamed S. Abdallah, Young-Im Cho and Marianne A. Azer

Symmetry 2025, 17(4), 496; https://doi.org/10.3390/sym17040496 - 26 Mar 2025

Viewed by 1548

In recent years, the rapid growth of cryptocurrency markets has highlighted the urgent need for advanced security solutions capable of addressing a spectrum of unique threats, from phishing and wallet hacks to complex blockchain vulnerabilities. This paper presents a comprehensive approach to fortifying cryptocurrency systems by harnessing the structural symmetry inherent in transactional patterns. By leveraging local large language models (LLMs), embeddings, and vector databases, we develop an intelligent and scalable security expert system that exploits symmetry-based anomaly detection to enhance threat identification. Cryptocurrency networks face increasing threats from sophisticated attacks that often exploit asymmetric vulnerabilities. To counteract these risks, we propose a novel security expert system that integrates symmetry-aware analysis through LLMs and advanced embedding techniques. Our system efficiently captures symmetrical transaction patterns, enabling robust detection of anomalies and threats while preserving structural integrity. By integrating a modular framework with LangChain and a vector database (Chroma DB), we achieve improved accuracy, recall, and precision by leveraging the symmetry of transaction distributions and behavioral patterns. This work sets a new benchmark for LLM-driven cybersecurity solutions, offering a scalable and adaptive approach to reinforcing the security symmetry in cryptocurrency systems. The proposed expert system was evaluated using a benchmark dataset of cryptocurrency transactions, including real-world threat scenarios involving phishing, fraudulent transactions, and blockchain anomalies. The system achieved an accuracy of 92%, a precision of 89%, and a recall of 93%, demonstrating a 10% improvement over existing security frameworks. Compared to traditional rule-based and machine learning-based detection methods, our approach significantly enhances real-time threat detection while reducing false positives. The integration of LLMs with embeddings and vector retrieval enables more efficient contextual anomaly detection, setting a new benchmark for AI-driven security solutions in the cryptocurrency domain. Full article

(This article belongs to the Special Issue Information Security in AI)
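
The embedding-plus-vector-retrieval idea mentioned in this abstract can be sketched in plain Python. This is only an illustration of nearest-neighbor lookup by cosine similarity; it does not use LangChain or Chroma DB, and the three-dimensional vectors, the document names, and the `top_k` helper are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: List[float], store: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(store.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy "vector database": hypothetical document names mapped to made-up embeddings.
store = {
    "phishing_note": [1.0, 0.0, 0.0],
    "wallet_note": [0.0, 1.0, 0.0],
    "mixed_note": [1.0, 1.0, 0.0],
}
hits = top_k([1.0, 0.1, 0.0], store, k=2)
```

In a retrieval-augmented setup, the documents returned by such a lookup would be passed to the language model as context for its answer.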

30 pages, 3133 KiB

Open Access | Article

In-Depth Analysis of Phishing Email Detection: Evaluating the Performance of Machine Learning and Deep Learning Models Across Multiple Datasets

by Abeer Alhuzali, Ahad Alloqmani, Manar Aljabri and Fatemah Alharbi

Appl. Sci. 2025, 15(6), 3396; https://doi.org/10.3390/app15063396 - 20 Mar 2025

Cited by 2 | Viewed by 6456

Phishing emails remain a primary vector for cyberattacks, necessitating advanced detection mechanisms. Existing studies often focus on limited datasets or a small number of models, lacking a comprehensive evaluation approach. This study develops a novel framework for implementing and testing phishing email detection models to address this gap. A total of fourteen machine learning (ML) and deep learning (DL) models are evaluated across ten datasets, including nine publicly available datasets and a merged dataset created for this study. The evaluation is conducted using multiple performance metrics to ensure a comprehensive comparison. Experimental results demonstrate that DL models consistently outperform their ML counterparts in both accuracy and robustness. Notably, transformer-based models BERT and RoBERTa achieve the highest detection accuracies of 98.99% and 99.08%, respectively, on the balanced merged dataset, outperforming traditional ML approaches by an average margin of 4.7%. These findings highlight the superiority of DL in phishing detection and emphasize the potential of AI-driven solutions in strengthening email security systems. This study provides a benchmark for future research and sets the stage for advancements in cybersecurity innovation. Full article

18 pages, 343 KiB

Open Access | Editor’s Choice | Article

Comparative Investigation of Traditional Machine-Learning Models and Transformer Models for Phishing Email Detection

by René Meléndez, Michal Ptaszynski and Fumito Masui

Electronics 2024, 13(24), 4877; https://doi.org/10.3390/electronics13244877 - 11 Dec 2024

Cited by 4 | Viewed by 4661

Phishing emails pose a significant threat to cybersecurity worldwide. There are already tools that mitigate the impact of these emails by filtering them, but these tools are only as reliable as their ability to detect new formats and techniques for creating phishing emails. In this paper, we investigated how traditional models and transformer models perform on the task of classifying whether an email is phishing or not. We found that transformer models, in particular distilBERT, BERT, and roBERTa, had a significantly higher performance than traditional models such as Logistic Regression, Random Forest, Support Vector Machine, and Naive Bayes. The process consisted of using a large and robust dataset of emails and applying preprocessing and optimization techniques to obtain the best possible results. roBERTa showed an outstanding capacity to identify phishing emails, achieving a maximum accuracy of 0.9943. Traditional models were still successful but performed marginally worse; SVM performed the best among them, with an accuracy of 0.9876. The results emphasize the value of sophisticated text-processing methods and the potential of transformer models to improve email security by thwarting phishing attempts. Full article

(This article belongs to the Special Issue Advanced Natural Language Processing Technology and Applications)

10 pages, 1927 KiB

Open Access | Proceeding Paper

AI-Driven Vishing Attacks: A Practical Approach

by Fabricio Toapanta, Belén Rivadeneira, Christian Tipantuña and Danny Guamán

Eng. Proc. 2024, 77(1), 15; https://doi.org/10.3390/engproc2024077015 - 18 Nov 2024

Cited by 1 | Viewed by 2742

Today, there are many security problems at the technological level, especially in telecommunications. Cybercriminals break into systems and steal data using attack vectors such as phishing through scam mail, fake websites and phone calls. This latter form of phishing is called vishing (phishing using voice). Through vishing and social engineering techniques, attackers can impersonate family members or friends of potential victims to obtain information, money, or another specific objective. Traditionally, to carry out vishing attacks, attackers imitated the vocabulary, voice and tone of a person known to the victim. However, with current artificial intelligence (AI) tools, obtaining synthetic voices similar or identical to that of the person to be impersonated is more straightforward and precise. In this regard, this paper, using ChatGPT and three AI-enabled applications for voice synthesis, presents a practical approach for deploying vishing attacks in an academic environment to identify the limitations, implications and possible countermeasures to mitigate the effects on Internet users. Results demonstrate the effectiveness of vishing attacks and the maturity level of the employed AI tools. Full article

(This article belongs to the Proceedings of The XXXII Conference on Electrical and Electronic Engineering)

18 pages, 4563 KiB

Open Access | Article

Kashif: A Chrome Extension for Classifying Arabic Content on Web Pages Using Machine Learning

by Malak Aljabri, Hanan S. Altamimi, Shahd A. Albelali, Maimunah Al-Harbi, Haya T. Alhuraib, Najd K. Alotaibi, Amal A. Alahmadi, Fahd Alhaidari and Rami Mustafa A. Mohammad

Appl. Sci. 2024, 14(20), 9222; https://doi.org/10.3390/app14209222 - 11 Oct 2024

Cited by 1 | Viewed by 1670

Search engines are significant tools for finding and retrieving information. Every day, many new web pages in various languages are added. The threats of cyberattacks are expanding rapidly with this massive volume of data. The majority of studies on the detection of malicious websites focus on English-language websites, which calls for more studies on malicious-website detection for Arabic content. In this research, we aimed to investigate the security of Arabic-content websites by developing a detection tool that analyzes Arabic content based on artificial intelligence (AI) techniques. We contributed to the field of cybersecurity and AI by building a new dataset of 4048 Arabic-content websites. We created and conducted a comparative performance evaluation for four different machine-learning (ML) models using feature extraction and selection techniques: extreme gradient boosting, support vector machines, decision trees, and random forests. The best-performing model, a random forest (RF) using features selected via the chi-square technique, was then integrated into a Chrome plugin. The resulting plugin attained an accuracy of 92.96% for classifying Arabic-content websites as phishing, suspicious, or benign. To our knowledge, this is the first tool designed specifically for Arabic-content websites. Full article

(This article belongs to the Special Issue Data Mining and Machine Learning in Cybersecurity)
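
Chi-square feature selection, as mentioned in this abstract, scores each feature by how strongly its values depend on the class label. The following is a minimal sketch of that statistic for categorical features; the feature names, the toy data, and the `rank_features` helper are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def chi_square(feature_values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Pearson chi-square statistic of the feature/label contingency table."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature_values, labels))
    stat = 0.0
    for f, f_count in feature_counts.items():
        for l, l_count in label_counts.items():
            expected = f_count * l_count / n
            observed = joint_counts.get((f, l), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def rank_features(features: Dict[str, Sequence[Hashable]],
                  labels: Sequence[Hashable]) -> List[str]:
    """Feature names sorted from most to least label-dependent."""
    return sorted(features, key=lambda name: chi_square(features[name], labels),
                  reverse=True)


# Toy data: hypothetical binary features for four pages.
labels = ["phishing", "phishing", "benign", "benign"]
features = {
    "has_login_form": [1, 1, 0, 0],   # perfectly aligned with the label
    "has_footer": [1, 0, 1, 0],       # independent of the label
}
ranking = rank_features(features, labels)
```

A selector would keep the top-ranked features and pass only those to the classifier, here a random forest according to the abstract.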

25 pages, 10389 KiB

Open Access | Article

Towards a Hybrid Security Framework for Phishing Awareness Education and Defense

by Peter K. K. Loh, Aloysius Z. Y. Lee and Vivek Balachandran

Future Internet 2024, 16(3), 86; https://doi.org/10.3390/fi16030086 - 1 Mar 2024

Cited by 4 | Viewed by 3980

The rise in generative Artificial Intelligence (AI) has led to the development of more sophisticated phishing email attacks, as well as an increase in research on using AI to aid the detection of these advanced attacks. Successful phishing email attacks severely impact businesses, as employees are usually the vulnerable targets. Defense against such attacks, therefore, requires realizing defense along both technological and human vectors. Security hardening research along the technological vector is scarce and focuses mainly on the use of machine learning and natural language processing to distinguish between machine- and human-generated text. Common existing approaches to harden security along the human vector consist of third-party organized training programmes, the content of which needs to be updated over time. There is, to date, no reported approach that provides both phishing attack detection and progressive end-user training. In this paper, we present our contribution, which includes the design and development of an integrated approach that employs AI-assisted and generative AI platforms for phishing attack detection and continuous end-user education in a hybrid security framework. This framework supports scenario-customizable and evolving user education in dealing with increasingly advanced phishing email attacks. The technological design and functional details for both platforms are presented and discussed. Performance tests showed that the phishing attack detection sub-system using the Convolutional Neural Network (CNN) deep learning model architecture achieved the best overall results: above 94% accuracy, above 95% precision, and above 94% recall. Full article

(This article belongs to the Special Issue Information and Future Internet Security, Trust and Privacy II)

27 pages, 3161 KiB

Open Access | Article

A Hybrid Approach for Alluring Ads Phishing Attack Detection Using Machine Learning

by Muhammad Waqas Shaukat, Rashid Amin, Muhana Magboul Ali Muslam, Asma Hassan Alshehri and Jiang Xie

Sensors 2023, 23(19), 8070; https://doi.org/10.3390/s23198070 - 25 Sep 2023

Cited by 27 | Viewed by 5297

Phishing attacks are evolving with more sophisticated techniques, posing significant threats. Considering the potential of machine-learning-based approaches, our research presents a similar modern approach for web phishing detection by applying powerful machine learning algorithms. An efficient layered classification model is proposed to detect websites based on their URL structure, text, and image features. Previously, similar studies have used machine learning techniques for URL features with a limited dataset. In our research, we have used a large dataset of 20,000 website URLs, and 22 salient features from each URL are extracted to prepare a comprehensive dataset. Along with this, another dataset containing website text is also prepared for NLP-based text evaluation. It is seen that many phishing websites contain text as images, and to handle this, the text from images is extracted to classify it as spam or legitimate. The experimental evaluation demonstrated efficient and accurate phishing detection. Our layered classification model uses support vector machine (SVM), XGBoost, random forest, multilayer perceptron, linear regression, decision tree, naïve Bayes, and SVC algorithms. The performance evaluation revealed that the XGBoost algorithm outperformed other applied models with maximum accuracy and precision of 94% in the training phase and 91% in the testing phase. Multilayer perceptron also worked well with an accuracy of 91% in the testing phase. The accuracy results for random forest and decision tree were 91% and 90%, respectively. Logistic regression and SVM algorithms were used in the text-based classification, and the accuracy was found to be 87% and 88%, respectively. With these precision values, the models classified phishing and legitimate websites very well, based on URL, text, and image features. This research contributes to early detection of sophisticated phishing attacks, enhancing internet user security. Full article

(This article belongs to the Special Issue Security and Privacy in Cloud Computing Environment)
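
Extracting structural features from a URL, as this abstract describes, might look like the sketch below. The abstract does not list the 22 features used, so the six shown here are assumptions chosen as common illustrative examples, and `url_features` is a hypothetical helper name.

```python
import re
from typing import Dict, Union
from urllib.parse import urlparse


def url_features(url: str) -> Dict[str, Union[int, bool]]:
    """Compute a handful of simple structural features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # Raw IPv4 hosts are a frequently cited warning sign.
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
    }


ip_url = url_features("http://192.168.0.1/login")
plain_url = url_features("https://example.com/")
at_url = url_features("http://user@evil-site.example.com/")
```

Each URL becomes one row of numeric or Boolean values, which is the form a classifier such as XGBoost or a random forest expects as input.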

17 pages, 5385 KiB

Open Access | Article

Cyberattack Detection in Social Network Messages Based on Convolutional Neural Networks and NLP Techniques

by Jorge E. Coyac-Torres, Grigori Sidorov, Eleazar Aguirre-Anaya and Gerardo Hernández-Oregón

Mach. Learn. Knowl. Extr. 2023, 5(3), 1132-1148; https://doi.org/10.3390/make5030058 - 1 Sep 2023

Cited by 8 | Viewed by 3759

Social networks have captured the attention of many people worldwide. However, these services have also attracted a considerable number of malicious users whose aim is to compromise the digital assets of other users by using messages as an attack vector to execute different types of cyberattacks against them. This work presents an approach based on natural language processing tools and a convolutional neural network architecture to detect and classify four types of cyberattacks in social network messages, including malware, phishing, spam, and even one whose aim is to deceive a user into spreading malicious messages to other users, which, in this work, is identified as a bot attack. One notable feature of this work is that it analyzes textual content without depending on any characteristics from a specific social network, making its analysis independent of particular data sources. Finally, this work was tested on real data, demonstrating its results in two stages. The first stage detected the existence of any of the four types of cyberattacks within the message, achieving an accuracy value of 0.91. After detecting a message as a cyberattack, the next stage was to classify it as one of the four types of cyberattack, achieving an accuracy value of 0.82. Full article

(This article belongs to the Section Privacy)

25 pages, 692 KiB

Open Access | Article

Attention-Based 1D CNN-BiLSTM Hybrid Model Enhanced with FastText Word Embedding for Korean Voice Phishing Detection

by Milandu Keith Moussavou Boussougou and Dong-Joo Park

Mathematics 2023, 11(14), 3217; https://doi.org/10.3390/math11143217 - 21 Jul 2023

Cited by 21 | Viewed by 6447

In the increasingly complex domain of Korean voice phishing attacks, advanced detection techniques are paramount. Traditional methods have achieved some degree of success. However, they often fail to detect sophisticated voice phishing attacks, highlighting an urgent need for enhanced approaches to improve detection performance. Addressing this, we have designed and implemented a novel artificial neural network (ANN) architecture that successfully combines data-centric and model-centric AI methodologies for detecting Korean voice phishing attacks. This paper presents our unique hybrid architecture, consisting of a 1-dimensional Convolutional Neural Network (1D CNN), a Bidirectional Long Short-Term Memory (BiLSTM), and Hierarchical Attention Networks (HANs). Our evaluations using the real-world KorCCVi v2 dataset demonstrate that the proposed architecture effectively leverages the strengths of CNN and BiLSTM to extract and learn contextually rich features from word embedding vectors. Additionally, implementing word and sentence attention mechanisms from HANs enhances the model’s focus on crucial features, considerably improving detection performance. Achieving an accuracy score of 99.32% and an F1 score of 99.31%, our model surpasses all baseline models we trained, outperforms several existing solutions, and maintains comparable performance to others. The findings of this study underscore the potential of hybrid neural network architectures in improving voice phishing detection in the Korean language and pave the way for future research. This could involve refining and expanding upon this model to tackle increasingly sophisticated voice phishing strategies effectively or utilizing larger datasets. Full article

(This article belongs to the Special Issue Advances in Mathematical Methods, Machine Learning and Deep Learning Based Applications, 2nd Edition)
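
The word and sentence attention mechanisms mentioned in this abstract weight each hidden vector by its relevance before pooling. The sketch below shows a simplified form of that idea: it scores vectors by a plain dot product with a context vector, whereas Hierarchical Attention Networks apply a learned projection first. All values and names are illustrative assumptions.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors: List[List[float]],
                   context: List[float]) -> Tuple[List[float], List[float]]:
    """Weight each vector by softmax(dot(vector, context)) and sum them."""
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# Two toy "word" vectors; the context vector favors the first one.
pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], context=[2.0, 0.0])
```

The same pooling step can be applied twice, once over the words of a sentence and once over the resulting sentence vectors, which is the hierarchical part of the design.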

47 pages, 3831 KiB

Open Access | Review

A Systematic Literature Review and a Conceptual Framework Proposition for Advanced Persistent Threats (APT) Detection for Mobile Devices Using Artificial Intelligence Techniques

by Amjed Ahmed Al-Kadhimi, Manmeet Mahinderjit Singh and Mohd Nor Akmal Khalid

Appl. Sci. 2023, 13(14), 8056; https://doi.org/10.3390/app13148056 - 10 Jul 2023

Cited by 11 | Viewed by 7718

Advanced persistent threat (APT) refers to a specific form of targeted attack used by a well-organized and skilled adversary to remain undetected while systematically and continuously exfiltrating sensitive data. Various APT attack vectors exist, including social engineering techniques such as spear phishing, watering holes, SQL injection, and application repackaging. Various sensors and services are essential for a smartphone to assist in user behavior that involves sensitive information. As a result, smartphones have become the main target of APT attacks. Due to the vulnerability of smartphone sensors, several challenges have emerged, including the inadequacy of current methods for detecting APTs. Nevertheless, several existing APT solutions, strategies, and implementations have failed to provide comprehensive solutions. Detecting APT attacks remains challenging due to the lack of attention given to human behavioral factors contributing to APTs, the ambiguity of APT attack trails, and the absence of a clear attack fingerprint. In addition, there is a lack of studies using game theory or fuzzy logic as an artificial intelligence (AI) strategy for detecting APT attacks on smartphone sensors, besides the limited understanding of the attack that may be employed due to the complex nature of APT attacks. Accordingly, this study aimed to deliver a systematic review to report on the extant research concerning APT detection for mobile sensors, applications, and user behavior. The study presents an overview of works performed between 2012 and 2023. In total, 1351 papers were reviewed during the primary search. Subsequently, these papers were processed according to their titles, abstracts, and contents. The resulting papers were selected to address the research questions. A conceptual framework is proposed to incorporate the situational awareness model in line with adopting game theory as an AI technique used to generate APT-based tactics, techniques, and procedures (TTPs) and normal TTPs and cognitive decision making. This framework enhances security awareness and facilitates the detection of APT attacks on smartphone sensors, applications, and user behavior. It supports researchers in exploring the most significant papers on APTs related to mobile sensors, services, applications, and detection techniques using AI. Full article

(This article belongs to the Special Issue New Trends in Network and Information Security)

14 pages, 2901 KiB

Open Access | Article

Hybrid Phishing Detection Based on Automated Feature Selection Using the Chaotic Dragonfly Algorithm

by Gharbi Alshammari, Majdah Alshammari, Tariq S. Almurayziq, Abdullah Alshammari and Mohammad Alsaffar

Electronics 2023, 12(13), 2823; https://doi.org/10.3390/electronics12132823 - 26 Jun 2023

Cited by 5 | Viewed by 2669

Due to the increased frequency of phishing attacks, network security has gained the attention of researchers. In addition to this, large volumes of data are created every day, and these data include inappropriate and unrelated features that influence the accuracy of machine learning. There is therefore a need for a robust method of detecting phishing threats and improving detection accuracy. In this study, three classifiers were applied to improve the accuracy of a detection algorithm: decision tree, k-nearest neighbors (KNN), and support vector machine (SVM). Selecting the relevant features improves the detection accuracy for a target class and determines the class label with the greatest probability. The proposed work clearly describes how feature selection using the Chaotic Dragonfly Algorithm provides more accurate results than all other baseline classifiers. It also indicates the appropriate classifier to be applied when detecting phishing websites. Three publicly available datasets were used to evaluate the method. They are reliable datasets for training the model and measuring prediction accuracy. Full article

(This article belongs to the Section Artificial Intelligence)

17 pages, 4295 KiB

Open Access | Article

A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts

by Yanbin Wang, Wenrui Ma, Haitao Xu, Yiwei Liu and Peng Yin

Appl. Sci. 2023, 13(13), 7429; https://doi.org/10.3390/app13137429 - 22 Jun 2023

Cited by 15 | Viewed by 4406

Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches are not always effective due to the following reasons: (1) highly concealed phishing websites may employ tactics such as masquerading URL addresses to deceive machine learning models, and (2) phishing attackers frequently change their phishing website URLs to evade detection. In this study, we propose a robust, multi-view Transformer model with an expert-mixture mechanism for accurate phishing website detection utilizing website URLs, attributes, content, and behavioral information. Specifically, we first adapted a pretrained language model for URL representation learning by applying adversarial post-training learning in order to extract semantic information from URLs. Next, we captured the attribute, content, and behavioral features of the websites and encoded them as vectors, which, alongside the URL embeddings, constitute the website’s multi-view information. Subsequently, we introduced a mixture-of-experts mechanism into the Transformer network to learn knowledge from different views and adaptively fuse information from various views. The proposed method outperforms state-of-the-art approaches in evaluations of real phishing websites, demonstrating greater performance with less label dependency. Furthermore, we show the superior robustness and enhanced adaptability of the proposed method to unseen samples and data drift in more challenging experimental settings. Full article
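
The mixture-of-experts fusion described in this abstract can be illustrated with a small sketch: a gate produces one score per expert, the scores are normalized with softmax, and the expert outputs are combined with those weights. The constant experts and the uniform gate below are hypothetical stand-ins for learned networks, not the paper's architecture.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x: Vector,
                       experts: List[Callable[[Vector], Vector]],
                       gate: Callable[[Vector], Vector]) -> Tuple[Vector, Vector]:
    """Fuse expert outputs using softmax-normalized gate scores."""
    weights = softmax(gate(x))
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    fused = [sum(w * out[d] for w, out in zip(weights, outputs)) for d in range(dim)]
    return fused, weights


# Hypothetical experts, e.g. one per view (URL, content), returning fixed outputs.
experts = [lambda x: [1.0], lambda x: [3.0]]
uniform_gate = lambda x: [0.0, 0.0]      # equal scores give equal weights
fused, weights = mixture_of_experts([0.5, 0.5], experts, uniform_gate)
```

In a trained model the gate depends on the input, so different websites can lean on different views, which is the adaptive fusion the abstract refers to.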

23 pages, 2507 KiB

Open Access | Article

A Phishing-Attack-Detection Model Using Natural Language Processing and Deep Learning

by Eduardo Benavides-Astudillo, Walter Fuertes, Sandra Sanchez-Gordon, Daniel Nuñez-Agurto and Germán Rodríguez-Galán

Appl. Sci. 2023, 13(9), 5275; https://doi.org/10.3390/app13095275 - 23 Apr 2023

Cited by 31 | Viewed by 9788

Phishing is a type of cyber-attack that aims to deceive users, usually using fraudulent web pages that appear legitimate. Currently, one of the most common ways to detect these phishing pages according to their content is by entering words non-sequentially into Deep Learning (DL) algorithms, i.e., regardless of the order in which they appear in the text. However, this approach causes the intrinsic richness of the relationship between words to be lost. In the field of cyber-security, the innovation of this study is to propose a model that detects phishing attacks based on the text of suspicious web pages and not on URL addresses, using Natural Language Processing (NLP) and DL algorithms. We used the Keras Embedding Layer with Global Vectors for Word Representation (GloVe) to exploit the web page content’s semantic and syntactic features. We first performed an analysis using NLP and Word Embedding, and then these data were introduced into a DL algorithm. In addition, to assess which DL algorithm works best, we evaluated four alternative algorithms: Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), and Bidirectional GRU (BiGRU). As a result, it can be concluded that the proposed model is promising because the mean accuracy achieved by each of the four DL algorithms was at least 96.7%, while the best performer was BiGRU with 97.39%. Full article

(This article belongs to the Section Computing and Artificial Intelligence)
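
The difference between non-sequential and sequential word input, which this abstract builds on, can be shown with two toy sentences. The sentences, the vocabulary scheme, and the helper names are assumptions made for illustration; the paper itself uses a Keras embedding layer with GloVe vectors, which is not reproduced here.

```python
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Order-free representation: word counts only."""
    return Counter(text.lower().split())


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign integer ids to words in order of first appearance (ids start at 1)."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def token_sequence(text: str, vocab: Dict[str, int]) -> List[int]:
    """Order-preserving representation: one id per word, in reading order."""
    return [vocab[word] for word in text.lower().split()]


a = "the bank will not ask you"
b = "you will not ask the bank"
vocab = build_vocab([a, b])
same_bag = bag_of_words(a) == bag_of_words(b)
seq_a = token_sequence(a, vocab)
seq_b = token_sequence(b, vocab)
```

The two sentences have identical word counts but different id sequences, so only the sequential form lets a recurrent model such as an LSTM or GRU distinguish them.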

16 pages, 1992 KiB

Open Access | Article

Detecting Phishing Domains Using Machine Learning

by Shouq Alnemari and Majid Alshammari

Appl. Sci. 2023, 13(8), 4649; https://doi.org/10.3390/app13084649 - 7 Apr 2023

Cited by 65 | Viewed by 24030

Phishing is an online threat where an attacker impersonates an authentic and trustworthy organization to obtain sensitive information from a victim. One such example is trolling, which has long been considered a problem. However, recent advances in phishing detection, such as machine learning-based methods, have assisted in combatting these attacks. Therefore, this paper develops and compares four models for investigating the efficiency of using machine learning to detect phishing domains. It also compares the most accurate model of the four with existing solutions in the literature. These models were developed using artificial neural networks (ANNs), support vector machines (SVMs), decision trees (DTs), and random forest (RF) techniques. Moreover, the UCI phishing domains dataset of uniform resource locators (URLs) is used as a benchmark to evaluate the models. Our findings show that the model based on the random forest technique is the most accurate of the four and outperforms other solutions in the literature. Full article

Search Results (22)