MDPI - Publisher of Open Access Journals

34 pages, 2208 KB

Open Access Article

Small Language Models for Phishing Website Detection: Cost, Performance, and Privacy Trade-Offs

by Georg Goldenits, Philip König, Sebastian Raubitzek and Andreas Ekelhart

J. Cybersecur. Priv. 2026, 6(2), 48; https://doi.org/10.3390/jcp6020048 - 5 Mar 2026

Viewed by 827

Abstract

Phishing websites pose a major cybersecurity threat, exploiting unsuspecting users and causing significant financial and organisational harm. Traditional machine learning approaches for phishing detection often require extensive feature engineering, continuous retraining, and costly infrastructure maintenance. At the same time, proprietary large language models (LLMs) have demonstrated strong performance in phishing-related classification tasks, but their operational costs and reliance on external providers limit their practical adoption in many business environments. This paper presents a detection pipeline for malicious websites and investigates the feasibility of Small Language Models (SLMs) using raw HTML code and URLs. A key advantage of these models is that they can be deployed on local infrastructure, providing organisations with greater control over data and operations. We systematically evaluate 15 commonly used SLMs, ranging from 1 billion to 70 billion parameters, benchmarking their classification accuracy, computational requirements, and cost-efficiency. Our results highlight the trade-offs between detection performance and resource consumption. While SLMs underperform compared to state-of-the-art proprietary LLMs, the gap is moderate: the best SLM achieves an F1-score of 0.893 (Llama3.3:70B), compared to 0.929 for GPT-5.2, indicating that open-source models can provide a viable and scalable alternative to external LLM services. Full article

(This article belongs to the Section Privacy)

30 pages, 3060 KB

Open Access Article

LLM-Based Multimodal Feature Extraction and Hierarchical Fusion for Phishing Email Detection

by Xinyang Yuan, Jiarong Wang, Tian Yan and Fazhi Qi

Electronics 2026, 15(2), 368; https://doi.org/10.3390/electronics15020368 - 14 Jan 2026

Viewed by 626

Abstract

Phishing emails continue to evade conventional detection systems due to their increasingly sophisticated, multi-faceted social engineering tactics. To address the limitations of single-modality or rule-based approaches, we propose SAHF-PD, a novel phishing detection framework that integrates multi-modal feature extraction with semantic-aware hierarchical fusion, based on large language models (LLMs). Our method leverages modality-specialized large models, each guided by domain-specific prompts and constrained to a standardized output schema, to extract structured feature representations from four complementary sources associated with each phishing email: email body text; open-source intelligence (OSINT) derived from the key embedded URL; screenshot of the landing page; and the corresponding HTML/JavaScript source code. This design mitigates the unstructured and stochastic nature of raw generative outputs, yielding consistent, interpretable, and machine-readable features. These features are then integrated through our Semantic-Aware Hierarchical Fusion (SAHF) mechanism, which organizes them into core, auxiliary, and weakly associated layers according to their semantic relevance to phishing intent. This layered architecture enables dynamic weighting and redundancy reduction based on semantic relevance, which in turn highlights the most discriminative signals across modalities and enhances model interpretability. We also introduce PhishMMF, a publicly released multimodal feature dataset for phishing detection, comprising 11,672 human-verified samples with meticulously extracted structured features from all four modalities. Experiments with eight diverse classifiers demonstrate that the SAHF-PD framework enables exceptional performance. For instance, XGBoost equipped with SAHF attains an AUC of 0.99927 and an F1-score of 0.98728, outperforming the same model using the original feature representation. 
Moreover, SAHF compresses the original 228-dimensional feature space into a compact 56-dimensional representation (a 75.4% reduction), reducing the average training time across all eight classifiers by 43.7% while maintaining comparable detection accuracy. Ablation studies confirm the unique contribution of each modality. Our work establishes a transparent, efficient, and high-performance foundation for next-generation anti-phishing systems. Full article

(This article belongs to the Section Artificial Intelligence)
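
The compression figure quoted in the abstract above can be checked with simple arithmetic. The following is a minimal sketch that recomputes the relative reduction from the two dimensionalities stated in the abstract (228 and 56); the function name `relative_reduction` is illustrative and does not come from the paper.

```python
def relative_reduction(original_dim: int, reduced_dim: int) -> float:
    """Return the relative size reduction as a percentage of the original."""
    if original_dim <= 0:
        raise ValueError("original_dim must be positive")
    return 100.0 * (original_dim - reduced_dim) / original_dim


# Dimensionalities quoted in the abstract: 228 features compressed to 56.
reduction = relative_reduction(228, 56)
print(f"{reduction:.1f}% reduction")  # prints "75.4% reduction"
```

The result agrees with the 75.4% reduction reported in the abstract.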

26 pages, 1346 KB

Open Access Article

Benchmarking 21 Open-Source Large Language Models for Phishing Link Detection with Prompt Engineering

by Arbi Haza Nasution, Winda Monika, Aytug Onan and Yohei Murakami

Information 2025, 16(5), 366; https://doi.org/10.3390/info16050366 - 29 Apr 2025

Cited by 5 | Viewed by 5887

Abstract

Phishing URL detection is critical due to the severe cybersecurity threats posed by phishing attacks. While traditional methods rely heavily on handcrafted features and supervised machine learning, recent advances in large language models (LLMs) provide promising alternatives. This paper presents a comprehensive benchmarking study of 21 state-of-the-art open-source LLMs—including Llama3, Gemma, Qwen, Phi, DeepSeek, and Mistral—for phishing URL detection. We evaluate four key prompt engineering techniques—zero-shot, role-playing, chain-of-thought, and few-shot prompting—using a balanced, publicly available phishing URL dataset, with no fine-tuning or additional training of the models conducted, reinforcing the zero-shot, prompt-based nature as a distinctive aspect of our study. The results demonstrate that large open-source LLMs (≥27B parameters) achieve performance exceeding 90% F1-score without fine-tuning, closely matching proprietary models. Among the prompt strategies, few-shot prompting consistently delivers the highest accuracy (91.24% F1 with Llama3.3_70b), whereas chain-of-thought significantly lowers accuracy and increases inference time. Additionally, our analysis highlights smaller models (7B–27B parameters) offering strong performance with substantially reduced computational costs. This study underscores the practical potential of open-source LLMs for phishing detection and provides insights for effective prompt engineering in cybersecurity applications. Full article
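
To make the few-shot strategy named in the abstract above more concrete, here is a rough sketch of how such a prompt might be assembled for a single URL. Everything in it is an assumption made for illustration: the example URLs, the label words, the prompt wording, and the helper names `build_few_shot_prompt` and `parse_label` are hypothetical and are not taken from the paper. No model is called; the sketch only builds the prompt text and normalises a reply.

```python
from typing import Sequence, Tuple

# Hypothetical labelled examples for illustration only; not from the paper's dataset.
FEW_SHOT_EXAMPLES: Tuple[Tuple[str, str], ...] = (
    ("https://www.wikipedia.org/", "legitimate"),
    ("http://secure-login.example-bank.com.verify-account.example/", "phishing"),
)


def build_few_shot_prompt(
    url: str, examples: Sequence[Tuple[str, str]] = FEW_SHOT_EXAMPLES
) -> str:
    """Assemble a few-shot classification prompt for a single URL."""
    lines = [
        "Classify each URL as 'phishing' or 'legitimate'.",
        "Answer with a single word.",
        "",
    ]
    for example_url, label in examples:
        lines.append(f"URL: {example_url}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The final entry is left unlabelled for the model to complete.
    lines.append(f"URL: {url}")
    lines.append("Label:")
    return "\n".join(lines)


def parse_label(model_output: str) -> str:
    """Map free-form model output to 'phishing', 'legitimate', or 'unknown'."""
    text = model_output.strip().lower()
    for label in ("phishing", "legitimate"):
        if text.startswith(label):
            return label
    return "unknown"


print(build_few_shot_prompt("http://example.com/login"))
```

Returning `unknown` for unparseable replies, rather than forcing one of the two classes, keeps malformed model output from silently inflating either class in an evaluation.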

26 pages, 5241 KB

Open Access Article

Development of GUI-Driven AI Deep Learning Platform for Predicting Warpage Behavior of Fan-Out Wafer-Level Packaging

by Ching-Feng Yu, Jr-Wei Peng, Chih-Cheng Hsiao, Chin-Hung Wang and Wei-Chung Lo

Micromachines 2025, 16(3), 342; https://doi.org/10.3390/mi16030342 - 17 Mar 2025

Cited by 9 | Viewed by 3285

Abstract

This study presents an artificial intelligence (AI) prediction platform driven by deep learning technologies, designed specifically to address the challenges associated with predicting warpage behavior in fan-out wafer-level packaging (FOWLP). Traditional electronic engineers often face difficulties in implementing AI-driven models due to the specialized programming and algorithmic expertise required. To overcome this, the platform incorporates a graphical user interface (GUI) that simplifies the design, training, and operation of deep learning models. It enables users to configure and run AI predictions without needing extensive coding knowledge, thereby enhancing accessibility for non-expert users. The platform efficiently processes large datasets, automating feature extraction, data cleansing, and model training, ensuring accurate and reliable predictions. The effectiveness of the AI platform is demonstrated through case studies involving FOWLP architectures, highlighting its ability to provide quick and precise warpage predictions. Additionally, the platform is available in both uniform resource locator (URL)-based and standalone versions, offering flexibility in usage. This innovation significantly improves design efficiency, enabling engineers to optimize electronic packaging designs, reduce errors, and enhance the overall system performance. The study concludes by showcasing the structure and functionality of the GUI platform, positioning it as a valuable tool for fostering further advancements in electronic packaging. Full article

(This article belongs to the Special Issue Applications of Data Sciences in Semiconductor Industry: Design, Manufacturing, Packaging and Testing)

42 pages, 1293 KB

Open Access Article

Enhancing Online Security: A Novel Machine Learning Framework for Robust Detection of Known and Unknown Malicious URLs

by Shiyun Li and Omar Dib

J. Theor. Appl. Electron. Commer. Res. 2024, 19(4), 2919-2960; https://doi.org/10.3390/jtaer19040141 - 26 Oct 2024

Cited by 5 | Viewed by 4095

Abstract

The rapid expansion of the internet has led to a corresponding surge in malicious online activities, posing significant threats to users and organizations. Cybercriminals exploit malicious uniform resource locators (URLs) to disseminate harmful content, execute phishing schemes, and orchestrate various cyber attacks. As these threats evolve, detecting malicious URLs (MURLs) has become crucial for safeguarding internet users and ensuring a secure online environment. In response to this urgent need, we propose a novel machine learning-driven framework designed to identify known and unknown MURLs effectively. Our approach leverages a comprehensive dataset encompassing various labels—including benign, phishing, defacement, and malware—to engineer a robust set of features validated through extensive statistical analyses. The resulting malicious URL detection system (MUDS) combines supervised machine learning techniques, tree-based algorithms, and advanced data preprocessing, achieving a high detection accuracy of 96.83% for known MURLs. For unknown MURLs, the proposed framework utilizes CL_K-means, a modified k-means clustering algorithm, alongside two additional biased classifiers, achieving 92.54% accuracy on simulated zero-day datasets. With an average processing time of under 14 milliseconds per instance, MUDS is optimized for real-time integration into network endpoint systems. These outcomes highlight the efficacy and efficiency of the proposed MUDS in fortifying online security by identifying and mitigating MURLs, thereby reinforcing the digital landscape against cyber threats. Full article

26 pages, 3846 KB

Open Access Article

Analysis of the Performance Impact of Fine-Tuned Machine Learning Model for Phishing URL Detection

by Saleem Raja Abdul Samad, Sundarvadivazhagan Balasubaramanian, Amna Salim Al-Kaabi, Bhisham Sharma, Subrata Chowdhury, Abolfazl Mehbodniya, Julian L. Webber and Ali Bostani

Electronics 2023, 12(7), 1642; https://doi.org/10.3390/electronics12071642 - 30 Mar 2023

Cited by 118 | Viewed by 6470

Abstract

Phishing leverages people’s tendency to share personal information online. Phishing attacks often begin with an email and can be used for a variety of purposes. The cybercriminal employs social engineering techniques to get the target to click on the link in the phishing email, which takes them to the infected website. These attacks become more complex as hackers personalize their fraud and provide convincing messages. Phishing with a malicious URL is an advanced kind of cybercrime, and spotting phishing URLs can be challenging even for cautious users. Researchers have proposed different techniques to address this challenge. Machine learning models improve detection by using URLs, web page content and external features. This article presents the findings of an experimental study that attempted to enhance the performance of machine learning models to obtain improved accuracy on the two most commonly used phishing datasets. Three distinct types of tuning factors are utilized: data balancing, hyperparameter optimization and feature selection. The experiment utilizes the eight most prevalent machine learning methods and two distinct datasets obtained from online sources, such as the UCI repository and the Mendeley repository. The results demonstrate that data balancing improves accuracy marginally, whereas hyperparameter optimization and feature selection improve accuracy significantly. The performance of machine learning algorithms is improved by combining all tuning factors, outperforming existing research works. For Dataset-1, Random Forest (RF) and Gradient Boosting (XGB) achieve accuracy rates of 97.44% and 97.47%, respectively. Gradient Boosting (GB) and Extreme Gradient Boosting (XGB) achieve accuracy values of 98.27% and 98.21%, respectively, for Dataset-2. Full article

(This article belongs to the Section Computer Science & Engineering)

24 pages, 5325 KB

Open Access Article

An Effective Phishing Detection Model Based on Character Level Convolutional Neural Network from URL

by Ali Aljofey, Qingshan Jiang, Qiang Qu, Mingqing Huang and Jean-Pierre Niyigena

Electronics 2020, 9(9), 1514; https://doi.org/10.3390/electronics9091514 - 15 Sep 2020

Cited by 166 | Viewed by 13567

Abstract

Phishing is the easiest way to commit cybercrime, with the aim of enticing people to give up accurate information such as account IDs, bank details, and passwords. This type of cyberattack is usually triggered by emails, instant messages, or phone calls. Existing anti-phishing techniques are mainly based on source code features, which require scraping the content of web pages, and on third-party services, which slow down the classification of phishing URLs. Although machine learning techniques have lately been used to detect phishing, they require substantial manual feature engineering and are not adept at detecting emerging phishing attacks. Owing to the recent rapid development of deep learning techniques, many deep learning-based methods have also been introduced to enhance classification performance. In this paper, a fast deep learning-based model, which uses a character-level convolutional neural network (CNN) for phishing detection based on the URL of the website, is proposed. The proposed model does not require the retrieval of target website content or the use of any third-party services. It captures information and sequential patterns of URL strings without requiring prior knowledge about phishing, and then uses the sequential pattern features for fast classification of the actual URL. For evaluation, comparisons are provided between different traditional machine learning models and deep learning models using various feature sets such as hand-crafted, character embedding, character-level TF-IDF, and character-level count vector features. According to the experiments, the proposed model achieved an accuracy of 95.02% on our dataset and accuracies of 98.58%, 95.46%, and 95.22% on benchmark datasets, outperforming existing phishing URL models. Full article

(This article belongs to the Special Issue Data Security)
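
A character-level model of the kind described in the abstract above needs each URL turned into a fixed-length sequence of integer indices before any convolution is applied. The sketch below shows one plausible way to do that input encoding. The vocabulary, the padding and unknown indices, the maximum length of 200, and the name `encode_url` are all assumptions chosen for illustration; the abstract does not state the paper's actual preprocessing, and no network is built here.

```python
import string
from typing import List

# Assumed vocabulary: printable ASCII without whitespace.
# Index 0 is reserved for padding, index 1 for characters outside the vocabulary.
ALPHABET = string.ascii_letters + string.digits + string.punctuation
PAD_INDEX = 0
UNKNOWN_INDEX = 1
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_url(url: str, max_length: int = 200) -> List[int]:
    """Convert a URL to a fixed-length list of character indices."""
    # Truncate long URLs, then map each character to its vocabulary index.
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN_INDEX) for ch in url[:max_length]]
    # Right-pad short URLs so every encoded URL has the same length.
    indices.extend([PAD_INDEX] * (max_length - len(indices)))
    return indices


encoded = encode_url("http://example.com/login", max_length=32)
print(len(encoded))  # prints 32
```

Sequences produced this way would typically be fed to an embedding layer followed by one-dimensional convolutions, which is consistent with the abstract's point that no page content or third-party lookup is needed.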

Search Results (7)
