Evaluation of AI Models for Phishing Detection Using Open Datasets †
Abstract
1. Introduction
2. Literature Review
Previous Research
- Amani Alswailem, Bashayr Alabdullah, Norah Alrumayh, and Aram Alsedrani [4], in "Detecting Phishing Websites Using Filter Techniques on Machine Learning Models", reported accuracies of 60.4% for Naïve Bayes, 94.4% for Decision Tree, and 96.3% for Random Forest. Of the three methods compared, Random Forest was therefore the most effective at detecting phishing websites.
- Sowmya Jagadeesan, Sameer, Devender Singh, Ritika Ojha, Read Khalid Ibrahim, and Malik Bader Alazzam [5], in "Implementation of Artificial Intelligence-Based Cyber Security System to Overcome Phishing Attacks", built a system that uses machine learning algorithms such as Support Vector Machine (SVM), Random Forest, and Neural Networks to detect phishing emails and websites. The data consists of phishing emails and URLs collected from various sources. Preprocessing involves cleaning, feature extraction, and normalization before the model is trained to recognize phishing patterns. Performance was evaluated using precision, recall, and F1-score. The system achieved 97% accuracy and an F1-score of around 96%, indicating a high capability for detecting phishing attacks. The authors present it as a proactive solution that reduces false positives and false negatives, thereby increasing data and information security.
- Shraddha Parekh, Dhwanil Parikh, Srushti Kotak, and Smita Sankhe [6], in "Detection of Phishing Websites From URL Analysis Using the Random Forest Algorithm", used the Random Forest algorithm to detect phishing websites and added detection features that are integrated into websites that discuss phishing. Random Forest was chosen for its ability to process a large number of detection features. Using 30 detection features, the system achieved a prediction rate of 96%, recall of 92%, accuracy of 94%, and F1-score of 93%. These results indicate that the proposed method detects phishing attacks effectively, making it a useful tool for protecting users from cyber threats.
3. Research Methodology
A. Dataset Explanation
The dataset used in this study was obtained from open sources and contains samples of phishing and non-phishing URLs. It includes features that reflect the characteristics of each URL, such as URL length, use of special symbols, presence of suspicious keywords, and SSL certificate information. It also includes metadata, such as domain creation time and hosting information, that can contribute to phishing detection. This data is used to train and test the machine learning models.
B. Data Preprocessing
- Data Cleansing: Remove duplicate records, handle missing values, and ensure that all relevant features are available.
- Feature Normalization: Convert the features to a uniform scale, which improves model performance, especially for scale-sensitive algorithms such as SVM.
- Categorical Encoding: Convert categorical features into numeric format, using methods such as one-hot encoding or label encoding, so that they can be processed by machine learning models.
- Dataset Splitting: Divide the dataset into training data and test data with a fixed ratio (e.g., 80:20) so that the models are evaluated on data they did not see during training.
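The preprocessing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the column names (`url_length`, `tld`, `label`), the helper names `preprocess` and `split_dataset`, the choice of min-max scaling and one-hot encoding, and the random seed are all hypothetical.

```python
import random

def preprocess(rows, numeric_keys, categorical_keys, label_key):
    """Clean, scale, and encode a list of dict records (illustrative only)."""
    keys = numeric_keys + categorical_keys + [label_key]
    # 1. Cleansing: drop rows with missing values, then exact duplicates.
    seen, clean = set(), []
    for row in rows:
        if any(row.get(k) is None for k in keys):
            continue
        fingerprint = tuple(row[k] for k in keys)
        if fingerprint not in seen:
            seen.add(fingerprint)
            clean.append(row)
    # 2. Normalization: min-max bounds for each numeric feature.
    bounds = {k: (min(r[k] for r in clean), max(r[k] for r in clean))
              for k in numeric_keys}
    # 3. Encoding: the sorted category list defines the one-hot columns.
    categories = {k: sorted({r[k] for r in clean}) for k in categorical_keys}
    X, y = [], []
    for r in clean:
        features = []
        for k in numeric_keys:
            lo, hi = bounds[k]
            features.append((r[k] - lo) / (hi - lo) if hi > lo else 0.0)
        for k in categorical_keys:
            features.extend(1.0 if r[k] == c else 0.0 for c in categories[k])
        X.append(features)
        y.append(r[label_key])
    return X, y

def split_dataset(X, y, test_ratio=0.2, seed=42):
    """Shuffle indices and split into training and test sets (80:20 by default)."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(indices) * test_ratio))
    test, train = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

# Hypothetical records: ten unique rows, one duplicate, one row with a missing value.
rows = [{"url_length": 20 + 5 * i,
         "tld": "com" if i % 2 == 0 else "xyz",
         "label": 1 if i % 2 else -1} for i in range(10)]
rows.append(dict(rows[0]))
rows.append({"url_length": None, "tld": "com", "label": 1})

X, y = preprocess(rows, ["url_length"], ["tld"], "label")
X_train, y_train, X_test, y_test = split_dataset(X, y)
```

The sketch follows the order in which the steps are listed. In practice, computing the scaling bounds from the training split only avoids leaking information from the test data into training.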
C. AI Models Used
Three machine learning models are used in this study:
- Decision Tree: A tree-based model that partitions the data according to its most significant features. Its advantages are high interpretability and fast execution.
- Random Forest: An ensemble model that combines multiple decision trees to improve stability and accuracy, and that is more resistant to overfitting than a single tree.
- Support Vector Machine (SVM): A model that searches for the optimal hyperplane separating the phishing and non-phishing classes, and that performs well on high-dimensional data in particular.
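To make "partitions the data according to its most significant features" concrete, the following sketch shows how a single decision-tree split can be chosen by minimizing Gini impurity. The source does not state which split criterion was used, so Gini is an assumption here, and the toy features (`url_length`, `has_at_symbol`) and helper names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best 'x <= threshold' split."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best

# Hypothetical toy data: [url_length, has_at_symbol]; labels use the -1/1 coding of the tables.
rows = [[20, 0], [25, 0], [80, 1], [95, 1], [30, 0], [70, 1]]
labels = [-1, -1, 1, 1, -1, 1]
split = best_split(rows, labels)
print(split)  # (0, 30, 0.0): url_length <= 30 separates the two classes perfectly
```

A full decision tree applies this search recursively to each side of the split. A Random Forest trains many such trees on bootstrap samples, typically with a random subset of features per split, and combines their predictions by majority vote.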
D. Model Evaluation
Once the models are trained, they are evaluated on the test data using several key metrics [9,10,11]:
- Accuracy: Measures the percentage of correct predictions out of all predictions.
- Precision: Measures the proportion of positive predictions that are correct, i.e., how well the model avoids false positives.
- Recall: Measures the proportion of actual phishing cases that the model detects.
- F1-Score: Combines precision and recall, as their harmonic mean, into one metric that gives a balanced picture of model performance.
- Confusion Matrix: Used to analyze classification errors in more detail.
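For reference, the metrics above have the following standard definitions in terms of confusion-matrix counts. The notation (TP, TN, FP, FN for true positives, true negatives, false positives, and false negatives) is the conventional one and is not written out in the source.

```latex
\begin{aligned}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN}, &
\text{Precision} &= \frac{TP}{TP + FP}, \\
\text{Recall}    &= \frac{TP}{TP + FN}, &
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.
\end{aligned}
```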
4. Model Evaluation Results and Performance Analysis
A. Model Evaluation Results
(1) Decision Tree
The Decision Tree model achieves an accuracy of 98.37%. Its evaluation metrics are shown in Table 1.
Confusion matrix (Decision Tree):
[1924 35]
[37 2426]
- False positives (FP): 35
- False negatives (FN): 37
The confusion matrix shows 35 false positives (FP) and 37 false negatives (FN), so the model already performs well. The remaining algorithms are evaluated to see whether accuracy can be improved and the FP and FN counts reduced.
(2) Random Forest
The Random Forest model performs better than Decision Tree, with an accuracy of 98.64%. Its evaluation metrics are shown in Table 2.
Confusion matrix (Random Forest):
[1920 39]
[21 2442]
- False positives (FP): 39
- False negatives (FN): 21
The confusion matrix shows 39 false positives (FP) and 21 false negatives (FN), 60 misclassifications in total compared with 72 for Decision Tree, although the FP count is slightly higher (39 versus 35). Random Forest performs better because it is an ensemble method that combines several decision trees to improve generalization and reduce overfitting. These results indicate that Random Forest is more accurate than Decision Tree on this test set.
(3) Support Vector Machine (SVM)
The SVM model has lower accuracy than Decision Tree and Random Forest, at 92.76%. Its evaluation metrics are shown in Table 3.
Confusion matrix (SVM):
[1765 194]
[126 2337]
- False positives (FP): 194
- False negatives (FN): 126
The confusion matrix shows 194 false positives (FP) and 126 false negatives (FN), a much higher number of misclassifications than either of the other models. The lower performance may be because SVM has difficulty with complex data and with overlap between the classes. The model still gives relatively good results, but it is inferior to the tree-based models, Decision Tree and Random Forest.
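As a cross-check, the reported accuracies can be recomputed from the confusion-matrix counts. The sketch below takes the FP and FN values as stated in the text and treats class 1 as the positive class, which is the convention implied by the FP and FN counts. The names `REPORTED` and `metrics` are illustrative only.

```python
# Confusion-matrix counts as reported in the text, with class 1 as the positive class.
# Each tuple is (TN, FP, FN, TP).
REPORTED = {
    "Decision Tree": (1924, 35, 37, 2426),
    "Random Forest": (1920, 39, 21, 2442),
    "SVM": (1765, 194, 126, 2337),
}

def metrics(tn, fp, fn, tp):
    """Derive the evaluation metrics for the positive class from raw counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "errors": fp + fn,
    }

results = {name: metrics(*counts) for name, counts in REPORTED.items()}
for name, m in results.items():
    print(f"{name}: accuracy={m['accuracy']:.2%}, errors={m['errors']}")
```

With these counts, every model is evaluated on the same 4422 test samples, and the recomputed accuracies (98.37%, 98.64%, and 92.76%) agree with the reported values.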
B. Model Performance Analysis
Based on the evaluation results, the accuracy of the three models is compared in Table 4.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Manguma, T.T.F.; Fatra, E. Performance Analysis of Classification Algorithms for Spam Detection in Email. Innov. J. Soc. Sci. Res. 2024, 4, 16461–16465.
2. Windarni, V.A.; Nugraha, A.F.; Ramadhani, S.T.A.; Istiqomah, D.A.; Puri, F.M.; Setiawan, A. Phishing website detection using filter technique on machine learning model. Inf. Syst. J. (INFOS) 2023, 6, 39–43.
3. Fatiha, M.R.; Setiawan, I.; Ikhsan, A.N.; Yunita, I.R. Optimization of web-based phishing detection system using decision tree algorithm. IT CIDA Sci. J. Inf. Technol. Dissem. 2024, 10, 97–108. Available online: https://www.kaggle.com (accessed on 13 June 2025).
4. Alswailem, A.; Alabdullah, B.; Alrumayh, N.; Alsedrani, A. Detecting Phishing Websites Using Machine Learning. In Proceedings of the 2019 2nd International Conference on Computer Applications & Information Security (ICCAIS), Riyadh, Saudi Arabia, 1–3 May 2019; pp. 1–6.
5. Jagadeesan, S.; Sameer; Singh, D.; Ojha, R.; Ibrahim, R.K.; Alazzam, M.B. Implementation of an Artificial Intelligence with Cyber Security in E-Learning-Based Education Management System. In Proceedings of the 2023 4th International Conference on Computation, Automation and Knowledge Management (ICCAKM), Dubai, United Arab Emirates, 12–13 December 2023; pp. 1–5.
6. Parekh, S.; Parikh, D.; Kotak, S.; Sankhe, S. A New Method for Detection of Phishing Websites: URL Detection. In Proceedings of the 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, 20–21 April 2018; pp. 949–952.
7. Nugraha, A.F.; Faticha, R.; Aziza, A.; Pristyanto, Y. Application of Stacking and Random Forest Methods to Improve Classification Performance in the Phishing Web Detection Process. Infomedia 2022, 7, 1.
8. Harahap, A.D.; Juardi, D.; Irawan, A.S.Y. Design of phishing link detection system using web-based random forest algorithm. J. Inform. Appl. Electr. Eng. 2024, 12, 2677–2686.
9. Mutmainnah, S.; Lorosae, T.A.; Ramadhan, S. Text Embedding and TF-IDF+Ngram Models to Improve the Performance of Binary Classifier Algorithms in Fake SMS Classification. J. Sist. Inf. (JSI) 2025, 4, 55–64. Available online: https://ojs.trigunadharma.ac.id/index.php/jsi (accessed on 19 June 2025).
10. Raihan, A.; Fadhli, M. Implementation of deep learning for detecting phishing attacks on websites with combination of CNN and LSTM. J. Inf. Eng. (JUTIF) 2024, 5, 1451–1459.
11. Vebriani, M.; Yustanti, W. Classification of DANA Kaget Phishing Link Detection Using Website-Based Support Vector Machine Method. J. Inform. Comput. Sci. 2024, 6, 408–416.

Table 1. Classification report for the Decision Tree model.

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
−1 | 0.98 | 0.98 | 0.98 | 1959 |
1 | 0.99 | 0.98 | 0.98 | 2463 |
Accuracy | | | 98.37% | 4422 |

Table 2. Classification report for the Random Forest model.

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
−1 | 0.98 | 0.98 | 0.98 | 1959 |
1 | 0.99 | 0.98 | 0.98 | 2463 |
Accuracy | | | 98.64% | 4422 |

Table 3. Classification report for the SVM model.

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
−1 | 0.93 | 0.90 | 0.92 | 1959 |
1 | 0.92 | 0.95 | 0.94 | 2463 |
Accuracy | | | 92.76% | 4422 |
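
Table 4. Comparison of accuracy, false positives (FP), and false negatives (FN) across the three models.

Model | Accuracy | FP | FN |
---|---|---|---|
Decision Tree | 98.37% | 35 | 37 |
Random Forest | 98.64% | 39 | 21 |
SVM | 92.76% | 194 | 126 |
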
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Aniyansyah, N.; Rina, R.; Puspitasari, S.; Erfina, A. Evaluation of AI Models for Phishing Detection Using Open Datasets. Eng. Proc. 2025, 107, 37. https://doi.org/10.3390/engproc2025107037