In this section, we examine the related literature on proposed systems and techniques for credit card fraud detection. The existing work in this field is grouped into three categories based on the technique used: statistical methods, machine learning algorithms, and deep learning techniques.
2.3. Machine Learning (ML) in Credit Card Fraud Detection
Machine learning algorithms are important in credit card fraud detection (CCFD) because they can learn from data, find complex patterns, and predict fraudulent transactions. They include both supervised and unsupervised learning methods. Algorithms used for CCFD include Logistic Regression (LR), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF), and Tree-Augmented Naive Bayes (TAN).
SVM, KNN, NB, DT, RF, and TAN are all widely used models for credit card fraud detection. SVM separates classes using the optimal hyperplane [18], KNN classifies a transaction based on its K nearest neighbours [19], NB uses probabilistic learning to estimate class probabilities [20], DT builds a decision tree for feature-based classification [20], RF combines many decision trees to reduce overfitting [21], and TAN extends NB with a tree-like dependency structure to capture feature correlations [22]. These models offer diverse approaches to identifying and preventing fraudulent transactions, contributing to robust fraud detection systems. Each algorithm has advantages and drawbacks. When choosing an algorithm for an application, the dataset size, feature space, processing requirements, interpretability, and type of fraud must be considered.
Several researchers have investigated machine learning approaches to improving fraud prevention and detection. In [23], Prasad Chowdary et al. propose an ensemble technique to improve credit card fraud detection. The authors focus on optimising model parameters, improving performance measures, and integrating deep learning to correct identification errors and reduce false negatives. Decision Tree (DT), Gradient Boosting Classifier (XGBoost), Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM) were used in this paper. The paper compares these algorithms across multiple evaluation metrics and finds that DT performs best with a recall of 100%, while XGBoost, LR, RF, and SVM achieve 85%, 74.49%, 75.9%, and 69%, respectively. By combining multiple classifiers and rigorously assessing their performance, the authors report improved CCFD efficiency. However, the reported evaluation metrics reveal low performance for the model.
Sahithi et al. [1] developed a credit card fraud detection model in 2022. Their model used a Weighted Average Ensemble to combine Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbors (KNN), AdaBoost, and Bagging. The paper used the European Credit Card Company dataset. Their model achieved 99% accuracy, outperforming base models such as RF Bagging (98.91%), LR (98.90%), AdaBoost (97.91%), KNN (97.81%), and Bagging (95.37%). The paper shows that their ensemble model can detect credit card fraud effectively. Nevertheless, the feature selection process was not described, which hinders reproducibility.
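A weighted-average combination of base models, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the function names, the weights, and the per-model fraud probabilities are all hypothetical.

```python
def weighted_average_ensemble(probabilities, weights):
    """Combine per-model fraud probabilities into one score per transaction.

    probabilities: list of lists, one inner list per base model.
    weights: one non-negative weight per base model (hypothetical values).
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per base model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probabilities[0])
    combined = []
    for i in range(n_samples):
        # Weighted mean of the base models' probabilities for transaction i.
        score = sum(w * p[i] for w, p in zip(weights, probabilities)) / total
        combined.append(score)
    return combined


def classify(scores, threshold=0.5):
    """Flag a transaction as fraud (1) when its combined score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical outputs of three base models for four transactions.
model_probs = [
    [0.10, 0.80, 0.40, 0.95],  # e.g. LR
    [0.20, 0.60, 0.70, 0.90],  # e.g. RF
    [0.05, 0.90, 0.30, 0.85],  # e.g. KNN
]
model_weights = [1.0, 2.0, 1.0]  # assumed weights, e.g. derived from validation scores

scores = weighted_average_ensemble(model_probs, model_weights)
labels = classify(scores)
```

In this sketch the third transaction is flagged as fraud only because the more heavily weighted model scores it highly, which illustrates how the choice of weights can change the ensemble's decision.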
Also in 2022, Qaddoura et al. [24] investigated the effectiveness of several oversampling methods for credit card fraud detection: SMOTE, ADASYN, borderline1, borderline2, and SVM oversampling. The classifiers used were Random Forest (RF), Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Decision Tree (DT). The authors found that oversampling can improve model performance, although the best strategy depends on the machine learning algorithm. However, the computational overhead may limit the applicability of the model in real-life situations.
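The core idea behind SMOTE-style oversampling, generating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a simplified illustration and not the procedure used in [24]: the function name `smote_sketch`, the value of `k`, and the sample data are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolation.

    For each synthetic point: pick a minority sample, pick one of its k
    nearest minority neighbours, and take a random point on the segment
    between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [x for j, x in enumerate(minority) if j != base_idx]
        others.sort(key=lambda x: math.dist(base, x))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature fraud samples (the minority class).
fraud = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(fraud, n_new=6)
```

Because every synthetic point lies on a segment between two existing fraud samples, the new points stay inside the region already occupied by the minority class.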
Tanouz et al. [25] extensively studied machine learning for credit card fraud classification. Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Naive Bayes (NB) classifiers were evaluated, with a focus on imbalanced datasets. RF performed best, with an accuracy of 96.77%. LR, NB, and DT had accuracy scores of 95.16%, 95.16%, and 91.12%, respectively. The study shows that Random Forest is effective at credit card fraud detection, which is vital to financial security. Nonetheless, the performance of the proposed models was hampered by the lack of feature selection.
The study by Ruttala et al. [26] compared the Random Forest and AdaBoost algorithms for credit card fraud detection. The two algorithms achieved similar accuracy, but Random Forest outperformed AdaBoost in terms of precision, recall, and F1-score. However, the dataset used by the authors is skewed, and the paper does not clearly state how this issue was addressed.
The primary objective of the research by Sadgali et al. [27] was to identify the most effective approaches for detecting financial fraud. Their paper covered a wide range of techniques, including Support Vector Machine (SVM), Bayesian Belief Networks, Naive Bayes, Genetic Algorithm, Multilayer Feed Forward Neural Network (MLFF), and Classification and Regression Tree (CART). Because their paper is a review of previous studies, it did not require a particular dataset for analysis. Their results highlighted the strong performance of Naive Bayes, which achieved the highest accuracy of 99.02%. SVM followed closely with an accuracy of 98.8%, and the genetic algorithm achieved 95%. However, the authors limited their work to insurance fraud.
The study by Raghavan et al. [28] aimed to detect anomalies or fraudulent actions using data mining techniques. They used three distinct datasets, from Australia (AU), Germany, and Europe (EU). Their work employed Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest algorithms, and also built two ensembles: one combining KNN, SVM, and a Convolutional Neural Network (CNN), and another combining KNN, SVM, and Random Forest. SVM achieved the highest accuracy at 68.57%, while Random Forest and KNN achieved 64.37% and 60.47%, respectively. Their paper provides useful information on the effectiveness of various algorithms and ensemble strategies for fraud detection. However, the performance of the model was low on all the datasets used.
Saputra et al. [29] compared the effectiveness of Decision Tree, Naïve Bayes, Random Forest, and Neural Network machine learning approaches. SMOTE was used to address the class imbalance in the dataset, which was obtained from Kaggle. The dataset was highly imbalanced, with fraudulent transactions making up only 0.093% of records. Evaluation using confusion matrices showed that the Neural Network had the highest accuracy (96%), followed by Random Forest (95%), Naïve Bayes (95%), and Decision Tree (91%). SMOTE improved the average F1-Score and G-Score and mitigated the skew in the data. However, the dataset used in the paper does not fully represent all e-commerce platforms.
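To illustrate why measures such as F1-Score and G-Score are reported alongside accuracy on imbalanced data, the sketch below derives them from confusion-matrix counts. The counts are hypothetical, and we assume the G-Score is the geometric mean of recall and specificity; the definition used in [29] may differ.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Assumed definition: geometric mean of recall and specificity.
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical counts: 10,000 transactions, 10 of them fraudulent.
# A model that never flags fraud.
always_legit = confusion_metrics(tp=0, fp=0, tn=9990, fn=10)
# A model that catches 8 of the 10 frauds at the cost of 40 false alarms.
detector = confusion_metrics(tp=8, fp=40, tn=9950, fn=2)
```

With these counts, the model that never flags fraud has the higher accuracy (0.999 against 0.9958) yet an F1-Score and G-Score of zero, so accuracy alone can be misleading when fraud is rare.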
A comparative analysis of credit card fraud detection methods was conducted by Tiwari et al. [30]. The authors examined SVM, ANN, Bayesian Network, K-Nearest Neighbor (KNN), Hidden Markov Model (HMM), Fuzzy Logic-Based System, and Decision Trees. Analysis on the standard KDD CUP 99 intrusion dataset showed differing accuracy levels across approaches: SVM (94.65%), ANN (99.71%), Bayesian Network (97.52%), KNN (97.15%), HMM (95.2%), Fuzzy Logic-Based System (97.93%), and Decision Trees (94.7%). However, the dataset did not fully depict financial activities.
Naik et al. [31] evaluated and compared several machine learning algorithms for credit card fraud detection (CCFD), including Naïve Bayes, J48, Logistic Regression, and AdaBoost. Their approach used an online dataset of 1000 records containing both fraudulent and non-fraudulent transactions. Logistic Regression and AdaBoost both reached an accuracy of 100%, while Naïve Bayes and J48 achieved 83% and 69.93%, respectively. These findings highlight how differently algorithms perform on credit card fraud detection and provide useful insights for building resilient fraud detection systems. Nevertheless, a dataset of only 1000 credit card transaction records is not representative of the credit card user population.
Karthik et al. [9] introduced a novel model for credit card fraud detection that combines the ensemble learning techniques of boosting and bagging. The model incorporates the key characteristics of both techniques to obtain a hybrid of bagging and boosting ensemble classifiers. The authors employed AdaBoost for feature engineering of the behavioural feature space. The model's predictive performance was analysed using the area under the precision-recall (AUPR) curve, showing marginal improvement in the range of 58.03–69.97% on the Brazilian bank dataset and 54.66–69.40% on the UCSD-FICO dataset. Nevertheless, the paper did not provide an in-depth analysis of the computational complexity or resource requirements of the proposed model.
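The AUPR measure mentioned above can be approximated with the step-wise average precision estimate, sketched below. This is a generic illustration and not necessarily the computation used in [9]: the function name `average_precision`, the labels, and the scores are assumptions.

```python
def average_precision(y_true, scores):
    """Approximate the area under the precision-recall curve.

    Sums precision at each rank where a positive (fraud) label is found,
    weighted by the recall step 1 / n_pos. Tied scores are not handled
    specially in this sketch.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive (fraud) label")
    # Rank transactions from highest to lowest fraud score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            ap += (tp / rank) / n_pos  # precision at this rank times recall step
    return ap


# Hypothetical labels (1 = fraud) and model scores for six transactions.
y_true = [0, 1, 0, 1, 0, 0]
scores = [0.1, 0.9, 0.8, 0.7, 0.3, 0.2]
ap = average_precision(y_true, scores)
```

Unlike accuracy, this measure depends only on how well fraudulent transactions are ranked above legitimate ones, which is why it is often preferred for heavily imbalanced fraud data.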
Similarly, Forough et al. [8] proposed an ensemble model for detecting fraudulent actions, based on sequential modelling of data with deep recurrent neural networks and a novel voting mechanism based on an artificial neural network. The proposed model uses several recurrent networks, either LSTM or GRU, as base classifiers and aggregates their outputs using a feed-forward neural network (FFNN) as the voting mechanism. The GRU-based ensemble achieves its best results using two base classifiers on both the European cards dataset and the Brazilian dataset. It outperforms the solo GRU model in all metrics and the baseline ensemble model in most metrics. However, the authors did not discuss the limitations of the proposed ensemble model.
Esenogho et al. [32] proposed an efficient approach for credit card fraud detection using a neural network ensemble classifier and a hybrid data resampling method. The ensemble classifier was obtained using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting (AdaBoost) technique. The hybrid resampling technique used in this approach is the synthetic minority oversampling technique and edited nearest neighbour (SMOTE-ENN) method. SMOTE is an oversampling technique that balances the class distribution by adding synthetic samples to the minority class, while ENN is an under-sampling method that removes some majority class samples. SMOTE-ENN performs both oversampling and under-sampling to obtain a balanced dataset. However, the authors did not explore the impact of different hyperparameter settings or variations in the neural network architecture on the performance of the proposed method.
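The under-sampling half of SMOTE-ENN can be sketched as follows. This is a simplified illustration and not the implementation used in [32]: it removes a majority-class sample when most of its k nearest neighbours belong to another class, and the function name, the value of `k`, and the sample data are assumptions. Library implementations of ENN may apply a stricter rule or clean more than one class.

```python
import math


def edited_nearest_neighbours(samples, labels, majority_label=0, k=3):
    """Drop majority-class samples whose k nearest neighbours mostly disagree.

    Returns the filtered (samples, labels); minority samples are always kept.
    """
    kept_samples, kept_labels = [], []
    for i, (x, y) in enumerate(zip(samples, labels)):
        if y != majority_label:
            kept_samples.append(x)
            kept_labels.append(y)
            continue
        # Indices of the k nearest other samples by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: math.dist(x, samples[j]),
        )[:k]
        agree = sum(1 for j in neighbours if labels[j] == y)
        if agree * 2 > len(neighbours):  # strict majority shares the label
            kept_samples.append(x)
            kept_labels.append(y)
    return kept_samples, kept_labels


# Hypothetical data: a legitimate cluster near the origin, a fraud cluster
# near (5, 5), and one legitimate sample sitting inside the fraud cluster.
samples = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0),  # legitimate
    (5.0, 5.1), (5.1, 5.0), (5.1, 5.1),                          # fraud
]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

clean_samples, clean_labels = edited_nearest_neighbours(samples, labels)
```

Here the legitimate sample at (5.0, 5.0) is removed because all three of its nearest neighbours are fraud samples, which sharpens the boundary between the two classes.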
Table 1 presents a summary of ensemble machine-learning models used for credit card fraud detection.