# A Machine Learning Framework towards Bank Telemarketing Prediction


## Abstract


## 1. Introduction

- The introduction of a novel approach that processes heterogeneous data by separately transforming each type of feature (numerical, Boolean, scaled, and nominal), followed by a hybrid technique that replaces missing values and implicitly selects the most significant features. This optimises the classification in terms of both processing time and accuracy. Apart from the replacement of missing values, nominal attributes are not transformed in this step because they are treated directly in the classification, which reduces processing time.
- The construction of a reduced table of training data. For each class, this table contains the averages of the transformed features and the favourable attributes of the nominal features.
- A simplification of the overall approach for the special case of binary classification, incorporating a weighting scheme that improves performance.
- The proposal of a clear and transparent design and implementation process to efficiently solve a real, concrete machine learning problem.
- The successful implementation and use of the model to optimise the predictive performance on potential leads before a bank telemarketing campaign.

## 2. Prior Literature Review

## 3. The Proposed Approach

### 3.1. Dataset

### 3.2. Data Preprocessing

#### 3.2.1. Data Transformation

- For numerical features (${f}_{1}$): we directly calculate the statistical parameters (min, max, mean, variance, and standard deviation) of each numerical feature.
- For scaled features (${f}_{4}$): we substitute each item by its ordinal number, then calculate the same statistical parameters.
- For Boolean features (such as ${f}_{3}$): there are only two possibilities, e.g., yes/no (1/0), success/failure (1/0), or telephone/cellular (1/0).
- For nominal features (e.g., ${f}_{2}$): these are considered independent features and are passed directly to the classification step of our approach. This reduces the processing time of all nominal features by almost half while improving performance.
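As an illustration, the per-type transformations above can be sketched in Python. The column names and the yes/no and irregular/regular/very regular codings come from the toy example tables; the "Unknown" entry of ${I}_{4}$ is assumed to have already been replaced (Section 3.2.2):

```python
from statistics import mean, pstdev, pvariance

# Toy training instances (f1 numerical, f2 nominal, f3 Boolean, f4 scaled);
# column names are illustrative.
rows = [
    {"f1": 59, "f2": "married",  "f3": "no",  "f4": "regular"},
    {"f1": 39, "f2": "single",   "f3": "yes", "f4": "very regular"},
    {"f1": 59, "f2": "married",  "f3": "no",  "f4": "irregular"},
    {"f1": 41, "f2": "divorced", "f3": "no",  "f4": "regular"},
    {"f1": 44, "f2": "married",  "f3": "no",  "f4": "irregular"},
]

# Boolean features: map the two modalities to 1/0.
BOOL_MAP = {"yes": 1, "no": 0}
# Scaled features: substitute each item by its ordinal number.
SCALE_MAP = {"irregular": 0, "regular": 1, "very regular": 2}

for r in rows:
    r["f3"] = BOOL_MAP[r["f3"]]
    r["f4"] = SCALE_MAP[r["f4"]]
    # f2 is nominal: left untouched, handled in the classification step.

# Statistical parameters of each numerical/ordinal feature.
for f in ("f1", "f3", "f4"):
    vals = [r[f] for r in rows]
    print(f, min(vals), max(vals), mean(vals), pvariance(vals), pstdev(vals))
```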

#### 3.2.2. Replacement of Missing Values

#### 3.2.3. Feature Selection

Inst. | ${\mathit{f}}_{1}$ | ${\mathit{f}}_{2}$ | ${\mathit{f}}_{3}$ | ${\mathit{f}}_{4}$ | Label |
---|---|---|---|---|---|
${I}_{1}$ | 59 | married | 0 | 1 | 0 |
${I}_{2}$ | 39 | single | 1 | 2 | 1 |
${I}_{3}$ | 59 | married | 0 | 0 | 0 |
${I}_{4}$ | 41 | divorced | 0 | 1 | 1 |
${I}_{5}$ | 44 | married | 0 | 0 | 1 |

#### 3.2.4. Normalisation

### 3.3. Classification Process

#### 3.3.1. Training: Reduced Table Construction

- ${n}_{k}\left({V}_{ij}\right)$ is the number of occurrences of the modality ${V}_{ij}$ of feature j in the class ${C}_{k}$;
- ${N}_{k}$ is the total number of instances in the class ${C}_{k}$.

- Belonging coefficient of nominal features. In the toy example:

- The total number of instances in the class "yes" is ${N}_{yes}$ = 3;
- The total number of instances in the class "no" is ${N}_{no}$ = 2;
- The number of "$married$" in the class "$no$" is ${n}_{no}\left(married\right)$ = 2;
- The number of "$married$" in the class "$yes$" is ${n}_{yes}\left(married\right)$ = 1.
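The reduced-table construction described above can be sketched as follows. This is a minimal Python sketch over the toy training set; variable names are illustrative, and the feature values are the transformed, unnormalised ones from the example tables:

```python
from statistics import mean
from collections import Counter

# Transformed toy training set: (f1, f2, f3, f4, label).
data = [
    (59, "married",  0, 1, "no"),
    (39, "single",   1, 2, "yes"),
    (59, "married",  0, 0, "no"),
    (41, "divorced", 0, 1, "yes"),
    (44, "married",  0, 0, "yes"),
]

reduced = {}
for c in ("no", "yes"):
    members = [row for row in data if row[4] == c]
    n_c = len(members)  # N_k: number of instances in class C_k
    reduced[c] = {
        # Per-class averages of the transformed features f1, f3, f4.
        "f1": round(mean(r[0] for r in members), 2),
        "f3": round(mean(r[2] for r in members), 2),
        "f4": round(mean(r[3] for r in members), 2),
        # Belonging coefficient of each nominal modality: n_k(V) / N_k.
        "f2": {v: n / n_c for v, n in Counter(r[1] for r in members).items()},
    }

print(reduced["no"]["f1"], reduced["yes"]["f1"])  # 59 and 41.33
```

"married" has a belonging coefficient of 1.0 in class "no" but only 1/3 in class "yes", so it is favourable to "no"; "single" and "divorced" are favourable to "yes", reproducing the reduced table of the example.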

#### 3.3.2. Testing: Decision Function

- For the feature ${f}_{1}$: $|{V}_{11}-\overline{V{\left({C}_{no}\right)}_{1}}| = |38-59| = 21$ and $|{V}_{11}-\overline{V{\left({C}_{yes}\right)}_{1}}| = |38-41.33| = 3.33$. Since $3.33<21$, ${A\left(no\right)}_{11}$ = 0 and ${A\left(yes\right)}_{11}$ = 1; the relative weight is ${W}_{1}={\displaystyle \frac{0.98}{2.52}}=0.38$.
- For the feature ${f}_{2}$, which is nominal: "$single$" is more favourable to ${C}_{yes}$, so $A{\left(no\right)}_{12}$ = 0 and $A{\left(yes\right)}_{12}$ = 1; the relative weight is ${W}_{2}={\displaystyle \frac{0.8}{2.52}}=0.32$.
- For the feature ${f}_{3}$: $|{V}_{13}-\overline{V{\left({C}_{no}\right)}_{3}}| = |1-0| = 1$ and $|{V}_{13}-\overline{V{\left({C}_{yes}\right)}_{3}}| = |1-0.33| = 0.67$. Since $0.67<1$, $A{\left(no\right)}_{13}$ = 0 and $A{\left(yes\right)}_{13}$ = 1; the relative weight is ${W}_{3}={\displaystyle \frac{0.41}{2.52}}=0.16$.
- For the feature ${f}_{4}$: $|{V}_{14}-\overline{V{\left({C}_{no}\right)}_{4}}| = |0-0.5| = 0.5$ and $|{V}_{14}-\overline{V{\left({C}_{yes}\right)}_{4}}| = |0-1| = 1$. Since $0.5<1$, $A{\left(no\right)}_{14}$ = 1 and $A{\left(yes\right)}_{14}$ = 0; the relative weight is ${W}_{4}={\displaystyle \frac{0.33}{2.52}}=0.13$.

- $A\left(yes\right)$ = $1\ast 0.38+1\ast 0.32+1\ast 0.16+0\ast 0.13=0.86$
- $A\left(no\right)$ = $0\ast 0.38+0\ast 0.32+0\ast 0.16+1\ast 0.13=0.13$
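The decision rule of this worked example can be sketched in Python. The class averages, favourable modalities, and ${C}_{j}$ coefficients are taken from the example tables; the sketch keeps full precision, so the totals differ slightly from the hand-rounded weights in the text:

```python
# Reduced table of the worked example: per-class feature averages and
# favourable nominal modalities.
centers = {
    "no":  {"f1": 59.0,  "f3": 0.0,  "f4": 0.5, "f2": {"married"}},
    "yes": {"f1": 41.33, "f3": 0.33, "f4": 1.0, "f2": {"single", "divorced"}},
}
coeff = {"f1": 0.98, "f2": 0.8, "f3": 0.41, "f4": 0.33}  # C_j row
total = sum(coeff.values())                               # 2.52
weights = {f: c / total for f, c in coeff.items()}        # relative weights W_j

test_instance = {"f1": 38, "f2": "single", "f3": 1, "f4": 0}

score = {"no": 0.0, "yes": 0.0}
for f in ("f1", "f3", "f4"):
    # Assign each non-nominal feature to the class whose average is closest.
    d = {c: abs(test_instance[f] - centers[c][f]) for c in score}
    score[min(d, key=d.get)] += weights[f]
# Nominal feature: assign to the class whose favourable set contains it.
for c in score:
    if test_instance["f2"] in centers[c]["f2"]:
        score[c] += weights["f2"]

print(score)  # "yes" clearly dominates, so the instance is classified yes
```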

## 4. Results Analysis and Discussion

### 4.1. Experimental Protocol

### 4.2. Performance Measure

- The ${f}_{1}$-score (FM), given in Equation (12), is a classification metric well suited to unbalanced classification problems such as ours. It compares the true predictions made by our model (here, the count a) to the errors it makes (here, the counts c and d). Its formula is as follows:
$$FM=\frac{2a}{2a+c+d}$$
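Equation (12) is straightforward to compute; a minimal sketch with hypothetical counts:

```python
def f1_score(a, c, d):
    """F-measure from Equation (12): a = true predictions,
    c and d = the two error counts (false negatives / false positives)."""
    return 2 * a / (2 * a + c + d)

# Hypothetical counts: 80 correct positives against 10 + 15 errors.
print(f1_score(80, 10, 15))  # 160 / 185 ≈ 0.865
```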

- The area under the curve (AUC) is a performance metric derived from the receiver operating characteristic (ROC) curve. The ROC curve is created by plotting the true positive rate (TPR) on the y-axis against the false positive rate (FPR) on the x-axis. It captures the trade-off between correctly and incorrectly classified instances and is an ideal performance measurement for imbalanced class datasets (Huang et al. 2018).
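The AUC can equivalently be computed as the probability that a randomly chosen positive instance is scored above a randomly chosen negative one; a minimal, dependency-free sketch (labels and scores are hypothetical):

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: the fraction of
    positive/negative pairs ranked correctly (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for 3 positive and 3 negative instances.
print(auc([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]))  # 8/9
```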

### 4.3. Experimental Parameters

### 4.4. Results Analysis and Discussion

#### Results of the Basic CMB Approach

### 4.5. Comparison of Our Results with Those of Other ML Methods

### 4.6. Comparison of Results with Previous Work

## 5. General Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Note

1. https://archive.ics.uci.edu/ml/index.php (accessed on 10 January 2018).

## References

- Al-Garadi, Mohammed Ali, Amr Mohamed, Abdulla Khalid Al-Ali, Xiaojiang Du, Ihsan Ali, and Mohsen Guizani. 2020. A survey of machine and deep learning methods for internet of things (IoT) security. IEEE Communications Surveys & Tutorials 22: 1646–85.
- Amini, Mohammad, Jalal Rezaeenour, and Esmaeil Hadavandi. 2015. A cluster-based data balancing ensemble classifier for response modeling in Bank Direct Marketing. International Journal of Computational Intelligence and Applications 14: 1550022.
- Ballings, Michel, and Dirk Van den Poel. 2015. CRM in social media: Predicting increases in Facebook usage frequency. European Journal of Operational Research 244: 248–60.
- Bhattacharyya, Siddhartha, Sanjeev Jha, Kurian Tharakunnel, and J. Christopher Westland. 2011. Data mining for credit card fraud: A comparative study. Decision Support Systems 50: 602–13.
- Birant, Derya. 2020. Data Mining in Banking Sector Using Weighted Decision Jungle Method. In Data Mining-Methods, Applications and Systems. Rijeka: IntechOpen.
- Butcher, David, Xiangyang Li, and Jinhua Guo. 2007. Security challenge and defense in VoIP infrastructures. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 37: 1152–62.
- Chawla, Nitesh V., Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16: 321–57.
- Chen, Sheng, Haibo He, and Edwardo A. Garcia. 2010. RAMOBoost: Ranked minority oversampling in boosting. IEEE Transactions on Neural Networks 21: 1624–42.
- Cherif, Walid. 2018. Optimization of K-NN algorithm by clustering and reliability coefficients: Application to breast-cancer diagnosis. Procedia Computer Science 127: 293–99.
- Cioca, Marius, Andrada Iulia Ghete, Lucian Ionel Cioca, and Daniela Gifu. 2013. Machine learning and creative methods used to classify customers in a CRM systems. Applied Mechanics and Materials 371: 769–73.
- Elsalamony, Hany A., and Alaa M. Elsayad. 2013. Bank direct marketing based on neural network and C5.0 models. International Journal of Engineering and Advanced Technology (IJEAT) 2.
- Elsalamony, Hany A. 2014. Bank direct marketing analysis of data mining techniques. International Journal of Computer Applications 85: 12–22.
- Farooqi, Rashid, and Naiyar Iqbal. 2019. Performance evaluation for competency of bank telemarketing prediction using data mining techniques. International Journal of Recent Technology and Engineering 8: 5666–74.
- Fawei, Torubein, and Duke T. J. Ludera. 2020. Data Mining Solutions for Direct Marketing Campaign. In Proceedings of the SAI Intelligent Systems Conference. Cham: Springer, pp. 633–45.
- Feng, Yi, Yunqiang Yin, Dujuan Wang, and Lalitha Dhamotharan. 2022. A dynamic ensemble selection method for bank telemarketing sales prediction. Journal of Business Research 139: 368–82.
- Ghatasheh, Nazeeh, Hossam Faris, Ismail AlTaharwa, Yousra Harb, and Ayman Harb. 2020. Business Analytics in Telemarketing: Cost-Sensitive Analysis of Bank Campaigns Using Artificial Neural Networks. Applied Sciences 10: 2581.
- Govindarajan, M. 2016. Ensemble strategies for improving response model in direct marketing. International Journal of Computer Science and Information Security 14: 108.
- Grzonka, Daniel, Grażyna Suchacka, and Barbara Borowik. 2016. Application of selected supervised classification methods to bank marketing campaign. Information Systems in Management 5: 36–48.
- Huang, Xiaobing, Xiaolian Liu, and Yuanqian Ren. 2018. Enterprise credit risk evaluation based on neural network algorithm. Cognitive Systems Research 52: 317–24.
- Ilham, Ahmad, Laelatul Khikmah, and Ida Bagus Ary Indra Iswara. 2019. Long-term deposits prediction: A comparative framework of classification model for predict the success of bank telemarketing. Journal of Physics: Conference Series 1175: 012035.
- Karim, Masud, and Rashedur M. Rahman. 2013. Decision tree and naive bayes algorithm for classification and generation of actionable knowledge for direct marketing. Journal of Software Engineering and Applications 6: 196.
- Kawasaki, Yoshinori, and Masao Ueki. 2015. Sparse Predictive Modeling for Bank Telemarketing Success Using Smooth-Threshold Estimating Equations. Journal of the Japanese Society of Computational Statistics 28: 53–66.
- Khalilpour Darzi, Mohammad Rasoul, Majid Khedmati, and Seyed Taghi Akhavan Niaki. 2021. Correlation-augmented Naïve Bayes (CAN) Algorithm: A Novel Bayesian Method Adjusted for Direct Marketing. Applied Artificial Intelligence 35: 1–24.
- Kotler, Philip, and Kevin Lane Keller. 2016. A Framework for Marketing Management. Boston: Pearson Education Ltd.
- Koumétio, Cédric Stéphane Tékouabou, Walid Cherif, and Silkan Hassan. 2018. Optimizing the prediction of telemarketing target calls by a classification technique. Paper presented at 2018 6th International Conference on Wireless Networks and Mobile Communications (WINCOM), Marrakesh, Morocco, October 16–19; pp. 1–6.
- Koumétio, Cédric Stéphane Tékouabou, and Hamza Toulni. 2021. Improving KNN Model for Direct Marketing Prediction in Smart Cities. In Machine Intelligence and Data Analytics for Sustainable Future Smart Cities. Cham: Springer, pp. 107–18.
- Krawczyk, Bartosz. 2016. Learning from imbalanced data: Open challenges and future directions. Progress in Artificial Intelligence 5: 221–32.
- Ładyżyński, Piotr, Kamil Żbikowski, and Piotr Gawrysiak. 2019. Direct marketing campaigns in retail banking with the use of deep learning and random forests. Expert Systems with Applications 134: 28–35.
- Lahmiri, Salim. 2017. A two-step system for direct bank telemarketing outcome classification. Intelligent Systems in Accounting, Finance and Management 24: 49–55.
- Lakshminarayan, Kamakshi, Steven A. Harp, and Tariq Samad. 1999. Imputation of missing data in industrial databases. Applied Intelligence 11: 259–75.
- Leppäniemi, Matti, and Heikki Karjaluoto. 2008. Mobile marketing: From marketing strategy to mobile marketing campaign implementation. International Journal of Mobile Marketing 3: 1.
- Marinakos, Georgios, and Sophia Daskalaki. 2017. Imbalanced customer classification for bank direct marketing. Journal of Marketing Analytics 5: 14–30.
- Miguéis, Vera L., Ana S. Camanho, and José Borges. 2017. Predicting direct marketing response in banking: Comparison of class imbalance methods. Service Business 11: 831–49.
- Moro, Sérgio, Paulo Cortez, and Paulo Rita. 2014. A data-driven approach to predict the success of bank telemarketing. Decision Support Systems 62: 22–31.
- Moro, Sergio, Raul Laureano, and Paulo Cortez. 2011. Using data mining for bank direct marketing: An application of the CRISP-DM methodology. Paper presented at the European Simulation and Modelling Conference—ESM’2011, Guimaraes, Portugal, October 24–26; pp. 117–21, EUROSIS-ETI.
- Moro, Sérgio, Paulo Cortez, and Paulo Rita. 2015. Using customer lifetime value and neural networks to improve the prediction of bank deposit subscription in telemarketing campaigns. Neural Computing and Applications 26: 131–39.
- Moro, Sérgio, Paulo Cortez, and Paulo Rita. 2018. A divide-and-conquer strategy using feature relevance and expert knowledge for enhancing a data mining approach to bank telemarketing. Expert Systems 35: e12253.
- Mustapha, SMFD Syed, and Abdulmajeed Alsufyani. 2019. Application of Artificial Neural Network and Information Gain in Building Case-based Reasoning for Telemarketing Prediction. International Journal of Advanced Computer Science and Application 10: 300–6.
- Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, and et al. 2011. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research 12: 2825–30.
- Rust, Tobias, Daniel Bruggemann, Wilhelm Dangelmaier, and Dominik Picker-Huchzermeyer. 2010. A Method for Simultaneous Production and Order Planning in a Cooperative Supply Chain Relationship with Flexibility Contracts. Paper presented at 2010 43rd Hawaii International Conference on System Sciences, Koloa, HI, USA, January 5–8; pp. 1–10.
- Selma, Mokrane. 2020. Predicting the Success of Bank Telemarketing Using Artificial Neural Network. International Journal of Economics and Management Engineering 14: 1–4.
- Sihombing, Ester Hervina, and Nasib Nasib. 2020. The Decision of Choosing Course in the Era of Covid 19 through the Telemarketing Program, Personal Selling and College Image. Budapest International Research and Critics Institute (BIRCI-Journal): Humanities and Social Sciences 3: 2843–50.
- Tekouabou, Stéphane Cédric Koumetio, Walid Cherif, and Hassan Silkan. 2019. A data modeling approach for classification problems: Application to bank telemarketing prediction. Paper presented at 2nd International Conference on Networking, Information Systems & Security, New York, NY, USA, March 27–29; pp. 1–7.
- Tekouabou, Stéphane Cédric Koumetio, Sri Hartini, Zuherman Rustam, Hassan Silkan, and Said Agoujil. 2021. Improvement in automated diagnosis of soft tissues tumors using machine learning. Big Data Mining and Analytics 4: 33–46.
- Thakar Pooja, Mehta Anil, and Sharma Manisha. 2018. Robust Prediction Model for Multidimensional and Unbalanced Datasets. International Journal of Information Systems & Management Science 1: 2.
- Tripathi, Diwakar, Damodar Reddy Edla, Venkatanareshbabu Kuppili, Annushree Bablani, and Ramesh Dharavath. 2018. Credit Scoring Model based on Weighted Voting and Cluster based Feature Selection. Procedia Computer Science 132: 22–31.
- Turkmen, Egemen. 2021. Deep Learning Based Methods for Processing Data in Telemarketing-Success Prediction. Paper presented at 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), Tirunelveli, India, February 4–6; pp. 1161–66.
- Vafeiadis, Thanasis, Konstantinos I. Diamantaras, George Sarigiannidis, and Konstantinos C. Chatzisavvas. 2015. A comparison of machine learning techniques for customer churn prediction. Simulation Modelling Practice and Theory 55: 1–9.
- Vajiramedhin, Chakarin, and Anirut Suebsing. 2014. Feature selection with data balancing for prediction of bank telemarketing. Applied Mathematical Sciences 8: 5667–72.
- Yan, Chun, Meixuan Li, and Liu Wei. 2020. Prediction of bank telephone marketing results based on improved whale algorithms optimizing S_Kohonen network. Applied Soft Computing 92: 106259.
- Yu, Lei, and Huan Liu. 2003. Feature selection for high-dimensional data: A fast correlation-based filter solution. Paper presented at 20th International Conference on Machine Learning (ICML-03), Washington, DC, USA, August 21–24; pp. 856–63.
- Wankhede Prabodh, Singh Rohit, Rathod Rutesh, Patil Jayesh, and Khadtare TD. 2019. Improving Prediction of Potential Clients for Bank Term Deposits using Machine Learning Approaches. International Research Journal of Engineering and Technology 6: 7101–4.

**Figure 5.** The cumulative gain curve of the CMB model (class 0 is the “yes” label and class 1 is the “no” label).

**Table 1.** Summary of the relevant papers dealing with direct bank telemarketing prediction using machine learning. SRAP = Scientific Research and Academy Publisher; CBWDJ = class-based weighted decision jungle; JSCS = Japanese Society of Computational Statistics.

Ref. | Year | ${\mathbf{Nb}}_{\mathit{f}}$ | Tools | Algorithms | Metrics | Best Score (%) | Publisher | Type |
---|---|---|---|---|---|---|---|---|
Feng et al. (2022) | 2022 | 21 | Python | META-DES-AAP | Acc, AUC | 89.39; 89.44 | Elsevier | Article |
Koumétio and Toulni (2021) | 2021 | 13 | Python | improved KNN | Acc, AUC, ${f}_{1}$ | 96.91 | Springer | Chapter |
Yan et al. (2020) | 2020 | 21 | - | S_Kohonen network | Acc | 80 | Elsevier | Article |
Ghatasheh et al. (2020) | 2020 | 21 | - | CostSensitive-MLP | Acc | 84.18 | MDPI | Article |
Selma (2020) | 2020 | 21 | - | ANN | Acc; ${f}_{1}$ | 98.93; 95.00 | Waset | Article |
Birant (2020) | 2020 | 21 | - | CBWDJ | Acc; Arec; Rec | 92.70; 84.92; 75.93 | IntechOpen | Chapter |
Tekouabou et al. (2019) | 2019 | 21 | Python | DT C5.0 | Acc, Prec, Rec, ${f}_{1}$ | 100 | ACM | Conf |
Farooqi and Iqbal (2019) | 2019 | 21 | WEKA | DT J48 | Acc, Spe, Sen, Prec, AUC, ${f}_{1}$ | 91.2; 95.9; 53.8; 62.7; 88.4; 58 | IJRTE | Article |
Mustapha and Alsufyani (2019) | 2019 | 17 | - | ANN | Info Gain, Entropy | - | The SAI | Article |
Ilham et al. (2019) | 2019 | 21 | RapidMiner | SVM | Acc, AUC | 97.7; 92.5 | IOP | Chapter |
Ładyżyński et al. (2019) | 2019 | 21 | H2O | RF, DL | Prec, Rec | - | Elsevier | Article |
Koumétio et al. (2018) | 2018 | 18 | RapidMiner | DT C4.5 | Acc, ${f}_{1}$ | 87.6; 81.4 | IEEE | Conf |
Moro et al. (2014) | 2014 | 22 | R/rminer | LR, DT, NN, SVM | AUC; ALIFT | 80.0; 70.0 | Elsevier | Article |
Vajiramedhin and Suebsing (2014) | 2014 | 8 | - | C4.5 | Acc, AUC | 92.14; 95.60 | Hikari | Article |
Elsalamony (2014) | 2014 | 17 | SPSS | MLPNN, TAN, LR, C5.0 | Acc, Sens, Spec | 90.49; 62.20; 93.12 | FCS | Article |
Karim and Rahman (2013) | 2013 | 21 | WEKA | NB; C4.5 | Acc, Prec, AUC | 93.96; 93.34; 87.5 | SRAP | Article |
Elsalamony and Elsayad (2013) | 2013 | 18 | - | BC, RF, SC, GB (C5.0) | Acc; AUC; Kappa | 96.11; 99.3; 91.70 | SRP | Article |
Moro et al. (2011) | 2011 | 29 | R/rminer | NB; DT; SVM | AUC; ALIFT | 93.8; 88.7 | EUROSIS-ETI | Article |

Inst. | ${\mathit{f}}_{1}$ | ${\mathit{f}}_{2}$ | ${\mathit{f}}_{3}$ | ${\mathit{f}}_{4}$ | Label |
---|---|---|---|---|---|
${I}_{1}$ | 59 | married | no | regular | no |
${I}_{2}$ | 39 | single | yes | very regular | yes |
${I}_{3}$ | 59 | married | no | irregular | no |
${I}_{4}$ | 41 | divorced | no | unknown | yes |
${I}_{5}$ | 44 | married | no | irregular | yes |

Inst. | ${\mathit{f}}_{1}$ | ${\mathit{f}}_{2}$ | ${\mathit{f}}_{3}$ | ${\mathit{f}}_{4}$ | Label |
---|---|---|---|---|---|
${I}_{1}$ | 1 | married | 0 | 0.5 | 0 |
${I}_{2}$ | 0 | single | 1 | 1 | 1 |
${I}_{3}$ | 1 | married | 0 | 0 | 0 |
${I}_{4}$ | 0.1 | divorced | 0 | 0.5 | 1 |
${I}_{5}$ | 0.25 | married | 0 | 0 | 1 |
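The normalised values in the table above are consistent with min-max scaling of each feature to [0, 1]; a minimal sketch, assuming this is the normalisation used in Section 3.2.4:

```python
def min_max(values):
    """Min-max normalisation: maps each value of a feature column to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# f1 column of the toy example: 59, 39, 59, 41, 44 -> 1, 0, 1, 0.1, 0.25
print(min_max([59, 39, 59, 41, 44]))
```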

| | ${\mathit{f}}_{1}$ | ${\mathit{f}}_{2}$ | ${\mathit{f}}_{3}$ | ${\mathit{f}}_{4}$ |
|---|---|---|---|---|
| ${C}_{j}$ | 0.98 | 0.8 | 0.41 | 0.33 |
| ${C}_{no}$ | 59 | married | 0 | 0.5 |
| ${C}_{yes}$ | 41.33 | single; divorced | 0.33 | 1 |

Model | AUC | Accuracy | ${\mathit{f}}_{1}$-Measure | Processing Time (s) |
---|---|---|---|---|
basic | 78.1% | 90.0% | 57.9% | 0.81 |
with FN | 76.9% | 89.9% | 56.8% | 0.45 |
with FNW | 95.9% | 97.3% | 93.2% | 0.52 |

Ref. | Acc | AUC | ${\mathit{N}}_{\mathit{f}}$ | Best Model | Year |
---|---|---|---|---|---|
Feng et al. (2022) | 89.39% | 89.44% | 21 | META-DES-AAP | 2022 |
Moro et al. (2011, 2014) | NA | 0.938 | 22 | ANN | 2014 |
Elsalamony and Elsayad (2013); Elsalamony (2014) | 90.09% | NA | 17 | DT(C4.5) | 2014 |
Vajiramedhin and Suebsing (2014) | 92.14% | 21 | NA | DT(C4.5) | 2014 |
Grzonka et al. (2016) | 89.4% | NA | 8 | Random Forest | 2016 |
Karim and Rahman (2013) | 93.96% | 0.9334 | NA | DT(C4.5) | 2013 |
Lahmiri (2017) | 71% | 0.59 | 18 | Two-stage system | 2017 |
Koumétio et al. (2018) | 69.1% | 0.55 | 18 | DT | 2018 |
Tekouabou et al. (2019) | 100% | - | 21 | DT | 2019 |
Farooqi and Iqbal (2019) | 91.2% | - | 21 | DT | 2019 |
Selma (2020) | 98.93% | 0.95 | 21 | ANN | 2020 |
Koumétio and Toulni (2021) | 96.91% | 95.9 | 12 | KNN | 2021 |
CMB approach | 97.3% | 95.9 | 18 | CMB | 2022 |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Tékouabou, S.C.K.; Gherghina, Ş.C.; Toulni, H.; Neves Mata, P.; Mata, M.N.; Martins, J.M.
A Machine Learning Framework towards Bank Telemarketing Prediction. *J. Risk Financial Manag.* **2022**, *15*, 269.
https://doi.org/10.3390/jrfm15060269
