Machine Learning (ML) Technologies for Digital Credit Scoring in Rural Finance: A Literature Review
Abstract
1. Introduction
2. Relevant Literature and Motivation of Study
2.1. Traditional Method vs. Digital Method for Credit Assessment
2.2. Fintech and Big Tech Companies Are Using Digital Channels for Providing Specific and Speedy Banking Solutions
2.3. Empirical Analysis of Existing Research on ML Methods Adopted by Various Financial Institutions Worldwide for Credit Scoring
3. Materials and Methods
4. Findings and Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Abuhusain, Maha. 2020. The role of artificial intelligence and big data on loan decisions. Accounting 6: 1291–96.
- Afonso Fontes, G. G. 2021. Using Machine Learning to Generate Test Oracles: A Systematic Literature Review. The 1st International Workshop on Test Oracles. New York: ACM.
- Aliija, Ronald, and Bernard Wakabi Muhangi. 2017. The Effect of Loan Appraisal Process Management on Credit Performance in Microfinance Institutions (MFIs): A Case of MFIs in Uganda. International Journal of Science and Research (IJSR) 6: 2283–9.
- Ampountolas, Apostolos, Titus Nyarko Nde, Paresh Date, and Corina Constantinescu. 2021. A Machine Learning Approach for Micro-Credit Scoring. Risks 9: 50.
- Aniceto, Maisa Cardoso, Flavio Barboza, and Herbert Kimura. 2020. Machine learning predictivity applied to consumer creditworthiness. Future Business Journal 6: 37.
- Antunes, José Américo Pereira. 2021. To supervise or to self-supervise: A machine learning based comparison on credit supervision. Financial Innovation 7: 26.
- Assef, Fernanda M., and Maria Teresinha A. Steiner. 2020. Machine Learning Techniques in Bank Credit Analysis. International Journal of Economics and Management Engineering 14: 517–20.
- Beck, Thorsten. 2020. Fintech and Financial Inclusion: Opportunities and Pitfalls. ADBI Working Paper 1165. Tokyo: Asian Development Bank Institute.
- Bennouna, Ghita, and Mohamed Tkiouat. 2018. Fuzzy logic approach applied to credit scoring for microfinance in Morocco. ScienceDirect 127: 274–83.
- Boughaci, Dalila, and Abdullah Ash-shuayree Alkhawaldeh. 2018. Three local search-based methods for feature selection in credit scoring. Vietnam Journal of Computer Science 5: 107–21.
- Chen, Keqin, Amit Yadav, Asif Khan, and Kun Zhu. 2020. Credit Fraud Detection Based on Hybrid Credit Scoring Model. ScienceDirect 167: 2–8.
- Criado, Natalia, and Jose M. Such. 2019. Digital Discrimination. Algorithmic Regulation, 82–97.
- Eletter, Shorouq Fathi, Saad Ghaleb Yaseen, and Ghaleb Awad Elrefae. 2010. Neuro-Based Artificial Intelligence Model for Loan Decisions. American Journal of Economics and Business Administration 2: 27–34.
- Fairooz, H. M. M., and C. N. Wickramasinghe. 2019. Innovation and Development of Digital Finance: A Review on Digital Transformation in Banking & Financial Sector of Sri Lanka. Asian Journal of Economics, Finance and Management 105: 69–78.
- Goh, R. Y., and Lai Soon Lee. 2019. Credit Scoring: A Review on Support Vector Machines and Metaheuristic Approaches. Advances in Operations Research 2019: 1–30.
- Goh, R. Y., Lai Soon Lee, Hsin-Vonn Seow, and Kathiresan Gopal. 2020. Hybrid Harmony Search-Artificial Intelligence Models in Credit Scoring. Entropy 22: 989.
- Ifft, Jennifer, Ryan Kuhns, and Kevin Patrick. 2018. Can machine learning improve prediction – An application with farm survey data. International Food and Agribusiness Management Review 21: 1083–98.
- Kandpal, Vinay, and Rajat Mehrotra. 2019. Financial Inclusion: The Role of Fintech and Digital Financial Services in India. Indian Journal of Economics & Business 19: 85–93.
- Kumar, Madapuri Rudra, and Vinit Kumar Gunjan. 2020. Review of Machine Learning models for Credit Scoring Analysis. Ingeniería Solidaria 16: 1.
- Leo, Martin, Suneel Sharma, and K. Maddulety. 2019. Machine Learning in Banking Risk Management: A Literature Review. Risks 7: 29.
- Linh, Ta Nhat, Hoang Thanh Long, Le Van Chi, Le Thanh Tam, and Philippe Lebailly. 2019. Access to Rural Credit Markets in Developing Countries, the Case of Vietnam: A Literature Review. Sustainability 11: 1468.
- Mandala, I. Gusti Ngurah Narindra, Catharina Badra Nawangpalupi, and Fransiscus Rian Praktikto. 2012. Assessing Credit Risk: An Application of Data Mining in a Rural Bank. International Conference on Small and Medium Enterprises Development (ICSMED). Elsevier Ltd.: pp. 406–12.
- Munkhdalai, Lkhagvadorj, Tsendsuren Munkhdalai, Oyun-Erdene Namsrai, Jong Yun Lee, and Keun Ho Ryu. 2019. An Empirical Comparison of Machine-Learning Methods on Bank Client Credit Assessments. Sustainability 11: 699.
- O’Neill, Felicity, and Margarete Biallas. 2020. Artificial Intelligence Innovation in Financial Services. International Finance Corporation, a member of the World Bank Group 85: 1–8.
- Ozgur, Önder, Erdal Tanas Karagol, and Fatih Cemil Ozbugday. 2021. Machine learning approach to drivers of bank lending: Evidence from an emerging economy. Financial Innovation 7: 20.
- Pławiak, Paweł, Moloud Abdar, Joanna Pławiak, Vladimir Makarenkov, and U. Rajendra Acharya. 2020. DGHNL: A new deep genetic hierarchical network of learners for prediction of credit scoring. Science Direct 516: 401–18.
- Rafiei, Farimah Mokhatab, and Somayeh Moradi. 2019. A dynamic credit risk assessment model with data mining techniques: Evidence from Iranian banks. Financial Innovation 5: 15.
- Ranjbarfard, Mina, and Shahideh Ahmadi. 2020. A Study of Data Requirements for Data Mining Applications in Banking. Journal of Digital Information Management 18: 109.
- SAFIRA, and Grow Asia. 2019. Digital Credit Scoring in Agriculture: Best Practices of Assessing Credit Risks in Value Chains, Digital Credit Scoring for Agribusiness. Available online: http://exchange.growasia.org/system/files/GA_Digital%20Scoring%20Guide_Double.pdf (accessed on 7 July 2021).
- Sánchez, José Francisco Martínez, and Gilberto Pérez Lechuga. 2016. Assessment of a credit scoring system for popular bank savings and credit. Contaduría y Administración 61: 391–417.
- Spicka, Jindřich, Thomas Hlavsa, Katerina Soukupova, and Marie Stolbova. 2019. Approaches to estimation the farm-level economic viability and sustainability in agriculture: A literature review. Agricultural Economics (Zemědělská ekonomika) 65: 289–97.
- Stephens, Bryce, and Nicholas Schmidt. 2019. An Introduction to Artificial Intelligence and Solutions to the Problems of Algorithmic Discrimination. Algorithmic Discrimination, Quarterly Report arXiv:1911.05755.
- Wijewardhana, Udnai, Chinthaka Bandara, and Thesath Nanayakkara. 2018. A Mathematical Model for Predicting Debt Repayment: A Technical Note. Australasian Accounting, Business, and Finance Journal 12: 127–35.
- Yu, Lean, Xinxue Li, Ling Tang, Zongyi Zhang, and Gang Kou. 2015. Social credit: A comprehensive literature review. Financial Innovation 1: 6.
- Zeng, Yiwu, Fu Jia, Li Wan, and Hongdong Guo. 2017. E-commerce in the agri-food sector: A systematic literature review. International Food and Agribusiness Management Review 20: 439–60.
- Zhu, Shuzhen, Yutao Chen, Wenwen Wang, and Y. Wu. 2020. Risk Assessment of Biological Asset Mortgage Loans of China’s New Agricultural Business Entities. Complexity 2020: 1–12.
Author (from Reference List) | Year | Country or Financial Institution | Credit Scoring Techniques Followed | Datasets/Variable Used | Key Findings or Recommendations |
---|---|---|---|---|---|
Fernanda M. Assef, Maria Teresinha A. Steiner | 2020 | Brazilian financial institution | Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Logistic Regression (LR) and Support Vector Machines (SVM) | 5432 companies (2600 non-defaulting clients, 1551 defaulters, and 1281 temporary defaulters) | Hybrid techniques are suggested for better credit risk assessment results
Somayeh Moradi and Farimah Mokhatab Rafiei | 2019 | Iranian banks | Fuzzy Logic | Behavioral features of banking customers during special political and economic conditions | A few qualitative predictors like accountability, commitment, honesty, reputation, and ethics should also be added for the risk analysis |
José Francisco Martínez Sánchez, Gilberto Pérez Lechuga | 2016 | Mexican financial system/SOFIPO | NPV, IRR, and payback period | Banking infrastructure and human capital for credit risk assessment | Evaluation of credit scoring system in terms of cost-efficiency, for the finance companies’ community SOFIPOs |
Ghita Bennouna, Mohamed Tkiouat | 2018 | Morocco (microfinance institutions) | Fuzzy Logic | Client behavior history at microfinance institutions (descriptive variables, behavioral variables, and variables characterizing the loans contracted) | Evaluates customer behavior with a fuzzy logic approach to reduce loan default
Maisa Cardoso Aniceto, Flavio Barboza and Herbert Kimura | 2020 | Brazilian bank | AdaBoost and Random Forest models, compared against a Logistic Regression benchmark | Database of 124,624 consumer loans and their repayment schedules from a large Brazilian financial institution | Random Forest and AdaBoost perform better than the other ML models at classifying borrower adequacy
I Gusti Ngurah Narindra Mandala, Catharina Badra Nawangpalupi, Fransiscus Rian Praktikto | 2012 | Rural bank (Bank Perkreditan Rakyat), Indonesia | Decision Tree model (data mining methodology) | Variables such as gender, collateral type, source of funds, and business activity, used for credit risk assessment | Identifies the critical factors for a rural bank (Bank Perkreditan Rakyat) in assessing a credit application
Dalila Boughaci, Abdullah Ash-shuayree Alkhawaldeh | 2018 | Vietnam | Local search (LS), stochastic local search (SLS), and variable neighborhood search (VNS) for feature selection, combined with an SVM classifier | German and Australian credit datasets | Future research should examine how these feature selection methods perform with other machine-learning techniques for credit scoring
Ronald Aliija, Bernard Wakabi Muhangi | 2017 | Uganda/microfinance institutions | Linear Regression | 38 loan officers and six credit managers in six microfinance institutions in Fort Portal municipality, Western Uganda | Examines the challenges faced by credit officers at the loan appraisal stage
Onder Ozgur, Erdal Tanas Karagol and Fatih Cemil Ozbugday | 2021 | Turkey | Six ML techniques (Tree Regression, Bagging, Boosting, Random Forest, Extra-Trees, and XGBoost) compared with standard Linear Regression | Data on 19 deposit banks in Turkey for 2002Q4–2019Q2, with nine bank-specific variables, seven macroeconomic indicators, and three global factors used to determine bank lending behavior | The Random Forest model has the lowest prediction error
Paweł Pławiak, Moloud Abdar, Joanna Pławiak, Vladimir Makarenkov, U Rajendra Acharya | 2020 | - | Genetic Algorithm | Statlog German credit approval data (1000 instances: 700 accepted/good applicants and 300 rejected/bad applicants) | The proposed Deep Genetic Hierarchical Network of Learners (DGHNL) model, with a 29-layer structure, achieves a prediction accuracy of 94.60%
Rui Ying Goh, Lai Soon Lee, Hsin-Vonn Seow and Kathiresan Gopal | 2020 | - | Hybrid Model (HS-SVM and HS-RF) | German and Australian data sets which are publicly available at the UCI repository (https://archive.ics.uci.edu/) | A Modified Harmony Search (MHS) model is proposed to achieve comparable results for credit scoring |
Credit Scoring Method | Type | No. of Articles Referred |
---|---|---|
ANN | AI method | 3 |
SVM | AI method | 3 |
Decision Tree | AI method | 2 |
Logistic Regression | Econometric | 4 |
GA | AI method | 1 |
Fuzzy Logic | AI method | 2 |
Random Forest | AI method | 3 |
XGBoost | AI method | 1 |
Descriptive analytical approach | Econometric | 1 |
Hybrid model | Hybrid system | 2 |
Linear Regression | Mathematical/Statistical | 1 |
Theoretical/Subjective judgement/Other | Expert system | 2 |
Credit Scoring Technique | 1970 | 1980 | 1990 | 2000 | 2010 | 2020–2021 |
---|---|---|---|---|---|---|
Expert System (based on 5Cs) | | | | | | |
Linear Programming | | | | | | |
Logistic Regression | | | | | | |
AI-ML-Based | | | | | | |
Genetic Algorithm (GA) | | | | | | |
Hybrid Model (AI + AI) OR (AI + other) | | | | | | |
Popular Author/Researcher | Credit Scoring Technique Studied/Employed | Year | Studied on |
---|---|---|---|
Chatterjee and Barcun | KNN | 1970 | Individual credit risk estimation |
Henley and Hand | KNN | 1997 | Individual credit risk estimation |
Rivoli and Brewer | Logistic Regression | 1998 | Credit risk estimation |
Mangasarian | Linear Programming | 1965 | Prediction classification |
Altman et al. | Logistic Regression | 1980 | Credit risk estimation for SMEs |
Goovaerts and Steenackers | Logistic Regression | 1989 | Personal credit scoring |
Tam and Kiang | ANN | 1992 | Bankruptcy prediction |
Desai et al. | ANN | 1996 | Individual credit risk estimation |
Lee et al. | CART and MARS | 2006 | Individual credit risk estimation |
Desai et al. | GA | 1997 | Individual credit risk estimation |
Huang et al. | Two-stage genetic programming | 2006 | Individual credit risk estimation |
Chen et al. | Hybrid SVM and three strategies | 2009 | Individual credit risk estimation |
Jacky | Machine Learning | 2018 | Credit fraud detection |
Keqin Chen et al. | Hybrid (Logistic Regression and Evidence Weight) | 2020 | Individual credit risk estimation |
Rui Ying Goh et al. | Hybrid model (HS-SVM and HS-RF) | 2020 | Individual credit risk estimation |
Comparative Analysis of Credit Scoring Techniques (Score = Weight × Rating)
Parameter | Weight | ANN Rating | ANN Score | SVM Rating | SVM Score | RF/XGBoost Rating | RF/XGBoost Score | Logistic Regression Rating | Logistic Regression Score | GA Rating | GA Score | Hybrid Model Rating | Hybrid Model Score |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy | 0.30 | 4 | 1.2 | 4 | 1.2 | 5 | 1.5 | 3 | 0.9 | 4 | 1.2 | 4 | 1.2 |
Performance | 0.30 | 4 | 1.2 | 3 | 0.9 | 5 | 1.5 | 3 | 0.9 | 4 | 1.2 | 4 | 1.2 |
Robustness | 0.20 | 3 | 0.6 | 3 | 0.6 | 3 | 0.6 | 3 | 0.6 | 4 | 0.8 | 5 | 1 |
Volume of Data | 0.20 | 3 | 0.6 | 3 | 0.6 | 3 | 0.6 | 2 | 0.4 | 3 | 0.6 | 5 | 1 |
Total | 1.00 | | 3.6 | | 3.3 | | 4.2 | | 2.8 | | 3.8 | | 4.4 |
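The totals in the comparison table above follow from a simple weighted-sum scoring model: each technique's rating on a parameter is multiplied by that parameter's weight, and the products are summed. The sketch below reproduces that arithmetic in Python. The weights and ratings are transcribed from the table; the names `WEIGHTS`, `RATINGS`, `weighted_total`, and `rank_techniques` are hypothetical and introduced here only for illustration, and the rounding to two decimals is an assumption made to avoid floating-point noise.

```python
# Weights per parameter, transcribed from the comparison table.
WEIGHTS = {
    "Accuracy": 0.30,
    "Performance": 0.30,
    "Robustness": 0.20,
    "Volume of Data": 0.20,
}

# Ratings (1-5 scale) per technique, transcribed from the comparison table.
RATINGS = {
    "ANN": {"Accuracy": 4, "Performance": 4, "Robustness": 3, "Volume of Data": 3},
    "SVM": {"Accuracy": 4, "Performance": 3, "Robustness": 3, "Volume of Data": 3},
    "RF/XGBoost": {"Accuracy": 5, "Performance": 5, "Robustness": 3, "Volume of Data": 3},
    "Logistic Regression": {"Accuracy": 3, "Performance": 3, "Robustness": 3, "Volume of Data": 2},
    "GA": {"Accuracy": 4, "Performance": 4, "Robustness": 4, "Volume of Data": 3},
    "Hybrid Model": {"Accuracy": 4, "Performance": 4, "Robustness": 5, "Volume of Data": 5},
}


def weighted_total(ratings, weights):
    """Sum of weight * rating over all parameters (hypothetical helper)."""
    # Rounding to two decimals is an assumption to hide float noise.
    return round(sum(weights[p] * ratings[p] for p in weights), 2)


def rank_techniques(ratings_by_technique, weights):
    """Return (technique, total) pairs sorted from highest to lowest total."""
    totals = {
        name: weighted_total(ratings, weights)
        for name, ratings in ratings_by_technique.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank_techniques(RATINGS, WEIGHTS):
        print(f"{name}: {total}")
```

Under these inputs the hybrid model ranks first and logistic regression last, matching the Total row of the table. Because the ranking depends entirely on the chosen weights and the subjective 1-5 ratings, changing either can reorder the techniques.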
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kumar, A.; Sharma, S.; Mahdavi, M. Machine Learning (ML) Technologies for Digital Credit Scoring in Rural Finance: A Literature Review. Risks 2021, 9, 192. https://doi.org/10.3390/risks9110192