Financial Inclusion in Emerging Economies: The Application of Machine Learning and Artificial Intelligence in Credit Risk Assessment
Int. J. Financial Stud. 2021, 9, 39

Abstract: In banking and finance, credit risk is among the most important topics because issuing a loan requires careful assessment of the likelihood that the money will be repaid. At the same time, in emerging markets, underbanked individuals often cannot provide the traditional forms of collateral or identification that financial institutions require before granting loans. Using a literature review approach based on documentary and conceptual analysis to investigate the impact of machine learning and artificial intelligence on credit risk assessment, this study found that artificial intelligence and machine learning have a strong impact on credit risk assessment by using alternative data sources, such as public data, to address the problems of information asymmetry, adverse selection, and moral hazard. This allows lenders to carry out rigorous credit risk analysis, assess customer behaviour, and subsequently verify clients' ability to repay their loans, which permits less privileged people to access credit. Therefore, this study recommends that financial institutions such as banks and credit lending institutions invest more in artificial intelligence and machine learning to ensure that financially excluded households can obtain credit.


Introduction
In banking and finance, credit risk is among the most important topics because issuing a loan requires careful assessment of the likelihood of getting the money back (Danėnas and Garšva 2010). Attention and care should be exercised in practice to avoid unnecessary losses. When issuing loans, financial institutions should evaluate the current and historical state of the debtor (Danėnas and Garšva 2010). According to Gu et al. (2018), credit risk can take various forms, and it always depends on the type of debtor and the class of the financial instrument. For instance, a government, a private company, and an individual are different debtors with different characteristics (Danėnas and Garšva 2010; Gu et al. 2018). Likewise, issuing a loan and trading financial derivatives are completely different from each other. As a result, many different statistical, mathematical, and intelligent models are used to predict and analyse risk (Gu et al. 2018). These techniques are chosen according to the circumstances surrounding the scoring and the evaluation of the probability of default. Biallas and O'Neill (2020) argued that in emerging markets underbanked individuals, especially women, the youth, and small businesses, cannot access the traditional forms of collateral or identification required by financial institutions such as banks. The use of alternative data sources, such as public data, satellite images, company registries, and social media data such as SMS and messenger service interaction data, allows artificial intelligence (AI) to help lenders assess consumer behaviour and subsequently verify clients' ability to repay loans (Biallas and O'Neill 2020). The broad application of AI in the financial sector in emerging markets has relied on the analysis of alternative data points and real-time behaviour to be effective.
It is believed that the use of AI has improved credit decisions, improved the identification of threats to financial institutions, and helped to meet compliance obligations and address the financing gaps faced by businesses in emerging markets (Biallas and O'Neill 2020). AI and machine learning are becoming increasingly important in finance and credit risk. The idea behind AI is to simulate human intelligence and thinking through mathematical modelling techniques. The development of new models and algorithms in machine learning, one of the branches of AI, is transforming the field of finance and credit risk, and new machine learning techniques continue to be developed and applied in credit risk. Machine learning is particularly helpful because credit risk involves the collection of data that must be analysed, tested, and processed accurately. Gui (2019) stated that it is important for financial institutions to have a risk prediction model so that they can identify and predict the characteristics of individuals with a higher probability of defaulting on loans. Gui (2019) also indicated that robust machine learning models are critical because they allow not only banks but also clients to know which behaviour may damage their credit scores. Breeden (2020) highlighted that machine learning now dominates many industries and is increasingly being applied in credit scoring and credit risk management. Breeden (2020) indicated that the use of machine learning techniques comes with risks, so research should now focus on how to use machine learning models in a regulatory-compliant business context. Through a survey, Breeden (2020) reviewed a range of machine learning methods and their application in areas of credit risk. Also, a study by Nyoni and Matshisela (2018) highlighted that credit mitigation is one of the areas of interest, especially after the 2007-2008 global financial crisis.
The study by Nyoni and Matshisela (2018) suggests that a great deal of the data collected by various companies can be used to determine the creditworthiness of a company or an individual through the application of machine learning techniques. Using various machine learning techniques in a comparative analysis, the study showed that Lasso regression gives the best estimation of default, with an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.8048, followed by the Random Forest model with an AUROC of 0.7869. The results further revealed that the commonly used logit model, with an AUROC of 0.7678, performed better than the Support Vector Machine, with an AUROC of 0.7581. Chow (2017) also used different machine learning techniques to learn the relationship between a company's current state and its fate. The results indicated that it was possible to achieve 95% accuracy in predicting whether a company will go bankrupt using machine learning techniques, compared with using pure financial factors. With pure financial factors, the correlation was weaker than with machine learning methods. Caporale et al. (2016) also estimated a reduced-form model to assess the credit risk of general (non-life) insurance firms in the United Kingdom. Using a large sample of 515 firms, the results revealed that macroeconomic and firm-specific factors play important roles. The results also revealed that credit risk varies from one firm to another depending on the type of business. The current study builds on the findings and approaches of the above studies, but it is unique in that it combines AI and machine learning in assessing how these applications can help lenders in credit risk analysis to achieve the financial inclusion of less privileged people.
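The AUROC figures quoted above measure how often a model ranks a defaulter as riskier than a non-defaulter. As a minimal sketch of how such a figure can be computed, the following Python function uses the pairwise (Mann-Whitney) formulation. The borrower labels and the scores of `model_a` and `model_b` are hypothetical and are not taken from any of the cited studies.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 = defaulted, 0 = repaid; scores: predicted default risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    # Count defaulter/non-defaulter pairs that are ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores from two models on the same six borrowers.
labels = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6]  # ranks every defaulter above every repayer
model_b = [0.9, 0.8, 0.3, 0.4, 0.1, 0.6]  # misranks some pairs

print(auroc(labels, model_a), auroc(labels, model_b))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why small differences such as 0.8048 against 0.7869 are read as differences in discriminatory power.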
Therefore, the objective of the current study, taking insights from the findings of other studies, is to investigate the impact of machine learning and AI on credit risk assessment, with the overall intention of assessing whether the use of artificial intelligence and machine learning in credit risk analysis can help to improve financial inclusion. The rest of the paper is organised as follows. Section 2 describes the background of financial inclusion, gives an overview of AI and machine learning, briefly reviews the theories of financial inclusion and credit risk analysis (information asymmetry and credit risk, the adverse selection theory, and the moral hazard theory), and presents the empirical literature review. The methodology and the discussion of the findings are given in Section 3. The last section gives the conclusion of the study.

Background of Financial Inclusion
There are many definitions of financial inclusion from different scholars, as reflected in the literature (Mhlanga 2020). Leeladhar (2005) described financial inclusion as the process by which banking services are delivered in a manner that makes them affordable to many sections of the disadvantaged groups, especially low-income earners. Thorat (2007) defined financial inclusion as the provision of financial services at an affordable rate by formal financial institutions to disadvantaged groups. Sarma (2008) defined financial inclusion as the process of ensuring ease of access to, availability of, and usage of formal financial services for all people in the economy. Arun and Kamath (2015) also highlighted that financial inclusion should be viewed as a situation where people have access to financial services and products of good quality that are affordable and convenient, with dignity for all clients. Some scholars have also put forward definitions of financial exclusion, and in some instances the terms financial inclusion and financial exclusion are used side by side (Mhlanga 2020). Leyshon and Thrift (1995) described financial exclusion as the circumstances that prevent people and various social groups from having access to the formal financial system. Sinclair (2001) also described financial exclusion as a condition where people are not able to access the necessary financial services appropriately. Carbó et al. (2005) defined financial exclusion as the inability or reluctance of groups of people in a society to access mainstream financial services.
Mohan (2006) defines financial exclusion as all the circumstances that leave groups or segments of society unable to access low-cost, fair, and safe financial products and services from formal financial service providers. According to Mhlanga (2020), the European Commission (EC) defined financial exclusion as the situation where individuals face challenges in accessing and using financial services and products that can help satisfy their needs and wants and allow them to live a normal social life in their communities. The EC also stated that the features of a financial product or service can act as constraints on the access to and use of financial services. Other issues that were highlighted were the laws governing the selling and use of these financial services (Mhlanga 2020). The importance of financial inclusion started to develop in the literature in the early 2000s owing to the reality of financial exclusion and its direct and indirect impact on developmental issues such as food security, inequality, and poverty (Levine 2005). The United Nations (UN) also came up with goals of financial inclusion, such as access to financial services and products at an affordable cost for every individual in a community, and the creation of safe and sound institutions with clear regulations and industry performance standards. Other goals of financial inclusion are the maintenance of financial and institutional sustainability and the assurance of competition, so that a variety of affordable products and services is available to clients (Mhlanga 2020).

AI, Machine Learning, a Brief Overview
The term machine learning was coined in 1959 by Arthur Samuel (Gui 2019). Arthur Samuel described machine learning as the field of study that allows computers to learn without being explicitly programmed (Gui 2019). Machine learning trains computers to observe previous data in order to make decisions about future data. Machine learning makes use of algorithms to solve problems, and no algorithm is always better than another, as it is difficult to find a single algorithm that can solve all problems. It is therefore advisable to have a variety of algorithms for different problems. Gui (2019) believes that it is always better to evaluate different algorithms before applying the best one. Ullah et al. (2020) stated that machine learning can be divided into three classes: supervised learning, unsupervised learning, and reinforcement learning. Supervised machine learning algorithms use a labelled training data set to make predictions of the output values (Sousa et al. 2016; Ullah et al. 2020). In supervised learning, a data set of input values and target values is used to train the AI network to establish a function that maps inputs to outputs. Ullah et al. (2020) also argued that supervised learning is further divided into regression and classification. Some common examples of supervised learning are linear regression, support vector machines, and random forests. Unsupervised machine learning algorithms use data with unknown possible outputs (Lynn et al. 2019; Mhlanga 2021; Yigitcanlar et al. 2020). Danėnas and Garšva (2010) argued that there are many machine learning and artificial intelligence techniques that can offer solutions to various problems, and that these can be grouped into "classification, clustering, rule extraction, optimization and expert knowledge extraction techniques".
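To make the idea of supervised classification concrete, the following is a minimal sketch of logistic regression fitted by batch gradient descent, written in plain Python. The two features, the six labelled borrowers, and the learning-rate and epoch settings are illustrative assumptions and do not come from the studies cited above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this borrower: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_default(w, b, x):
    """Predicted probability of default for feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical labelled data: [debt-to-income ratio, missed payments last year]
X = [[0.1, 0], [0.2, 0], [0.3, 1], [0.6, 2], [0.7, 3], [0.9, 4]]
y = [0, 0, 0, 1, 1, 1]  # 1 = defaulted, 0 = repaid

w, b = fit_logistic(X, y)
low = predict_default(w, b, [0.15, 0])
high = predict_default(w, b, [0.8, 3])
print(round(low, 3), round(high, 3))
```

The labels `y` are what make this supervised: the algorithm is shown the known outcome for each borrower and learns a mapping from inputs to that outcome.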
Danėnas and Garšva (2010) went on to argue that the most widely applied machine learning methods for data mining are clustering and classification. Ullah et al. (2020) also stated that in unsupervised learning no guidance is available; non-labelled and non-classified input data sets are provided and used to train AI networks to discover hidden patterns, answers, and distributions. Some examples of unsupervised learning are the k-means and auto-encoder algorithms. Classification is mainly responsible for mapping inputs to possible classes, which is critical in the evaluation of current and predicted values. Danėnas and Garšva (2010) went on to argue that even though there are many machine learning techniques, the most commonly applied machine learning tools in credit risk evaluation and bankruptcy prediction are discriminant analysis, logistic regression, neural networks, and evolutionary computing, among many others. Strusani and Houngbonon (2019) stated that AI, on the other hand, combines large volumes of data with massive computing power to simulate human cognitive abilities such as reasoning, perception, spatial processing, vision, and language. It has been argued that the performance of AI has been improved by the new generation of algorithms labelled machine learning. Strusani and Houngbonon (2019) stated that machine learning algorithms are built from data and that the quality of the data set in most cases influences the performance of these algorithms.
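The k-means algorithm mentioned above can be sketched in a few lines. The version below is a minimal illustration of Lloyd's algorithm with fixed starting centroids. The six unlabelled borrowers and their two features are hypothetical, and a production implementation would also handle random initialisation and convergence checks.

```python
def kmeans(points, centroids, iters=20):
    """Lloyd's algorithm: alternate between assignment and centroid update."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            idx = min(
                range(len(centroids)),
                key=lambda k: sum((a - c) ** 2 for a, c in zip(p, centroids[k])),
            )
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabelled borrowers: (monthly mobile top-ups, average balance)
points = [(2, 10), (3, 12), (2, 11), (20, 90), (22, 95), (21, 88)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
```

No default labels are supplied here. The algorithm only groups borrowers with similar behaviour, which is the sense in which unsupervised learning discovers hidden patterns.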

The Theories of Financial Inclusion and Credit Risk Analysis, a Brief Review: Information Asymmetry and Credit Risk
Information asymmetry is a phenomenon studied in contract theory and economics, which deals with decisions in transactions where one party has more information than the other. In its own right, information asymmetry creates an imbalance of power in transactions, which makes transactions inefficient and can result in market failure. Information asymmetry generates two problems: ex-post moral hazard and ex-ante adverse selection.

The Adverse Selection Theory
Adverse selection theory emanated from the work of Akerlof (1970), who argued that the presence of information asymmetry causes adverse selection. Adverse selection is usually viewed as a problem that arises before the signing of a contract between the borrower and the lender, because one economic agent has more information than the other. The work of Stiglitz and Weiss (1981) initiated the theory of adverse selection in the credit market. In this theory it is assumed that credit providers have difficulty differentiating borrowers according to their risk profiles, and that credit contracts are characterised by limited liability, so that when debt commitments are greater than the returns of the project the borrower has no responsibility to pay out of pocket (Karlan and Zinman 2009; Mhlanga 2020). Under these circumstances, where the borrower is characterised by unobservable features, adverse selection will manifest itself strongly (Karlan and Zinman 2009). According to Hellwig (1987), credit providers can deal with this risk through two methods, one direct and one indirect. In the direct method the credit provider directly scrutinises the characteristics of the borrower, while in the indirect method the credit provider sets the terms and conditions of the loan so as to reduce credit risk by allowing only low-risk borrowers to take the loan. The most common method used is the pledging of collateral security by the borrower, such as the title deeds of a property (Eaton and Gersovitz 1981). The pledging of collateral security makes it difficult for disadvantaged people, such as small businesses, the youth, and women, to access credit, as they are viewed as high-risk borrowers when this method is applied.
According to Eaton and Gersovitz (1981), separating bad-risk borrowers from good borrowers is a mammoth task for many lenders, which automatically pushes other potential borrowers out of the mainstream credit market. Eaton and Gersovitz (1981) went on to argue that many lenders do not have a proper method for separating good borrowers from bad-risk borrowers that does not also exclude other potential borrowers from the market.
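The pricing problem behind adverse selection can be illustrated with a small worked example. The sketch below is a stylised two-type calculation, not the model of Stiglitz and Weiss (1981) itself. The repayment probabilities, the share of safe borrowers, the funding cost, and the assumption of zero recovery on default are all illustrative.

```python
def break_even_rate(repay_prob, funding_cost):
    """Interest rate at which expected repayment just covers the lender's cost of funds.

    Solves repay_prob * (1 + r) = 1 + funding_cost for r, assuming no recovery on default.
    """
    return (1 + funding_cost) / repay_prob - 1


# Hypothetical two-type market; all numbers are illustrative assumptions.
p_safe, p_risky = 0.95, 0.75  # repayment probabilities
share_safe = 0.6              # fraction of applicants who are safe
funding_cost = 0.05

r_safe = break_even_rate(p_safe, funding_cost)
r_risky = break_even_rate(p_risky, funding_cost)

# If the lender cannot tell the types apart, it must price on the average borrower.
p_pool = share_safe * p_safe + (1 - share_safe) * p_risky
r_pool = break_even_rate(p_pool, funding_cost)

print(f"safe {r_safe:.3f}, pooled {r_pool:.3f}, risky {r_risky:.3f}")
```

Under these assumptions the pooled rate lies above the rate that safe borrowers would be charged if their type were observable. If safe borrowers respond by leaving the market, the remaining pool becomes riskier, which is the mechanism by which good borrowers are excluded.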

Moral Hazard Theory
Moral hazard is a problem mainly caused by ex-post information asymmetry, that is, asymmetry that arises after the contract has been signed. The problem arises because agents are unable to observe the actions of other agents. The credit risk market can best be explained by the moral hazard theory. The theory explains what happens if the borrower, or the insured in the case of the insurance market, has significant control that the lender or insurer does not have (Berhanu 2005). Moral hazard also reflects the fact that the performance of a project, in terms of payoffs, rests on the behaviour and actions of the borrower. In insurance, the insured has control over the risk that was insured by the insurer. In ordinary projects, the borrower has control over the effort applied to the project, the inputs used, and the quality of the labour force recruited, among many other factors that affect the performance of the project. When the lender lends money to the borrower, the expectation is that the borrower will generate enough money to repay the loan, but this depends on the actions of the borrower, which the lender cannot observe after signing the contract, and this generates the risk of moral hazard (Boot and Thakor 1994; Mhlanga 2020). Information asymmetry leads the borrower to take actions that will not optimise the returns of the project, especially where collateral security is not present. Mohiuddin (1993) stated that, to avoid moral hazard, lenders tie credit and savings together by creating built-in mechanisms for an emergency fund to handle unforeseen shocks. In the process, this limits those who wish to borrow, owing to limited information on the possible returns from the project (Berhanu 2005). As a result, some economic agents will not be part of the credit market, and hence the problem of financial exclusion arises.

Empirical Literature Review
There is an empirical literature on the application of machine learning to credit default. The literature provides a wide range of factors that can influence credit default, for instance, Lynn et al. Caporale et al. (2016) considered many determinants of credit risk. The empirical results from the study revealed that macroeconomic and firm-specific factors were important in influencing credit risk. The study also discovered that credit risk varies from one firm to another depending on the line of business. The other factor that was discovered was differences in the reinsurance levels of the firms.
A study by Galindo and Tamayo (2000) highlighted that the risk assessment of financial intermediaries is one of the areas that should be focused on because of financial crises such as those of the 1980s and 1990s. Galindo and Tamayo (2000) also indicated that the proper estimation of risk in global financial models can help in the efficient use of resources, but asserted that the only way to achieve this is by finding dependable predictors of individual risk in credit portfolios. Using various machine learning classification methods on a mortgage loan data set, the study found that decision tree models provide the best estimation of default, with an average error rate of 8.13%. The results also revealed that neural networks provided the second-best results, with an average error of 11%, followed by the K-Nearest Neighbour algorithm and the Probit algorithm. Danėnas and Garšva (2010) highlighted that support vector machines, which are similar to neural networks, are examples of machine learning techniques that are becoming very popular in research, as they are widely used in industrial applications and can be used in many statistical and intelligent fields such as credit risk analysis, pattern recognition systems, and others. Danėnas and Garšva (2010) went on to highlight that machine learning techniques are critical in different fields such as bioinformatics, text and document classification, pattern recognition, image recognition, and credit risk default. Another study by Lynn et al. (2019) suggested that digital technologies are transforming access to finance for small businesses. Lynn et al. (2019) went on to state that online peer-to-peer lending, as one form of crowdfunding, is connecting borrowers and lenders. However, information asymmetry is a critical variable in online lending, which can cause moral hazard or adverse selection and thereby affect the viability and performance of individual platforms.
Tfaily (2017) presented the current issues related to information asymmetry and credit risk analysis. Tfaily (2017) argued that banking risk is a phenomenon that is always present in banking companies, and that it usually represents the uncertainty of attaining some level of profit or even the probability of loss. The multitude of operations and procedures involved influences bank risk. Tfaily (2017) also stated that bank risks are changing in complexity every day: beyond the traditional risks, other risks are now affecting financial institutions, such as financial risks, operational risks, strategic risks, country risks, human risks, and even fraud risks. Saito and Tsuruta (2018) investigated the presence of moral hazard or adverse selection in credit guarantee schemes for small and medium enterprises (SMEs) in Japan. Saito and Tsuruta (2018) discovered that most credit guarantee firms have serious problems in differentiating low-risk from risky borrowers; as a result, they tend to attract a large proportion of risky borrowers, which results in inefficient resource allocation. Using bank-level data to assess whether the default rate is positively associated with the ratio of guaranteed loans to total loans, the study discovered that adverse selection and moral hazard were consistent with the data. Yin et al. (2020) discovered that loan application assessments of SMEs are difficult because of information asymmetry. Yin et al. (2020) went on to find that, to reduce information asymmetry, it is possible to use legal judgements involving the company and its principals and to combine them with financial and firm-specific information to assist the evaluation of the credit risk of SMEs. As a result, Yin et al. (2020) proposed a framework that helps to identify legal judgements that are effective in the prediction of credit risk and to extract the relevant information contained in these judgements.
Yin et al. (2020) found that the features contained in effective legal judgements significantly improve the discrimination performance and the granting performance of the developed model compared with the baseline model, which uses financial and firm-specific features alone. Asongu and Odhiambo (2020) assessed the relevance of decreasing information asymmetry for life and non-life insurance consumption using data from 48 African countries for the period 2004-2014. Using the generalised method of moments, the study discovered that information sharing offices increase insurance consumption, with a comparatively higher magnitude for life insurance penetration relative to non-life insurance penetration. Wu et al. (2020) assessed the constraints that SMEs face from cash shortages in operations and cost uncertainty owing to the risks of information asymmetry. By adopting the distributor's perspective and applying a credit guarantee mechanism with an incentive contract as one of the risk management tools, the study discovered that the distributor could adopt incentive contracts to reveal the type of its supplier. It was also established that the higher the inefficient supplier's contribution to the distributor, the smaller the gap between procurement and contract quantities with the inefficient supplier.

Financial inclusion is gaining attention in the policy world in Africa, which has allowed many studies on the African continent to emerge, for instance (Mhlanga 2020; Mhlanga and Denhere 2021; Ndanshau and Frank 2021; Okoroafor et al. 2018; Ozili 2020). Mhlanga and Denhere (2021) investigated the determinants of financial inclusion in Southern Africa. Using logistic regression, the study discovered that financial inclusion is driven by many factors, including age, education level, income, race, gender, and marital status. Ozili (2020) discovered that financial inclusion in Europe is driven by allowing access to credit markets, to increase the number of individual borrowers in the credit market and to make sure that the credit market is stable. Ozili (2020) observed that the extent of access to the credit market differs across European nations. A study by Sinclair (2001) concluded that access to mainstream banking services for low-income customers was a problem. Sinclair (2001) also noted that many low-income customers do not have access to affordable credit; as a result, there has been some controversy as to whether Britain is denying lower-income customers access to banking services or whether banks were moving away from deprived communities. Ozili (2020) also pointed out that in some Asian and Australian nations financial inclusion is a policy priority, and many of these nations adopted the UK model of financial inclusion. In the Middle East and North Africa (MENA) countries, the goal of financial inclusion is to ensure that low-income populations have full access to the financial market.
Bussmann et al. (2021) proposed an explainable AI model that can be used in credit risk analysis and management, particularly when credit is borrowed on peer-to-peer lending platforms. The model that Bussmann et al. (2021) proposed applies correlation networks to Shapley values so that AI predictions are grouped according to the similarity of their underlying explanations. Using this model, Bussmann et al. (2021) analysed 15,000 small and medium-sized firms asking for credit and found that both risky and non-risky borrowers can be grouped according to their financial characteristics. Bussmann et al. (2021) believe that with this model it is possible to explain the credit scores of borrowers and subsequently to predict their future behaviour. Punniyamoorthy and Sridevi (2016) also stated that credit risk assessments have gained a lot of attention in recent years, motivated by the global financial crisis and credit crunch. As a result, various financial institutions seek the support of credit rating agencies to come up with predictions of the ability of the creditor to meet financial obligations. Therefore, Punniyamoorthy and Sridevi (2016) prepared a neural network (NN) and a fuzzy support vector machine (FSVM) to discriminate between good creditors and bad creditors and to identify the best classifier for credit risk assessments. They discovered that the FSVM model performs better than the backpropagation neural network. Moscatelli et al. (2020) presented an analysis of the performance of machine learning models in predicting default risk. The study used standard statistical models such as logistic regression as a benchmark. The study discovered that machine learning models give meaningful gains in discriminatory power and precision compared with statistical models. The benefits of these models diminish when confidential information such as credit behavioural indicators is also available, and they become negligible when the data set is small. Moscatelli et al. (2020) also assessed the consequences of using a credit allocation rule based on machine learning ratings for the overall supply of credit and the number of borrowers gaining access to credit.
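Shapley values, on which the explainable model of Bussmann et al. (2021) builds, attribute a prediction to individual features by averaging each feature's marginal contribution over all orderings in which features can be revealed. The following is a minimal sketch of the exact computation for a toy scoring function. The function `risk_score`, its three feature names, and its coefficients are hypothetical, and the correlation-network grouping step of the cited model is not reproduced here.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(present)
            present.add(f)
            contrib[f] += value_fn(present) - before
    return {f: total / len(orderings) for f, total in contrib.items()}


def risk_score(present):
    """Hypothetical predicted default risk given the set of known features."""
    score = 0.10  # baseline risk with no information
    if "leverage" in present:
        score += 0.20
    if "late_payments" in present:
        score += 0.30
    if {"leverage", "late_payments"} <= present:
        score += 0.10  # interaction between the two features
    return score


phi = shapley_values(risk_score, ["leverage", "late_payments", "firm_age"])
print(phi)
```

The attributions sum to the difference between the full prediction and the baseline, and the interaction term is split equally between the two features that produce it. Exact enumeration grows factorially with the number of features, so practical tools rely on approximations.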
Machine learning models were shown to help lenders shift credit towards safer and larger borrowers, which results in lower credit losses for lenders. Bhatore et al. (2020) also stated that credit risk is the financial loss that lenders bear when borrowers fail to meet their financial commitments. Bhatore et al. (2020) believe that many factors constitute credit risk, so due diligence should be applied when issuing loans, along with other initiatives such as the continuous monitoring of customers' payments and other behaviour patterns, to reduce the number of non-performing assets. Bhatore et al. (2020) went on to argue that, despite the presence of various credit rating agencies, researchers are exploring various machine learning techniques to improve credit risk evaluation.

Methodology
The paper uses a literature review approach to investigate the impact of machine learning and artificial intelligence on credit risk assessment. A review of secondary sources of data, such as government reports, international statistics, media articles, peer-reviewed journal articles, and books, was carried out to establish the impact of AI and machine learning on credit risk analysis. Unobtrusive research techniques, such as documentary analysis and conceptual analysis, were used to examine authoritative sources in order to conceptualise and contextualise the impact of AI and machine learning on credit risk analysis. To avoid limiting the study to a few research articles, the author utilised a variety of documents and the information published by scholars and private organisations on the topic. The paper attempts to address the question of what the impact of artificial intelligence and machine learning on credit risk analysis is, with the overall intention of assessing whether the use of artificial intelligence and machine learning in credit risk analysis can help to improve financial inclusion.

Discussion of the Findings on Application of Machine Learning and AI in Credit Risk Assessments
Credit risk is defined as the likelihood of a potential borrower failing to meet their obligations under the agreed terms (Witzany 2017). Most banks carry out risk assessment and management to maximise the bank's risk-adjusted rate of return by maintaining credit risk exposure within acceptable parameters. Effective management of credit risk is one of the important components of risk management and is critical for the long-term success of any financial organisation (Witzany 2017). In many banks and other credit-providing institutions, loans are the largest and most obvious source of credit risk (Imarticus 2019; Witzany 2017). Witzany (2017) stated that the major causes of banking problems are lax credit standards for borrowers and counterparties and poor portfolio risk management, along with a lack of attention to economic changes that can compromise the credit standing of a bank's counterparties. Witzany (2017) also stated that these experiences are common in both G-10 and non-G-10 nations. In many emerging markets, some groups of people are excluded from the mainstream formal financial markets because of a lack of information. Some of these people, such as the youth, small businesses, and women, do not have a credit history and sometimes lack collateral security, which limits their ability to get credit. Scholars refer to this problem as information asymmetry. It is also believed that AI can help to address this problem. Appendix A summarises the findings of this study.

AI, Machine Learning, and Asymmetric Information and Credit Risk Assessments
Information asymmetry is a situation where agents do not have the same level of information (Marwala and Hurwitz 2015; Mhlanga 2020; Tfaily 2017). Marwala (2015) also defined information asymmetry as the study of decisions made by human beings in circumstances where one agent has more information than another. Information asymmetry is undesirable in several settings, including the credit market and the labour market. In the labour market, for instance, information asymmetry is undesirable during interviews, when the potential employer requires more information about the potential employee, as articulated by the Nobel Laureate Michael Spence (Spence 1973; Marwala 2015). The potential employer can obtain as much information as possible only through signalling: the employer signals to the candidate to reveal as much information as possible and, at the same time, the candidate signals information such as their qualifications to the employer to send the message that he or she is qualified for the job. Tfaily (2017) also posits that information asymmetry creates problems in the credit market. It is believed that debtor claims are sometimes in a weak position because of a lack of precise information on the project being financed, while the bank does not have accurate information with which to assess credit risk. The bank should try its best to minimise credit risk to avoid losses. Minimising credit risk depends on the bank's capacity to collect and process information when it accepts credit applications (Tfaily 2017). Information about the characteristics of the borrower is required when credit is accepted. The bank also needs information about the borrower after the credit has been issued so that it can control the actions of the borrower. In looking for this information, the bank is confronted with the problem of information asymmetry.
Information asymmetry in turn generates two further problems: adverse selection and moral hazard.
Figure 1 illustrates the problem of information asymmetry. As shown in Figure 1, information asymmetry generates two problems: ex-post moral hazard and ex-ante adverse selection. The study now turns to how AI and machine learning can help to address the problems of information asymmetry, moral hazard, and adverse selection.

How Does AI Help to Solve the Problem of Information Asymmetry?
As articulated by Marwala (2015), Moloi and Marwala (2020a), and Moloi and Marwala (2020b), AI can help to solve the problem of information asymmetry, which can go a long way towards addressing the huge developmental problem of financial exclusion, especially in the credit market. The first way in which AI can help is through signalling and the use of big data and deep learning. One example given by Marwala and Hurwitz (2015) is social networks, which are powered by AI to such an extent that they can signal information much more accurately than a human agent can. In this way, it is believed that AI can help to solve the problem of information asymmetry in many circumstances, including the credit market. Marwala and Hurwitz (2015) also identified screening as one of the critical ways in which AI can help to solve the problem of information asymmetry. Screening originates in the work of the Nobel Laureate Joseph Stiglitz (1974), in which a human agent that knows little induces the agent with more information to reveal it (Marwala and Hurwitz 2015). With the emergence of AI, it is no longer necessary to induce another human agent to reveal more information about themselves. This is possible because of the internet: one agent can use it to build a profile of the other agent which, in most cases, is more accurate and informative than the information that could be obtained from the party in question. Marwala and Hurwitz (2015) posit that the reason for this is that human agents forget facts very easily and, in many cases, may not be able to reveal all the information for a variety of reasons.
One example where AI and machine learning are helping to address financial exclusion by solving the problem of information asymmetry is Branch, a mobile-application digital lender that operates in countries such as Kenya, Mexico, Nigeria, India, and Tanzania (Biallas and O'Neill 2020). Through this application, financial services and products such as credit are made available, affordable, and accessible to vulnerable groups like smallholder farmers. For instance, since its inception Branch has provided a total of 15 million loans to over three million customers, disbursing a total of 350 million US dollars (Biallas and O'Neill 2020). Branch uses machine learning to build an algorithmic approach that assesses the creditworthiness of a potential borrower using thousands of data points on the individual and the accumulated experience of different borrowers. Branch applies its algorithm to mobile data such as text messages, call logs, contacts, and Global Positioning System (GPS) data, combined with the credit history of the borrower, to arrive at a lending decision. The application can access this information as soon as the borrower downloads the application, verifies their identity, and gives consent for Branch to use the cell phone data (Biallas and O'Neill 2020). The system can create personalised options in seconds, allowing Branch to approve a loan within minutes. Loan durations range from a few weeks to more than a year, and loans can be as small as 50 US dollars, which is practically impossible using traditional credit assessment methods. Using this application, low-income earners, small businesses, women, and the youth can gain access to the formal financial market.
This is in line with Henze and Ulrichs (2016), who argued that the low levels of agricultural output experienced by many farmers are due to their inability to use modern technologies, and that their financial challenges are due to their financial exclusion. The argument put forward in this study is that the application of machine learning and AI in the agricultural sector can help to solve the problem of financial exclusion, which can allow small businesses to be productive. Tinsley and Agapitova (2018) also took the view that private sector solutions are helping smallholder farmers to overcome the challenges they face in accessing affordable financial products. Tinsley and Agapitova (2018) went on to argue that traditional finance has not addressed smallholder farmers' need for financial services because of the perceived high risk associated with them as well as incompatible financial products. However, Tinsley and Agapitova (2018) stated that social enterprises are coming up with more efficient, cost-effective, and customised financial solutions that are doing well in unlocking credit and managing risk, and that involve the application of AI and machine learning in credit risk analysis.
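The kind of scoring described for Branch above, in which a model learns from past borrowers' mobile data and then scores a new applicant, can be illustrated with a minimal sketch. This is not Branch's algorithm, which the sources do not describe in detail. The three feature names, the synthetic borrowers, and the choice of a plain logistic regression are all assumptions made purely for illustration.

```python
import math
import random

# Hypothetical alternative-data features, each scaled to [0, 1].
# These names are illustrative assumptions, not fields used by any real lender.
FEATURES = ["sms_per_day", "distinct_contacts", "on_time_topup_share"]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_borrowers(n: int, seed: int = 0):
    """Generate made-up borrowers whose repayment depends on the third feature."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]
        p_repay = sigmoid(4.0 * x[2] - 2.0)
        rows.append(x)
        labels.append(1 if rng.random() < p_repay else 0)
    return rows, labels


def fit_logistic(rows, labels, lr: float = 0.5, epochs: int = 300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def repayment_probability(w, b, x) -> float:
    """Estimated probability that a borrower with features x repays."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


rows, labels = make_synthetic_borrowers(500)
weights, bias = fit_logistic(rows, labels)

# Two hypothetical applicants who differ only in their top-up behaviour.
reliable = repayment_probability(weights, bias, [0.5, 0.5, 0.9])
unreliable = repayment_probability(weights, bias, [0.5, 0.5, 0.1])
```

In this sketch the applicant with the more regular top-up behaviour receives the higher estimated repayment probability, without either applicant supplying collateral or a formal credit history. A production system would need far more features, validation on held-out data, and safeguards around consent and privacy.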

AI, Machine Learning and Adverse Selection
Information asymmetry in the credit market generates two problems, as indicated in Figure 1: adverse selection and moral hazard (Moloi and Marwala 2020a; Tfaily 2017). Information asymmetry between the bank and the borrower is the source of a problem called adverse selection or anti-selection (Tfaily 2017). As articulated by Tfaily (2017), adverse selection usually manifests itself before the credit agreement is signed. After the agreement is signed and credit is granted, information asymmetry usually becomes a source of moral hazard (Marwala 2015; Moloi and Marwala 2020a). The problem of adverse selection arises when there is a limitation on observing the characteristics of a product or service. According to Akerlof (1970), adverse selection happens before the contract is signed because the borrower conceals information. As articulated by Tfaily (2017) and Moloi and Marwala (2020a), in credit relationships poor-quality borrowers always strive to portray themselves as good-quality borrowers by hiding critical information. All this is done to try to show that they are less risky clients.
The problem is worsened by the fact that banks, in most cases, cannot distinguish between borrowers according to their quality. This becomes a barrier in the financial market because good customers are driven out of the market. In many cases the bank will set conditions to try to differentiate these borrowers, and the willingness to accept credit conditions will be used by the bank to differentiate credit applicants according to their quality. As articulated by Stiglitz and Weiss (1981), identifying good credit applicants is very difficult for many banks because of information asymmetry; as a result, banks will offer a single interest rate to all credit applicants to maximise anticipated returns. This single interest rate acts as a barrier to small businesses and smallholder farmers who cannot afford to pay it, causing financial exclusion. However, Moloi and Marwala (2020b) argued that the era of intense automation and digitization powered by AI can push economic agents to form peculiar relationships that include the sharing of certain information, opening opportunities to harvest and store big data that economic agents such as banks can use to carry out effective credit analysis of the individuals seeking credit. Using AI, economic agents can build, link, and analyse big new data sets, a task that is difficult for human beings. In this way, the problem of adverse selection will be greatly reduced. Big data will also allow economic agents to use machine learning algorithms to carry out credit assessments, which can allow low-income earners and the poor to access credit.
This was supported by Emeana et al. (2020), who argued that the application of technology in the agricultural sector, such as mobile phone-enabled agricultural information services powered by AI, has significantly improved the livelihoods of smallholder farmers in emerging economies, especially in Africa. Emeana et al. (2020) believe that technology is facilitating smallholder farmers' access to affordable financial services as well as the sourcing of agricultural information on the use of inputs, good agricultural practices, and market prices. Xie (2019) stated that rapid developments in AI and machine learning have made it possible for the financial sector to take advantage of them. Xie (2019) went further to state that AI and machine learning are influencing the financial sector on many fronts, including the delivery of innovative financial services, intelligent consulting, intelligent lending, monitoring and warning, and intelligent customer service. Xie (2019) also reported that AI applications in the financial sector have allowed several risks, including credit risk, to be addressed. In this study, we noted that AI and machine learning are useful tools that can be applied in the credit market to deal with the risks that prevent groups like small businesses from accessing credit.
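The single-interest-rate argument in this section can be made concrete with a small worked example. It is a stylised sketch, not the model of Stiglitz and Weiss (1981). It assumes a lender that only needs to break even, zero recovery on default, and two borrower types. The default probabilities, the funding cost, and the highest rate each type can afford are invented numbers.

```python
def break_even_rate(default_prob: float, funding_cost: float) -> float:
    """Rate r solving (1 - p) * (1 + r) = 1 + c, assuming zero recovery on default."""
    return (1.0 + funding_cost) / (1.0 - default_prob) - 1.0


# Illustrative assumptions only; none of these figures come from the study.
FUNDING_COST = 0.05
borrowers = {
    "safe": {"share": 0.5, "default_prob": 0.02, "max_affordable_rate": 0.15},
    "risky": {"share": 0.5, "default_prob": 0.30, "max_affordable_rate": 0.60},
}

# Case 1: the lender cannot tell the types apart and charges one pooled rate.
pooled_default = sum(b["share"] * b["default_prob"] for b in borrowers.values())
pooled_rate = break_even_rate(pooled_default, FUNDING_COST)
stay_under_pooling = [
    name for name, b in borrowers.items()
    if b["max_affordable_rate"] >= pooled_rate
]

# Case 2: better information lets the lender price each type separately.
risk_based_rates = {
    name: break_even_rate(b["default_prob"], FUNDING_COST)
    for name, b in borrowers.items()
}
served_with_risk_pricing = [
    name for name, b in borrowers.items()
    if b["max_affordable_rate"] >= risk_based_rates[name]
]
```

Under these assumptions the pooled break-even rate is about 25%, which the safe borrowers cannot afford, so only the risky borrowers remain in the market. Once the lender can observe each type's risk, the safe borrowers are offered a rate of roughly 7% and both groups are served, which is the sense in which better data can reduce adverse selection.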

AI, Machine Learning and Moral Hazard
As noted before, ex-post information asymmetry in a contract generates moral hazard after the contract is signed. This problem arises from the inability of agents to observe the actions of other agents (Tfaily 2017). According to Tfaily (2017), moral hazard arises under particular conditions and normally occurs in two situations: the uninformed agent does not have information about the actions of the partners, and the partners sometimes engage in opportunistic behaviour, taking advantage of the fact that the agent is uninformed and acting in their own interests (Berger et al. 2011). Because the uninformed agent does not know the circumstances in which the actions take place, it is very difficult for the agent to verify the validity of those actions. Stiglitz and Weiss (1981) stated that when the borrower exploits this information advantage, the bank faces the moral hazard risk of asset substitution. This results mainly because the borrower can take risky actions that lead to the failure of the funded project. The main issue here is that non-compliance with the credit agreement is the main cause of moral hazard. Moloi and Marwala (2020c) also argued that moral hazard cannot be separated from the concept of adverse selection. Moloi and Marwala (2020c) stated that the advent of AI will reduce the problems associated with moral hazard because, with AI, there is no longer a need to depend on economic agents to be fair in disclosing material information. At the same time, it was highlighted that there is also no need to devise innovative ways of persuading economic agents to disclose material information, whether through incentives or through threats of penalties.
As noted earlier, the era of intense automation and digitization powered by AI can push economic agents to form peculiar relationships that include the sharing of certain information, opening opportunities to harvest and store big data that economic agents such as banks can use to carry out effective credit analysis of the individuals seeking credit (Marwala and Hurwitz 2015).
Another case study where AI and machine learning techniques are used is FarmDrive (Biallas and O'Neill 2020). FarmDrive is an agricultural data analytics company that assists in the delivery of financial services to unbanked and underserved smallholder farmers and, at the same time, helps financial institutions to increase their agricultural loan portfolios in a cost-effective manner. FarmDrive uses machine learning technology, simple mobile phone technology, and alternative credit scoring to give smallholder farmers access to financial services by closing the data gap that has kept them from accessing financial services and products. The first step for a smallholder farmer to get credit is for FarmDrive to collect the farmer's data through questions and answers sent by text message, covering the location of the farmer, the crops they are cultivating, the size of the farm, the assets the farmer possesses, such as tractors, and the activities of the farmer. This information is then combined with existing agricultural data to create the farmer's credit profile. The profile is shared with financial institutions for credit assessment and funding.
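The data flow described above, in which text-message answers are combined with existing agricultural data to form a credit profile, can be sketched as a small data structure. This is not FarmDrive's actual schema or code. The field names, the acre unit, the `regional_yield_index` value, and the sample answers are hypothetical and serve only to show how free-text answers might be normalised and joined with external data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FarmerProfile:
    """Hypothetical credit-profile record; all field names are illustrative."""
    farmer_id: str
    location: str
    crops: List[str]
    farm_size_acres: float
    assets: List[str] = field(default_factory=list)
    # Joined from existing agricultural data; None if the location is unknown.
    regional_yield_index: Optional[float] = None


def split_list(answer: str) -> List[str]:
    """Split a comma-separated free-text answer into trimmed, lowercase items."""
    return [item.strip().lower() for item in answer.split(",") if item.strip()]


def build_profile(farmer_id: str, answers: Dict[str, str],
                  yield_by_location: Dict[str, float]) -> FarmerProfile:
    """Combine text-message answers with agricultural data for the location."""
    location = answers["location"].strip().lower()
    return FarmerProfile(
        farmer_id=farmer_id,
        location=location,
        crops=split_list(answers["crops"]),
        farm_size_acres=float(answers["farm_size_acres"]),
        assets=split_list(answers.get("assets", "")),
        regional_yield_index=yield_by_location.get(location),
    )


# Made-up example inputs.
sms_answers = {
    "location": "Nakuru",
    "crops": "Maize, beans",
    "farm_size_acres": "2.5",
    "assets": "tractor",
}
agri_data = {"nakuru": 0.8}
profile = build_profile("farmer-001", sms_answers, agri_data)
```

A record of this kind is what would then be passed to a scoring model or shared with a financial institution. Keeping the externally sourced field optional lets a profile be created even when no agricultural data exist for the farmer's location.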
In this way, AI is doing a lot to improve credit decisions using alternative data sources. The traditional data used in generating credit scores include formal identification, bank transactions, credit history, income statements, and asset value. In emerging markets, underbanked individuals such as women, the youth, and the poor find it difficult to access the traditional forms of collateral or identification that creditors require before providing financial services. The use of AI and machine learning with alternative data sources, such as company registers, social media data like messages, satellite images, and public data, is helping lenders and credit lending institutions to carry out credit assessments, in particular to assess consumer behaviour and verify clients' ability to repay loans. Other examples include the evidence provided by MyBucks, which suggests that predictive scorecards help to reduce the rates of non-performance of various loans and portfolios. MyBucks provides microloans and insurance directly to customers in countries such as Zambia, Malawi, and Uganda through the application of an AI technology called Jessie, which scrapes data from potential borrowers' phones to generate a lending profile. Through predictive scoring, the default rate on MyBucks's loan portfolio was reduced by 18% in South Africa over the 2017-2018 financial year. This shows clearly that AI and machine learning can successfully be used in credit assessments, which can allow those who are financially excluded to enjoy formal financial services.

Conclusions and Policy Recommendations
In banking and finance, credit risk is among the important topics because the process of issuing a loan requires a lot of attention to assessing the possibility of getting the money back. At the same time, in emerging markets, underbanked individuals, especially women, the youth, and small businesses, cannot access the traditional forms of collateral or identification that financial institutions such as banks require before granting loans. However, the use of alternative data sources such as public data, satellite images, company registers, and social media data such as SMS, messenger services, and interaction data enables AI and machine learning to assist lenders in carrying out serious credit risk analysis, assessing the behaviour of the customer, and subsequently verifying the ability of clients to repay their loans. Using a literature review approach based on content analysis and conceptual analysis of authoritative documents such as government reports, international statistics, media articles, peer-reviewed journal articles, and books to investigate the impact of machine learning and artificial intelligence on credit risk assessment, the study found that AI and machine learning have a strong impact on credit risk assessments.
The results indicated that the use of alternative data makes it possible to use machine learning techniques to assess the creditworthiness of previously excluded individuals, allowing them to access credit too. It was also found that the problems that affect the credit market and manifest mainly through information asymmetries, such as moral hazard and adverse selection, can be dealt with if AI and machine learning are applied to alternative data. For instance, social networks powered by AI can signal information much more accurately than human agents can, which can help to solve the problem of information asymmetry in the credit market. Again, using AI, economic agents can build, link, and analyse big new data sets, a task that is difficult for human beings, and this can help to solve the problems associated with adverse selection. Big data will also allow economic agents to use machine learning algorithms to carry out credit assessments, which can allow low-income earners and the poor to access credit. Therefore, financial institutions such as banks and credit lending institutions must invest more in AI and machine learning, which can allow small businesses, smallholder farmers, women, and low-income groups to get credit. This can help to improve financial inclusion and the incomes of these groups, which can in turn help to reduce poverty and underdevelopment in emerging markets. The study also recommends that investment partnerships between governments and private corporations be promoted to ensure that emerging economies are able to start investing in AI and machine learning. This is important because many emerging economies find it hard to make serious investments because of the costs of these technologies.

Table A1. Summary of the Impact of Artificial Intelligence and Machine Learning in Credit Risk Analysis.

Impact of Artificial Intelligence and Machine Learning | Brief Description

AI, Machine Learning, and Asymmetric Information and Credit Risk Assessments
The first way in which AI can help to solve the problem of information asymmetry is through signalling and the use of big data and deep learning. One example given by Marwala and Hurwitz (2015) was the issue of social networks which are powered by AI to an extent that they can signal information in a much more accurate fashion than what a human agent can do. In this way, it is believed that AI can help to solve the problem of information asymmetry in many circumstances including the credit market.

AI, Machine Learning and Adverse Selection
Information asymmetry in the credit market generates two problems, adverse selection and moral hazard (Moloi and Marwala 2020a; Tfaily 2017). Moloi and Marwala (2020b) argued that the era of intense automation and digitization powered by AI can push economic agents to form peculiar relationships that include the sharing of certain information, opening opportunities to harvest and store big data that economic agents such as banks can use to carry out effective credit analysis of the individuals seeking credit. Using AI, economic agents can build, link, and analyse big new data sets, a task that is difficult for human beings. In this way, the problem of adverse selection will be greatly reduced.

AI, Machine Learning and Moral Hazard
The existence of ex-post information asymmetry in a contract generates moral hazard especially after signing the contract. This problem arises due to the inability of agents to be able to observe the actions of other agents. Moloi and Marwala (2020c) stated that the coming of AI will be able to reduce the problems associated with moral hazard because with AI there is no need to depend more on economic agents to be fair by disclosing material information. At the same time, it was highlighted that there is also no need to come up with innovative ways to persuade economic agents through incentives to disclose material information or using threats of penalties for them to disclose important information. As a result, AI presents a better way to harvest information about the borrower helping lenders to address the problem of moral hazard.