Machine Learning and Financial Literacy: An Exploration of Factors Influencing Financial Knowledge in Italy

Abstract: In recent years, machine learning techniques have assumed an increasingly central role in many areas of research, from computer science to medicine, including finance. In the current study, we applied these techniques to financial literacy to test their accuracy, compared to a standard parametric model, in the estimation of the main determinants of financial knowledge. Using recent data on financial literacy and inclusion among Italian adults, we empirically tested how tree-based machine learning methods, such as decision trees, random forest, and gradient boosting techniques, can be a valuable complement to standard models (generalized linear models) for the identification of the groups in the population in most need of improving their financial knowledge.


Introduction
In the wake of the global financial crisis of 2007-2008 and of the recent events concerning the COVID-19 global pandemic crisis, the debate on the importance of financial literacy (FL) has gained further momentum, because more vulnerable and less informed investors are the most exposed to crises, and not only financial ones. As of May 2020, more than 70 countries were designing or implementing national strategies for financial literacy. Thus, the OECD developed a set of recommendations to assist governments or other public authorities to design, implement, and evaluate policies to support financial resilience and well-being, in addition to addressing the needs of vulnerable groups (OECD 2020). In fact, those who are financially illiterate have been proven to have a lower ability to cope with emergency expenses and income shocks (Hasler et al. 2018), and a lower propensity to withdraw deposits from distressed banks (Brown et al. 2016) and to leave the stock market before crashes (Guiso and Viviano 2015), than the more financially conscious population. In this sense, financial knowledge can help individuals with the process of financial decision-making and savings because it enables them to plan for wealth accumulation, to be more financially included (Grohmann et al. 2018), and to choose investments that are the most suitable for their needs, considering all the possible risks (Bianchi 2018).
At the micro level, several papers document a positive correlation between measures of FL and wiser financial decisions in various domains about both assets and debts. For example, individuals with higher levels of financial knowledge are more likely to participate in financial markets and to invest in stocks (Christelis et al. 2010; Yoong 2011; Van Rooij et al. 2011), to have better diversified portfolios (Guiso and Jappelli 2008; Von Gaudecker 2015), and to earn higher yields on deposit accounts (Deuflhard et al. 2019). Hsiao and Tsai (2018) provide evidence of a positive impact of FL on trading in leveraged derivative products, an important means of hedging financial risks in portfolios. Financially literate individuals are less prone to over-indebtedness (Lusardi and Tufano 2009; Lusardi and Scheresberg 2013; Lusardi et al. 2016) and less likely to choose adjustable rate mortgages instead of less risky mortgages (Gathergood and Weber 2017). They usually perform better in peer-to-peer lending markets (Chen et al. 2018) and choose mutual funds with lower fees (Hastings and Tejeda-Ashton 2008). They are more likely to plan for retirement (Goda et al. 2020) and, as a result, to better allocate resources over their lifetimes in a world that, especially in recent years, is increasingly complex and uncertain (Clark et al. 2015; Behrman et al. 2012). Recently, Feng et al. (2019), following a Bayesian two-part latent variable modelling approach, identified the simultaneous impact of FL on household debt and assets. They found that households with insufficient financial knowledge are more financially vulnerable because they are more likely both to have fewer assets and to choose high-cost unsecured debts that expose them more to potential financial constraints.
J. Risk Financial Manag. 2021, 14, 120
Another stream of literature investigates the impact of FL at the macro level (Lusardi and Mitchell 2014). For example, Gerardi et al. (2010) stress how limited financial knowledge can be considered a cause of the 2007 U.S. financial crisis. Fornero and Lo Prete (2019) found a clear association between the FL of the electorate (or the ability to understand essential concepts of economic reforms, mainly regarding the pension systems) and electoral outcomes. They conclude that "financial illiteracy may also harm reformist efforts and has clear policy implications" (Fornero and Lo Prete 2019, p. 24) in terms of successful implementation of economic reforms. Moreover, using a life-cycle approach, Lusardi et al. (2017) show that gaps in FL amplify differences in wealth accumulation patterns and the consequent perpetuation of wealth inequality. In this direction, Lo Prete (2013, 2018) empirically tested how the ability to take advantage of different financial opportunities, measured by financial knowledge, may help to reduce inequality across countries and over time. She found that the level of economic literacy is associated with income inequality across countries, using a sample of advanced and developing countries observed over the 1980-2007 period. All of these studies have evident policy implications because inequality appears to decrease not when more complex and sophisticated financial instruments are available but only when the ability to understand and use these instruments increases among all of the population. In fact, the debate on the relationship between finance and inequality makes FL relevant to the policy agenda of many countries, as noted above.
Consequently, as the OECD (2020) underlined in the Recommendation on Financial Literacy, it is crucial to collect high-quality, comparable data on levels of financial knowledge and to analyze these "data to identify aspects of financial literacy that cause particularly significant issues as well as the groups in the population in most need of improving" (OECD 2020, p. 7).
This paper aims to contribute to the analysis of FL, extending the common methodology to machine learning (ML) techniques. Although ML has been widely used in finance (e.g., see Dixon et al. 2020; Bracke et al. 2019), to the best of our knowledge there are still no analyses of ML techniques applied to financial knowledge. Nonetheless, we argue that ML techniques can be valuable as a complement to standard parametric models in the study of financial literacy. By showing that each analytical step of an econometric process, such as the logistic regression model that we apply to real data, has a homologous step in ML analyses, we establish a clear correspondence between parametric and ML techniques, with the goal of facilitating the adoption of ML techniques in the context of financial literacy. ML can meet the need for in-depth investigation, which is of paramount importance in financial literacy analyses. Due to its flexibility, the ML framework can provide more information about the heterogeneity and commonality across different subpopulations and can help researchers and policy-makers to understand the characteristics of individuals with lower levels of financial literacy and therefore at higher risk of financial fragility. Our analysis provides preliminary evidence that ML techniques can produce reliable information for financial literacy that is consistent with the literature, although they can also identify different patterns of correlations than traditional parametric models (i.e., high variable importance of financial behavior and attitude as determinants of financial knowledge).
In detail, we propose a comparison between a parametric model (a logistic regression model) and ML models to assess the predictive accuracy of the different models. We also use tree models to assess model selection and the approximate direction and functional form of the relationships between the inputs and the output, discussing the measures of variable importance in three tree models: decision trees (Breiman et al. 1984), random forest (Breiman 2001), and gradient boosting machine (Friedman 2001). These models are classified on the basis of the outcome variable type: classification models in the case of categorical variables and regression models otherwise. Because we refer to categorical variables, the algorithms used for this study are classification models. We empirically test these models using data available for Italy collected by the Bank of Italy.
We concentrate on Italy because it is a negative outlier among the most advanced economies considering the level of financial competencies of adults (Klapper et al. 2015; Di Salvatore et al. 2018). According to the Standard & Poor's Ratings Services Global Financial Literacy Survey (S&P Global FinLit Survey), only 37 per cent of adult Italians correctly understand basic financial concepts, compared with 52 per cent on average in the EU. In addition, the G20/OECD International Network for Financial Education (INFE) report on adult financial literacy shows a very low level of financial literacy in Italy compared with the G20 average (Figure 1). Thus, the financial knowledge score in Italy is 3.5 out of a maximum of 7 points on average, compared with a G20 average of 4.3. According to Di Salvatore et al. (2018), the lower level of financial knowledge in Italy can be explained by the higher share of individuals with low levels of education: in fact, "about 47 per cent of the adult Italian population has a primary level of education, while the same group accounts for only 14 per cent of the population in Germany and does not exceed 10 per cent in Canada and the UK" (p. 9). We can add that Italy also has higher unemployment rates than most of the countries compared in Figure 1. Therefore, our analysis aims to test parametric and ML techniques to define the main determinants of financial literacy gaps among Italians, who on average are less financially educated than G20 citizens.
We are conscious that we focus on a limited case study, but we think that it can be seen as a first step to encourage the adoption of ML techniques in applied economics and among researchers in the context of financial literacy. ML is, in fact, a transparent research tool with an important role to play because it has the advantage of: (i) focusing on out-of-sample predictability over variance adjudication; (ii) using computational methods to avoid relying on (potentially unrealistic) assumptions; (iii) having the ability to "learn" complex specifications, including non-linear, hierarchical, and non-continuous interaction effects in a high-dimensional space; and (iv) featuring importance analyses robust to multicollinearity. For all of the above reasons, ML can be useful to researchers, policy makers, and financial analysts who need to analyze complex data and a large volume of information simultaneously, thus providing a more nuanced and detailed picture of the phenomenon of financial literacy.
This paper is organized into three main sections. The first section summarizes research findings of recent literature about the main determinants of FL, providing an accurate mapping of methodologies and the main variables used to explain the phenomenon. The second section provides readers with foundational knowledge of the ML algorithms used. The third section introduces the data used and the main results of the empirical analysis. The final section summarizes research findings and identifies future research needs.
The proposed ML methodology can be used above and beyond our empirical analysis, because ML offers the opportunity to gain insight from: (a) new datasets that cannot be modelled with econometric methods; and (b) old datasets that incorporate complex relationships that are still unexplored.

Factors Influencing Financial Knowledge: A Literature Review
Financial literacy, as described in the introduction, is increasingly attracting the attention of international organizations, financial regulators, policymakers, and academics (for a review of the most cited papers on the issue see Goyal and Kumar 2020). Findings around the world are sobering. FL is low even in advanced economies with well-developed financial markets. On average, only about one-third of the global population has a familiarity with the basic concepts that underlie everyday financial decisions (Lusardi and Mitchell 2011; Lusardi 2019).
Despite the importance of the issue, there is still no consensus on its best definition and the most suitable tools to measure the level of financial knowledge (Rieger 2020). Different data and definitions have been used, from mathematical skills at school age in Programme for International Student Assessment (PISA) test scores (Jappelli and Padula 2013) to numerical ability and other dimensions of cognitive function in older adults. However, in the literature, an increasing number of papers (Bianchi 2018; Fornero and Monticone 2011; Kadoya and Khan 2019; Klapper and Panos 2011) use the same measure for assessing the level of financial knowledge of adults, based on three basic concepts, commonly called the "Big Three" (Lusardi and Mitchell 2008). These three concepts, which can be easily applied to every context and economic environment, are: (1) numeracy, the capacity to do interest rate calculations and to understand how to calculate interest compounding; (2) the knowledge of inflation and how it interacts with purchasing power; and (3) the comprehension of the importance of portfolio diversification to reduce risks. At the international level, the OECD International Network for Financial Education (INFE) integrates the understanding of the three basic concepts described above with measures of financial attitude and behavior necessary to make sound financial decisions and ultimately achieve individual financial wellbeing.
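The arithmetic behind the "Big Three" can be illustrated with a minimal Python sketch. All figures (a deposit of 100, a 2% interest rate, 3% inflation, a 5-year horizon, 20% asset volatility) and the function names are assumptions chosen for illustration; they are not survey questions or data from this study, and the diversification formula assumes uncorrelated assets with identical volatility.

```python
import math

# Illustrative arithmetic for the "Big Three" concepts.
# Every number below is an assumed value, not data from the paper.


def compound_balance(principal: float, rate: float, years: int) -> float:
    """Balance after `years` of annual compounding at `rate`."""
    return principal * (1.0 + rate) ** years


def real_value(nominal: float, inflation: float, years: int) -> float:
    """Purchasing power of `nominal` after `years` of constant inflation."""
    return nominal / (1.0 + inflation) ** years


def equal_weight_volatility(asset_volatility: float, n_assets: int) -> float:
    """Volatility of an equally weighted portfolio of n uncorrelated
    assets that all share the same volatility (a simplifying assumption)."""
    return asset_volatility / math.sqrt(n_assets)


# (1) Compounding: interest is earned on previously credited interest.
balance = compound_balance(100.0, 0.02, 5)
simple_interest_balance = 100.0 * (1.0 + 0.02 * 5)

# (2) Inflation: a nominal gain can still be a loss of purchasing power.
purchasing_power = real_value(balance, 0.03, 5)

# (3) Diversification: spreading wealth over more assets lowers risk.
single_asset_risk = equal_weight_volatility(0.20, 1)
four_asset_risk = equal_weight_volatility(0.20, 4)

print(f"compound balance:        {balance:.2f}")
print(f"simple-interest balance: {simple_interest_balance:.2f}")
print(f"real value of balance:   {purchasing_power:.2f}")
print(f"risk with 1 vs 4 assets: {single_asset_risk:.2f} vs {four_asset_risk:.2f}")
```

Under these assumed values, the compound balance exceeds the simple-interest balance, while its purchasing power falls below the initial deposit because inflation outpaces the interest rate.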
Considering the main determinants of different levels of financial knowledge, the variables used in the literature are heterogeneous, depending on the countries analyzed and the different perspectives of researchers. Highlighting some common trends in determinants and main results, we can summarize an extensive literature on financial literacy by dividing the variables that correlate to financial literacy into seven categories.
These categories are as follows:
1. Gender: One of the main common results in the literature is that women have lower FL than men. In fact, in 2011 the OECD found a gender gap in FL in 13 countries, with Hungary the only exception. Bucher-Koenen et al. (2017), extending the evidence to other countries, found that only ex-Soviet countries (Russia, Romania, and East Germany) have an equal distribution of financial knowledge between sexes. However, recent literature stresses how, when asked to answer questions that measure knowledge of basic financial concepts, women are more likely than men to indicate that they do not know the answer (Bucher-Koenen et al. 2017; Kim and Mountain 2019; Ooi 2020). Therefore, the lower scores of women compared to men in financial literacy surveys reflect differences in the genders' self-reported confidence more than gender differences in their actual level of financial knowledge. Al-Bahrani et al. (2020) traced the origins of the gender-based financial literacy gap to early in life (early college age), before individuals have the opportunity to develop financial skills through experience or specialization in household roles. Jappelli and Padula (2013) explain the gender gap by the fact that women generally have less wealth than men and therefore fewer incentives to invest in FL.
2. Education: Higher education is usually reported as one of the most important factors in ensuring an adequate understanding of financial concepts. Many studies have shown that individuals with higher levels of education, i.e., who completed a university or college degree, are the most likely to be financially literate (Lusardi and Mitchell 2008; Cole et al. 2011). In addition, Mandell (2008) and Al-Bahrani et al. (2020) have shown that the correlation between financial literacy and education is present at the early stages of the lifecycle, and is highly correlated with mathematics ability. Morgan and Trinh (2019), using the OECD/INFE data for Cambodia and Viet Nam, found that both financial literacy and general education levels are positively and significantly related to savings behavior and financial inclusion, also controlling for possible endogeneity of financial literacy.
3. Financial fragility: Financial knowledge is usually associated with households' income levels and financial fragility. The concept of financial fragility is of paramount importance in periods of crisis (such as the COVID-19 pandemic) to understand whether households lack the capacity to face shocks. The concept, as defined in Demertzis et al. (2020), encompasses the state of household balance sheets, including indebtedness, and also relies on individual perceptions of the ability to rely on family and friends and other methods to deal with shocks. Previati et al. (2020) examined financial fragility in Italy using pre-COVID-19 data, and documented the strong link between financial fragility and financial literacy: almost 45% of Italian households with low financial education do not have sufficient financial resources to cover a lack of income even for short periods (2 months or less). Therefore, households with a low level of financial education are also less resilient.
4. Age: The impact of age is controversial, even if the age effect is widely described as an inverse U-shaped pattern.
5. Employment status: This is also an important determinant of financial knowledge, with the lowest level of FL usually recorded among those who are not in the formal paid labor markets (Kadoya and Khan 2019). However, retired people have higher levels of financial knowledge, perhaps due to the increasing privatization of national pension systems, which implies a personal choice among different pension investment plans and solutions for retirement.
6. Family status: Mixed effects are reported in the literature with reference to marital status and family size. According to Jappelli and Padula (2013) and Klapper and Panos (2011), singles have a significant propensity for lower financial literacy levels compared with those who are married. In contrast, Bianchi (2018), for France, finds that financial knowledge is negatively correlated with marital status. Moreover, Jappelli and Padula (2013) report a significant negative correlation between financial literacy and family size, whereas Klapper and Panos (2011), for Russia, find a positive but not significant relation.
7. Geography: In addition to personal characteristics, recent literature demonstrates how different cultural backgrounds and embedded social norms can affect financial knowledge and skills, and thus the importance of analyzing data disaggregated by different geographical contexts (Brown et al. 2018; De Beckker et al. 2020). For Italy, Fornero and Monticone (2011), exploiting data from the Bank of Italy's Survey on Household Income and Wealth, found evidence of marked differences within the same national territory: they identified a significant difference in FL among residents of different regions, with North-Central Italian residents having higher literacy levels than those of the South of Italy. They also reported a positive correlation between individual FL and the household level of digital alphabetization (measured by the presence of at least one member of the household using a computer).
Different studies have also analyzed the relationship among financial knowledge, attitude, and behavior with mixed results. Xiao et al. (2011) found that financial knowledge predicts financial attitude and the latter contributes to the financial behavior of a person. Chaulagain (2017), instead, argues that behaviours are influenced by literacy but not by attitude, and vice versa. Finally, Kadoya and Khan (2019), for Japan, emphasized the importance of psychological variables, in addition to demographic and socio-economic variables, as determinants of FL.
With the exception of psychological variables, for which data are not readily available, we apply all of the dimensions described above, including other controls (see Table A1 in Appendix A), to define how ML techniques can be used to describe the financially literate population and how accurate they are. Machine learning has been used in the financial services industry for over 40 years; however, it is only in recent years that it has become more pervasive across investment management and trading. Several recent articles provide evidence of superior performance of non-linear regression techniques, such as regression trees, for fundamental factor models (López de Prado 2019; Jain and Jain 2019). Many contributions apply machine learning for predicting portfolio returns. Among others, Moritz and Zimmermann (2016) predict portfolio returns using tree-based models, while Gu et al. (2020) address the prediction of individual stock returns and compare the forecasting performance of different machine learning methods for aggregate portfolio returns to ordinary least squares (OLS) regression, obtaining better accuracy. We apply ML techniques to financial literacy data to show that they can be a useful tool for complementing traditional econometric analysis.

Estimation Techniques: Machine Learning
Before empirically applying non-parametric models to study FL and test their predictive performance compared with parametric models, we briefly explain the main characteristics and differences of ML approaches, in particular decision trees (Section 3.1), random forest (Section 3.2), and gradient boosting machine (Section 3.3) techniques.

Decision Tree
Following a hierarchical structure, a decision tree (DT) partitions the predictor space R by a sequence of binary splits, giving rise to a tree (Hastie et al. 2016). In this manner, the predictor space is recursively split into simple regions, and the response for a given observation can be predicted using the mean of the training observations in the region to which that observation belongs (James et al. 2017).
Let (R j ) j∈J be the partition of R, where J is the number of distinct and non-overlapping regions. The DT estimator, given a set of variables x = x 1 , . . . , x p , is defined as follows: f DT (x) = ∑ j JŶ Rj and family size, whereas Klapper and Panos (2011), for Russia, find a positive but not significant relation. 7. Geography: In addition to personal characteristics, recent literature demonstrates how different cultural backgrounds and embedded social norms can impact on financial knowledge and skills, and thus the importance of analyzing data disaggregated by different geographical contexts (Brown et al. 2018;De Beckker et al. 2020). For Italy, Fornero and Monticone (2011), exploiting data from the Bank of Italy's Survey on Household Income and Wealth, found evidence of main differences within the same national territory: they identified a significant difference in FL among residents of different regions, with North-Central Italian residents having higher literacy levels than those of the South of Italy. They also reported a positive correlation between individual FL and the household level of digital alphabetization (measured by the presence of at least one member of the household using a computer).
Different studies have also analyzed the relationship among financial knowledge, attitude, and behavior with mixed results. Xiao et al. (2011) found that financial knowledge predicts financial attitude and the latter contributes to the financial behavior of a person. Chaulagain (2017), instead, argues that behaviours are influenced by literacy but not by attitude, and vice versa. Finally, Kadoya and Khan (2019), for Japan, emphasized the importance of psychological variables, in addition to demographic and socio-economic variables, as determinants of FL.
With the exception of psychological variables, for which data are not readily available, we apply all of the dimensions described above, including other controls (see Table A1 in the Appendix A) to define how ML techniques can be used to describe the financially literate population and how accurate they are. Machine learning has been used in the financial services industry for over 40 years, however, it is only in recent years that it has become more pervasive across investment management and trading. Several recent articles have been published that provide evidence of superior performance of non-linear regression techniques for fundamental factor models, such as regression trees (López de Prado 2019; Jain and Jain 2019). Many contributions apply machine learning for predicting portfolio returns. Among others, Moritz and Zimmermann (2016) predict portfolio returns considering tree-based models, Gu et al. (2020) address the prediction of individual stock returns and compare the forecasting performance of different machine learning methods for aggregate portfolio returns to ordinary least squares (OLS) regression, obtaining better accuracy. We apply ML techniques to finance literacy data to show that they can be a useful tool for integrating traditional econometric analysis.

Estimation Techniques: Machine Learning
Before empirically applying non-parametric models to study FL and test their predictive performance compared with parametric models, we briefly explain the main characteristics and differences of ML approaches, in particular decision trees (Section 3.1), random forest (Section 3.2), and gradient boosting machine (Section 3.3) techniques.

Decision Tree
Following a hierarchical structure, a decision tree (DT) partitions the predictor space ℝ by a sequence of binary splits, giving rise to a tree (Hastie et al. 2016). In this manner, the predictor space is recursively split into simple regions, and the response for a given observation can be predicted using the mean of the training observations in the region to which that observation belongs (James et al. 2017).
Let (Rj)j∈J be the partition of ℝ , where J is the number of distinct and non-overlapping regions. The DT estimator, given a set of variables x = x1, ..., xp, is defined as follows: where family size, whereas Klapper and Panos (2011), for Russia, find a positive but not ificant relation. graphy: In addition to personal characteristics, recent literature demonstrates different cultural backgrounds and embedded social norms can impact on ncial knowledge and skills, and thus the importance of analyzing data ggregated by different geographical contexts (Brown et al. 2018;De Beckker et 020). For Italy, Fornero and Monticone (2011), exploiting data from the Bank of 's Survey on Household Income and Wealth, found evidence of main differences in the same national territory: they identified a significant difference in FL ng residents of different regions, with North-Central Italian residents having er literacy levels than those of the South of Italy. They also reported a positive elation between individual FL and the household level of digital alphabetization asured by the presence of at least one member of the household using a puter). erent studies have also analyzed the relationship among financial knowledge, and behavior with mixed results. Xiao et al. (2011) found that financial ge predicts financial attitude and the latter contributes to the financial behavior on. Chaulagain (2017), instead, argues that behaviours are influenced by literacy by attitude, and vice versa. Finally, Kadoya and Khan (2019), for Japan, zed the importance of psychological variables, in addition to demographic and nomic variables, as determinants of FL. h the exception of psychological variables, for which data are not readily , we apply all of the dimensions described above, including other controls (see in the Appendix A) to define how ML techniques can be used to describe the ly literate population and how accurate they are. 
Machine learning has been used ancial services industry for over 40 years, however, it is only in recent years that ome more pervasive across investment management and trading. Several recent ave been published that provide evidence of superior performance of non-linear n techniques for fundamental factor models, such as regression trees (López de 19; Jain and Jain 2019). Many contributions apply machine learning for predicting returns. Among others, Moritz and Zimmermann (2016) predict portfolio returns ing tree-based models, Gu et al. (2020) address the prediction of individual stock nd compare the forecasting performance of different machine learning methods gate portfolio returns to ordinary least squares (OLS) regression, obtaining better . We apply ML techniques to finance literacy data to show that they can be a ol for integrating traditional econometric analysis.
Machine Learning Techniques
Before empirically applying non-parametric models to study FL and test their predictive performance compared with parametric models, we briefly explain the main characteristics and differences of ML approaches, in particular decision tree (Section 3.1), random forest (Section 3.2), and gradient boosting machine (Section 3.3) techniques.
Decision Tree
Following a hierarchical structure, a decision tree (DT) partitions the predictor space by a sequence of binary splits, giving rise to a tree (Hastie et al. 2016). In this manner, the predictor space is recursively split into simple regions, and the response for a given observation can be predicted using the mean of the training observations in the region to which that observation belongs (James et al. 2017). The size of the tree is controlled by a stopping criterion that sets a limit to its growth, to prevent the splitting process from continuing until the terminal nodes of the tree become pure (a node is pure when all of its data belong to the same class). The number of terminal nodes is governed by the complexity parameter cp. Small values of cp produce large trees, increasing the risk of overfitting, whereas large values produce small trees that can underfit the response variable. DTs have the main advantages of being easily interpreted and able to capture any kind of correlation in the data. However, they lack robustness: a small modification of the input data can lead to a very different tree. This drawback is due to the use of locally optimal splits, which cannot guarantee a globally optimal tree. The DT predictive performance can be improved by aggregating many decision trees, thus reducing the variance with respect to a single tree. This idea underlies the ensemble methods, which include random forest and gradient boosting machine.
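One step of the recursive partitioning described above can be sketched in a few lines of Python. The snippet searches for the single binary split of one predictor that minimizes the weighted Gini impurity and then predicts the majority class in each region. The toy data, the function names, and the use of Gini impurity as the splitting criterion are assumptions made for illustration; they are not taken from the survey or from the software used in the study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best split x <= threshold."""
    best = (None, gini(ys))  # start from the impurity of the unsplit node
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2  # candidate threshold between two observed values
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (thr, score)
    return best

def majority(labels):
    """Most frequent class label, used as the prediction in a region."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical toy data: one predictor (age) and a binary literacy label.
ages = [22, 25, 31, 38, 44, 47, 52, 68, 74, 80]
literate = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

threshold, impurity = best_split(ages, literate)
left_class = majority([y for x, y in zip(ages, literate) if x <= threshold])
right_class = majority([y for x, y in zip(ages, literate) if x > threshold])
print(threshold, round(impurity, 4), left_class, right_class)
```

A full tree would apply `best_split` again inside each resulting region, over all predictors, until a stopping rule such as the complexity parameter halts the growth.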

Random Forest
Random forest (RF) is an ML technique consisting of the aggregation of many DTs, obtained by generating bootstrap training samples from the original dataset (Breiman 2001). The idea behind this algorithm is to insert a random perturbation in the learning system to differentiate the trees and to combine their predictions through an aggregation technique. The RF technique is based on bootstrap aggregation (bagging), but its peculiarity is the way it considers the predictors: at each split the algorithm selects a random subset of predictors as split candidates from the full set of predictors, thus preventing the predominance of strong predictors in the splits of each tree (James et al. 2017). Each bootstrap sample is drawn with replacement and contains about two-thirds of the distinct observations, which are used for training, while the remaining third of the data (called "out-of-bag" observations) are left out and can be used for validation. Therefore, for each bootstrap sample, the data of the training set that are not in the sample can be used as a test set. This technique is called out-of-bag (OOB) and allows for easy estimation of the prediction errors.
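The split between in-bag and out-of-bag observations can be checked numerically. The sketch below draws one bootstrap sample and measures the share of observations that are never selected; the seed and the helper names are arbitrary choices, and the sample size is set to the training-set size reported later in the paper purely for illustration.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]

def oob_indices(n, in_bag):
    """Indices never drawn in the bootstrap sample (out-of-bag)."""
    drawn = set(in_bag)
    return [i for i in range(n) if i not in drawn]

rng = random.Random(0)   # fixed seed, chosen arbitrarily
n = 1901                 # training-set size reported in the paper
in_bag = bootstrap_indices(n, rng)
oob = oob_indices(n, in_bag)
oob_share = len(oob) / n
print(f"out-of-bag share: {oob_share:.3f}")
```

For large samples the expected out-of-bag share approaches $(1 - 1/n)^n \approx e^{-1} \approx 0.37$, which is the "remaining third" referred to in the text.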
The RF is defined by:

$$\hat{f}^{RF}(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}^{DT}(x|b),$$

where $B$ is the number of bootstrap samples and $\hat{f}^{DT}(x|b)$ is the decision tree estimator developed on the sample $b$. The number of trees in the forest must be chosen with the goal of explaining the largest percentage of variance and obtaining the lowest mean of squared residuals (MSR). It should be quite large so that each predictor has enough possibilities to be selected, although a relatively smaller number of trees (a few hundred) could be sufficient to achieve high accuracy (Probst and Boulesteix 2018).

To understand the relevance of the variables for prediction, we refer to the Mean Decrease Gini (MDG), which is a variable importance measure based on the Gini impurity index, i.e., the average (over the forest) of the decrease in the Gini impurity index for a predictor. Let $i(t)$ be the Gini impurity in node $t$; we denote by $\Delta i(s_t, t)$ the decrease in impurity of a binary split $s_t$ dividing node $t$ into a left node $t_l$ and a right node $t_r$. We define $\Delta i(s_t, t)$ as follows:

$$\Delta i(s_t, t) = i(t) - p(t_l)\, i(t_l) - p(t_r)\, i(t_r),$$

where $p(t_l) = N_{t_l}/N$ is the proportion of samples reaching the left node $t_l$ and $p(t_r) = N_{t_r}/N$ the proportion of samples reaching the right node $t_r$, with $N$ the sample size, and $N_{t_l}$ and $N_{t_r}$ the number of samples reaching the left and right node, respectively. Hence, MDG evaluates the importance of a given variable, $x_m$, in predicting the response variable and is defined as follows:

$$MDG(x_m) = \frac{1}{N_T} \sum_{T} \sum_{t \in T:\, v(s_t) = x_m} p(t)\, \Delta i(s_t, t),$$

where $N_T$ is the number of trees in the forest, the outer sum runs over the trees $T$ of the forest, $v(s_t)$ is the variable used to split node $t$, and $p(t) = N_t/N$ is the proportion of samples reaching the node $t$.
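A small worked example may help to read the impurity-decrease and MDG definitions above. The sketch evaluates the decrease for a split of the root node, where the proportions are taken with respect to the whole sample, and then averages weighted decreases over a two-tree forest. All counts, variable names, and the representation of a tree as a list of `(variable, p_t, delta)` tuples are hypothetical.

```python
def gini(counts):
    """Gini impurity from a list of per-class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def impurity_decrease(parent, left, right, n_total):
    """Delta i(s_t, t) = i(t) - p(t_l) i(t_l) - p(t_r) i(t_r)."""
    p_left = sum(left) / n_total
    p_right = sum(right) / n_total
    return gini(parent) - p_left * gini(left) - p_right * gini(right)

def mean_decrease_gini(trees, variable):
    """Average over trees of p(t) * delta, summed over nodes split on `variable`."""
    total = sum(p_t * delta
                for tree in trees
                for (split_var, p_t, delta) in tree
                if split_var == variable)
    return total / len(trees)

# Hypothetical root node with 100 observations, 50 per class.
parent = [50, 50]
left = [40, 10]    # class counts sent to the left child
right = [10, 40]   # class counts sent to the right child
delta = impurity_decrease(parent, left, right, n_total=100)

# Hypothetical forest of two trees; each node is (variable, p(t), delta).
forest = [
    [("age", 1.0, 0.18), ("education", 0.5, 0.10)],
    [("education", 1.0, 0.12), ("age", 0.5, 0.08)],
]
mdg_age = mean_decrease_gini(forest, "age")
print(round(delta, 4), round(mdg_age, 4))
```

Here the root split lowers the impurity from 0.5 to a weighted 0.32, so the decrease is 0.18, and the importance of the hypothetical `age` variable is the average of its weighted decreases in the two trees.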

Gradient Boosting Machine
The gradient boosting machine (GBM) is a tree-based algorithm proposed by Friedman (2001) that essentially uses decision trees of a fixed size as weak learners. The prediction is obtained by a sequential approach and not by parallelizing the tree-building process as in RF. More precisely, in GBM, each decision tree uses the information from the previous decision tree to improve the current fit, i.e., "boosting (improving) the error (gradient)" (Ayyadevara 2018, p. 117). In the following, we briefly describe the algorithm's functioning. Given a current model fit, $F_{m-1}$, GBM provides a new estimate, $F_m$, as follows:

$$F_m(x) = F_{m-1}(x) + \lambda h_m(x),$$

where $\lambda$ is the learning rate scaling the contribution of each weak learner and $h_m(x)$ is the weak learner, defined as:

$$h_m(x_i) = -\frac{\partial L(y_i, F_{m-1}(x_i))}{\partial F_{m-1}(x_i)},$$

representing the negative gradient of the loss function, $L(y_i, F_{m-1}(x_i))$, evaluated at the current model $F_{m-1}$. In summary, the new weak predictor $h_m$ tries to minimize the loss function $L$, given the previous ensemble $F_{m-1}$.

The accuracy of GBM depends on three fundamental parameters: the number of trees, their depth (i.e., the maximum number of nodes for each tree), and the learning rate, usually called shrinkage. It is important to choose the right number of trees: too few trees leave the error high (underfitting), whereas too many can lead to overfitting, because each additional tree keeps reducing the error on the training set. A high number of trees (at least 500) is often used in combination with a small learning rate. However, to achieve the minimum predictive error, an appropriate combination of number of trees, tree complexity, and learning rate is necessary.
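The update rule above can be illustrated with a deliberately small implementation. The sketch below is an assumption-laden simplification: it uses the squared-error loss $L = \frac{1}{2}(y - F)^2$, whose negative gradient is the residual $y - F$, rather than the Bernoulli deviance used for the classification task in the paper, and it uses one-split trees (stumps) on a single hypothetical predictor.

```python
def fit_stump(xs, residuals):
    """Least-squares stump on one predictor: (threshold, left mean, right mean)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def predict_stump(stump, x):
    thr, lm, rm = stump
    return lm if x <= thr else rm

def boost(xs, ys, n_trees, learning_rate):
    """F_m = F_{m-1} + lambda * h_m, with h_m fitted to the current residuals."""
    f0 = sum(ys) / len(ys)           # initial constant model
    preds = [f0] * len(ys)
    stumps, losses = [], []
    for _ in range(n_trees):
        # Negative gradient of the squared-error loss = residuals.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
        stumps.append(stump)
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return f0, stumps, losses

# Hypothetical toy data: age and a binary outcome treated as numeric.
xs = [22, 25, 31, 38, 44, 47, 52, 68, 74, 80]
ys = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]
f0, stumps, losses = boost(xs, ys, n_trees=50, learning_rate=0.1)
print(round(losses[0], 4), round(losses[-1], 4))
```

The training loss can only go down as trees are added, which is why the number of trees has to be chosen on data not used for fitting rather than on the training error.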

Data and Methods
We use data from the Bank of Italy's 2017 survey that investigates FL and inclusion among Italian adults, with a questionnaire developed by the INFE. The Italian sample consists of about 2500 persons interviewed using two different methods: 40 per cent were interviewed face-to-face, whereas the remainder used a tablet to record their responses. The survey questionnaire, designed according to the INFE framework, measures financial knowledge, behavior, and attitudes. We focus our analysis on the knowledge component, which assesses the understanding of basic concepts that are a prerequisite for making sound and conscious financial decisions (Lusardi and Mitchell 2011): understanding simple and compound interest, inflation, and the benefits of portfolio diversification. There were 7 questions about financial knowledge; from this dimension we calculated a composite FL index that ranges from 0 to 7. Since the average score for Italy is 3.5 out of a maximum of 7 points, lower than the G20 average of 4.3 (see Figure 1), we split the respondents into two groups: those with higher financial literacy than the Italian average, namely those who correctly answered at least 4 questions, and those who are less financially educated. To define the main determinants of higher financial education of Italian adults, we consider a set of personal observable characteristics commonly used in the literature and described in Section 1, such as gender, age, education, household composition, and employment status. We also controlled for migrant status, because migrants are usually more exposed to financial exclusion, and for geographic macro areas of Italy, given the evident macroeconomic gaps among different areas of Italy, specifically the Northern and Southern ("Mezzogiorno") areas (the so-called Italian socio-economic dualism).
We also used two different variables to assess the financial fragility of respondents: household economic stress, i.e., whether, in the 12 months before the interview, the household income was insufficient to cover monthly expenses, and risk capacity, i.e., the ability to sustain unexpected expenses without asking for formal or informal loans. We enriched the analysis by also considering financial variables such as financial behavior and attitudes, the propensity for pension planning (to have private pension plans, any pension product, or savings for retirement), and respondents' high self-assessment of financial knowledge (on a scale ranging from 1 to 5). We used the International Network for Financial Education (INFE) framework to measure the three areas of financial literacy: knowledge, behavior, and attitudes (OECD INFE 2011). Therefore, the behavior index was based on questions assessing whether people manage household financial resources by formulating a budget, are able to pay their debts and utilities with no concerns, and acquire information before making investments. Following the OECD/INFE framework, the Bank of Italy measures financial behavior by incorporating a variety of questions to identify three potentially prudent financial behaviours, namely:

- Saving, financial assets, and long-term planning: a set of questions is used to understand if individuals purchased financial assets in the two years before the survey, therefore, if they are actively saving or borrowing, and whether they set themselves long-term financial goals.

[...] (2011), on a scale of 0 to 9. Financial attitude instead evaluated personal traits such as preferences, beliefs, and non-cognitive skills, which are likely to affect personal well-being, on a scale from 0 to 5; the main driver of the index is a positive saving orientation, mainly for the long term. Because Di Salvatore et al. (2018) found that "the response behavior of Italian respondents appears to be influenced by the survey mode" (p. 8), we also included in our estimates a dummy variable to identify whether the respondent had a face-to-face interview or used a tablet to record their responses (in Appendix A, Table A1 provides a full description of the variables considered).

It is clear from the first descriptive statistics in Table 1 that the level of FL is not uniform throughout the population in Italy. Although small, there are gender gaps in financial knowledge, with men slightly more financially literate than women. In addition, we find the above-cited reverse U-shaped curve for age: financial knowledge increases with age but decreases for older adults, with a peak for the working age group of 40-49 years old. FL is higher for those employed in paid work but lower among those in unpaid domestic work and those unemployed or seeking their first employment. FL is higher in the North Western regions of Italy. However, on average a low share of Italians (8%) rate their financial knowledge as being high. Finally, among the financially literate, the average levels of good financial behaviours and attitudes are still low (4.5 on a scale of 0-9 and 2.1 on a scale of 0-5, respectively), but their ability to cope with unexpected expenses without asking for formal or informal loans or to cover monthly expenses is quite high, a peculiar characteristic of Italians, who achieve, on average, a high level of savings.
Considering the data described above, we formulated our model to estimate the main determinants of FL in Italy as follows:

Higher financial literacy ~ gender + education + risk.capacity + HE.stress + age + employment.status + household.composition + geographic.area + native + financial.behavior + financial.attitude + FL.self.assessment + pension.savings + pension products in the last 2 years (PP.in.the.last.year) + pension.fund + interview.type

We split the data set into a training set and a test set, according to the common splitting rule of 80-20%. Therefore, the training and the test sets consisted of 1901 and 475 observations, respectively (see note 1). We are conscious that the size of the sample considered is relatively small; however, thanks to cross-validation, the predictive accuracy of machine learning models can still be validated on small datasets.
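The 80-20% rule described above can be reproduced with a simple random partition of row indices. The following sketch is only illustrative: the paper does not state how the split was drawn, so the shuffling procedure, the seed, and the function name are assumptions; only the set sizes come from the text.

```python
import random

def train_test_split_indices(n, train_share, seed):
    """Shuffle range(n) and split it into training and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_share)
    return indices[:n_train], indices[n_train:]

n_obs = 1901 + 475  # total implied by the set sizes reported in the text
train_idx, test_idx = train_test_split_indices(n_obs, train_share=0.8, seed=42)
print(len(train_idx), len(test_idx))
```

With 2376 usable observations, an 80% share rounds to 1901 training rows, leaving 475 for testing, which matches the figures given in the text.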

Results
The results obtained for the tree-based ML algorithm are shown in Figure A1 in Appendix A, which depicts the best decision tree for the FL data used. The best tree has 10 terminal nodes (nine splits), and the root node splits on whether risk capacity = 0 (yes or no). Each node shows the predicted class (1 or 0) and the percentage of observations in the node. To assess the performance of the model, we refer to the OOB technique.

Predictive Quality: Models' Validation, Accuracy and Performance Evaluation
The tree-based algorithms are usually validated using the OOB score, that is, the average prediction error calculated on each training sample $x_j$, using only the trees that did not have $x_j$ in their bootstrap sample. Sub-sampling allows one to define an OOB estimate of the improvement in prediction performance by evaluating predictions on those observations that were not used in the building of the next base learner.
The variation of the OOB error with respect to the number of trees used in the RF algorithm shows that the OOB error rate stabilizes around 0.4 when 100 trees are used for building the forest, suggesting a good capacity of the RF algorithm to predict the FL (Figure 2, panel a). Panel b in Figure 2 shows the evolution of the GBM performance, based on the Bernoulli deviance, when the algorithm combines a progressively larger number of weak learners. Smaller deviance values indicate better performance. The black line represents the training Bernoulli deviance, whereas the green line shows the testing Bernoulli deviance, which is the result of the cross-validation. The blue dashed line shows the optimal number of iterations. The plot highlights that, beyond a certain point (in our case, 58 trees), the generalization power of the model starts decreasing, and additional trees explain only the training data. This point represents the optimal number of iterations.
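The selection of the optimal number of iterations can be sketched as follows. The deviance is written here as minus twice the average Bernoulli log-likelihood, which is one common normalization and an assumption about the exact scaling used in the figure; the validation curve is made up to mimic the shape described above and does not reproduce the paper's values.

```python
import math

def bernoulli_deviance(y_true, p_pred):
    """Mean Bernoulli deviance: -2 times the average log-likelihood."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -2.0 * total / len(y_true)

def best_iteration(validation_deviance):
    """1-based index of the iteration with the lowest validation deviance."""
    best = min(range(len(validation_deviance)),
               key=validation_deviance.__getitem__)
    return best + 1

# Hypothetical validation curve: improves, then degrades as the model overfits.
curve = [1.38, 1.31, 1.27, 1.25, 1.26, 1.29]
optimal = best_iteration(curve)
uninformative = bernoulli_deviance([1, 0], [0.5, 0.5])
print(optimal, round(uninformative, 4))
```

A model that always predicts a probability of 0.5 has a deviance of $2\log 2 \approx 1.386$ under this normalization, which gives a reference level against which the curves can be read.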
The model's accuracy is measured according to a set of indicators that can be easily determined from the confusion matrix, which cross-tabulates the actual class against the predicted class (predicted negative or predicted positive) and reports the number of observations correctly or incorrectly classified (Table 2). The diagonal elements of the confusion matrix indicate correct predictions, whereas the other elements indicate incorrect predictions. The metrics used in this paper are expressed according to the elements of the confusion matrix. Note that the accuracy, i.e., the proportion of correct predictions, can be written as:

$$\text{Accuracy} = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{\{\hat{y}_i = y_i\}},$$

where $n$ is the number of observations, $y_i$ and $\hat{y}_i$ are the observed and predicted classes of observation $i$, and $\mathbb{1}_{\{\cdot\}}$ is the indicator function. The overall performance of the ML algorithms, summarized over all possible thresholds, can be represented by the Receiver Operating Characteristics (ROC) curve and in particular by the Area Under (this) Curve (AUC), that is, the area under the curve obtained by plotting the sensitivity (TPR) on the y-axis against 1 − specificity (FPR) on the x-axis. Specifically, the ROC curve shows how TPR and FPR vary with different threshold values and can be used to compare different classification algorithms.
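The indicators derived from the confusion matrix, and the AUC, can be computed directly from predicted scores. The sketch below uses made-up labels and scores and a 0.5 classification threshold, all of which are assumptions for illustration. The AUC is obtained through its rank interpretation, i.e., the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, which equals the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (TN, FP, FN, TP) for binary labels coded 0/1."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp

def accuracy(tn, fp, fn, tp):
    """Share of correct predictions (diagonal of the confusion matrix)."""
    return (tp + tn) / (tn + fp + fn + tp)

def sensitivity(tn, fp, fn, tp):
    """True positive rate (TPR)."""
    return tp / (tp + fn)

def specificity(tn, fp, fn, tp):
    """True negative rate; the false positive rate is 1 - specificity."""
    return tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties = 1/2)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts), sensitivity(*counts),
      specificity(*counts), auc(y_true, scores))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes the ranking quality of the scores over all thresholds.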
The values of the accuracy measures applied to the FL data in Italy are reported in Table 3 for the ML algorithms and the logistic regression model. The performance of the ML algorithms is compared to the results of a logistic regression model (LR), that is, a generalized linear model (GLM) with a logit link function $g(\cdot) = \text{logit}(\cdot)$ and a binomial distribution for the binary response variable $Y$. Letting $\mu$ denote the expectation of the response variable $Y$, the structure of a logistic model is:

$$\text{logit}(\mu) = \log\frac{\mu}{1-\mu} = \beta_0 + \sum_{j=1}^{p-1} \beta_j x_j,$$

where $\beta_1, \ldots, \beta_{p-1}$ are the regression parameters that need to be estimated and $\beta_0$ is the intercept. The covariates enter a logistic regression model through the linear predictor $\text{logit}(\mu)$, leading to interpretable effects of the explanatory variables on the response. The RF algorithm correctly classifies 67.37% of the individuals in the test set, which is the highest accuracy among the set of models taken into account.
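For the logistic regression benchmark, the link between the linear predictor and the predicted probability can be made concrete with a short sketch. The coefficients and covariate values below are invented for illustration and are not estimates from the paper.

```python
import math

def inverse_logit(eta):
    """Map the linear predictor eta = logit(mu) back to the probability mu."""
    return 1.0 / (1.0 + math.exp(-eta))

def predict_probability(intercept, coefficients, covariates):
    """mu = inverse_logit(beta_0 + sum_j beta_j * x_j)."""
    eta = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return inverse_logit(eta)

# Hypothetical coefficients, NOT estimates from the paper.
beta0 = -1.0
betas = [0.8, 0.5]   # e.g., two dummy covariates
mu = predict_probability(beta0, betas, [1, 0])
odds_ratio_first = math.exp(betas[0])
print(round(mu, 4), round(odds_ratio_first, 4))
```

The interpretability mentioned in the text comes from the fact that each $\exp(\beta_j)$ is the multiplicative change in the odds of being financially literate for a one-unit increase in $x_j$, holding the other covariates fixed.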
To assess the classification performance over all thresholds, we show the ROC curve for all of the models considered in Figure 3. According to the ROC curve, the GBM model provides the highest AUC and is therefore the best model by this criterion. We can conclude that machine learning can improve the accuracy of some standard parametric models in the estimation of the main determinants of financial literacy.
J. Risk Financial Manag. 2021, 14, x

Variable Importance and Partial Dependence
ML algorithms are usually viewed as a black box because gaining insight into an RF prediction rule is hard due to the large number of trees. One of the most common approaches to extract interpretable information on the contribution of different variables from the random forest consists of the computation of the so-called variable importance measures. Variable importance is determined according to the relative influence of each predictor, by measuring the number of times a predictor is selected for splitting during the tree-building process, weighted by the squared error improvement in the model at each split, and averaged over all trees. We plot in Figure 4 the relative importance of the predictors for different ML techniques. The most important variables are at the top of each plot, and the less important are at the bottom. From the results we observe the predominance of age in determining the FL, especially for the RF and GBM algorithms. Education and financial behavior follow. The gradient boosting machine (GBM) model, which we identified as the best model in the previous section, highlights the importance of financial attitude and financial behavior to explain different levels of financial knowledge. It is also interesting to note that gender is not among the most relevant dimensions to explain Italian adults' financial literacy, whereas the geographical distribution in the national territory is more relevant.
We also define the partial dependence plots to show the marginal effect of the selected predictor on the target variable, averaged over the joint values of the other predictors provided by a tree structure (see Friedman (2001) for further details). The function explaining the partial dependence is:

$$\hat{f}_{x_s}(x_s) = \frac{1}{n} \sum_{i=1}^{n} \hat{f}(x_s, x_{i,C}),$$

where $x_s$ is the variable of interest, $x_{i,C}$ denotes the values of the complementary variables in the dataset for observation $i$, and $n$ is the number of observations.
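The partial dependence function can be computed for any fitted model by fixing the variable of interest at a grid value and averaging the predictions over the observed values of the other variables. In the sketch below the "model" is a made-up function standing in for a fitted GBM, and the variable names and data rows are hypothetical.

```python
def partial_dependence(model, rows, feature, grid):
    """Average prediction with `feature` fixed to each value in `grid`.

    rows:  list of dicts, one per observation
    model: callable taking a dict and returning a prediction
    """
    curve = []
    for value in grid:
        # Overwrite the feature of interest, keep the other variables as observed.
        preds = [model({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical stand-in for a fitted model.
def toy_model(row):
    age_effect = 0.3 if 40 <= row["age"] < 60 else 0.0
    return 0.1 * row["behavior"] + age_effect

rows = [
    {"age": 25, "behavior": 2},
    {"age": 45, "behavior": 5},
    {"age": 70, "behavior": 3},
]
pd_age = partial_dependence(toy_model, rows, "age", grid=[30, 50, 70])
print([round(v, 4) for v in pd_age])
```

In this toy case the curve is higher at the middle grid value than at the two extremes, which is the kind of reverse U-shape in age that a partial dependence plot would reveal.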
Figure 5 illustrates three one-way partial dependence plots for our dataset, obtained with the GBM model (the best model in terms of AUC, as shown in the previous section). The plots show that the most important predictors are age, financial attitude, and financial behavior. The results are in line with the main results of the literature in the field. We confirm the correlation of age with a higher level of FL among working age adults and a lower level for younger and older adults, as described in Section 1. Moreover, there is a clear positive correlation between financial knowledge and both financial attitude and financial behavior. In the latter case, because one of the elements contributing to the good behavior score we used is the purchase of financial assets in the two years before the survey, we can speculate that experience has a positive effect on the acquisition of financial knowledge. In that sense, many studies suggest that experience plays an important role in a person's motivation to become financially literate. For example, Mandell (2008) found that financial education programs that include experiential components have a higher impact; for instance, participation in a stock market game results in a 6-8% improvement in FL among respondents. Frijns et al. (2014) suggest that "people with more financial experience acquire more financial knowledge either through self-education or by becoming more receptive to financial education programmes" (p. 125). Our results do not identify the causal relationship between financial knowledge and behaviours; however, we can speculate that causality runs in both directions: more financially literate people engage in more financial activity and therefore become more experienced, and people learn from their financial experiences and therefore become more literate.
The main implication of these results is that policy makers should consider ways to increase the financial experience of people, through experiences in real-world situations, as a way of improving FL.

Conclusions
One of the main recent developments in financial research is the availability of new administrative, unstructured, micro-level data that are difficult to analyze with traditional econometric models. In this scenario, machine learning techniques can offer the capabilities and functional flexibility needed to identify complex patterns in high-dimensional spaces and datasets. There are clearly advantages and disadvantages for both parametric and ML models. The latter are nonparametric and do not postulate a functional form linking the target variable to the explanatory variables, so their main strength is their high flexibility in learning from data and their high predictive performance. Their main drawbacks are the risk of overfitting and the limited interpretability of the results generated by the algorithms. In contrast, parametric models, such as the generalized linear model (GLM), have the advantages of being parsimonious and easy to interpret and estimate; their drawbacks are that they have limited complexity and generally poorer predictive power.
We do not wish to define which of the two methods, the parametric or ML model, is the best approach. However, this study examined how they can be used and integrated with each other to gain a better understanding of the phenomenon of financial literacy. We demonstrated that the analytical steps of the econometric process, such as the logit analysis that we applied to the Italian data on financial literacy, have a homologous step in ML analyses. By clearly stating this correspondence, we hope that the adoption of ML techniques in the context of financial literacy will be facilitated.
In detail, we tested the improvement in the accuracy in explaining the determinants of FL using not only the decision tree, but also two more powerful ML algorithms: random forest and gradient boosting. Our results demonstrate that the gradient boosting machine methodology outperforms conventional methods. Moreover, ML analyses produce reliable information consistent with the literature, because FL is highly correlated with demographic variables such as educational attainment, age, and household financial fragility. The results of ML models also highlight, in contrast to the traditional parametric model, the importance of financial behaviours in defining the level of financial knowledge. Because we used the INFE-OECD's definition of financial behavior, which accounts for the purchase of financial assets in the two years before the survey, we can speculate that experience has a positive effect on the acquisition of financial knowledge. Therefore, these results have policy implications because they suggest that effective strategies to tackle financial illiteracy should involve experiences in real-world situations. In that sense, banks and financial institutions could play an essential role in the field of education and training in FL (Trunk et al. 2017).
We are conscious that we tested ML models based on a limited case study, using the few available microdata distributed by the Bank of Italy on the levels of adults' financial literacy in Italy. However, we hope that this could be a first step to encourage the adoption of ML techniques in applied economics and among finance researchers and policy makers in the context of financial literacy. We can conclude that machine learning techniques can be valuable as a complement to standard models, which can be further extended in several directions. The ML approaches can be useful for analyzing complex data structures and a large amount of information simultaneously. Thus, they can provide a more nuanced picture of the phenomenon to give policy makers, national bodies, and financial institutions a clearer framework for effectively targeting the problem of financial illiteracy in accordance with the OECD's Recommendation on Financial Literacy. In the era of Big Data, where massive amounts of very high-dimensional or unstructured data are continuously produced and stored, ML techniques provide new opportunities in data analysis, both for exploring the hidden structures and correlation of each variable considered, which traditionally has not been feasible, and for extracting important common features across many subpopulations, even when there are large individual variations. However, substantial efforts are also required for advancement in data collection and the availability of individual information on financial behaviours, attitude, and knowledge at the national and international levels.
Based on our analysis, further applications of machine learning methods to high-dimensional data (big data) on financial literacy, once available, would help in understanding the heterogeneity and commonality in levels of financial literacy across different subpopulations. This is particularly relevant in the wake of the COVID-19 pandemic, which is exacerbating social and economic inequalities globally.
Author Contributions: Conceptualization, S.L. and G.Z.; methodology, S.L.; validation, S.L.; resources, G.Z.; writing-original draft preparation, S.L. and G.Z.; writing-review and editing, S.L. and G.Z. All authors have read and agreed to the published version of the manuscript.