How to Measure Financial Literacy?

Abstract: Financial (il-)literacy and its effects have been studied extensively in recent years. The measurement of this concept is, however, tricky and numerous measurement instruments exist. In this paper, we study the connection between these measures empirically. We find that these measures are often only slightly related and that this is a so-far overlooked empirical problem in this field. As a result of our analysis, we suggest the combination of two measures as the best potential alternative to the existing measures. Finally, we analyze the predictive power of this suggested measure for stock investment decisions.


Introduction
Financial literacy is defined as a "person's competency for managing money", according to Remund (2010), or, according to a more comprehensive definition, as "a combination of awareness, knowledge, skill, attitude and behaviour necessary to make sound financial decisions and ultimately achieve individual financial well-being" (INFE 2011). It is a necessary precondition for reasonable financial decisions and, thus, for financial well-being. Lack of financial literacy and its causes have been discussed by academics and practitioners alike, see, e.g., Reference Bucher-Koenen and  and Burke and Manz (2014). Furthermore, the effects of financial illiteracy have also been investigated in numerous studies (Bianchi 2018; Fernandes et al. 2014; Gathergood and Weber 2017; Lusardi and Mitchell 2014; Van Ooijen and van Rooij 2016; Van Rooij et al. 2011). In all these studies, the measurement of financial literacy is of central importance. This is a non-trivial task (Marcolin and Abraham 2006), for which various scales have been applied. Many studies have been criticized for not adequately defining the concept of financial literacy (Huston 2010). The underlying assumption seems to be that financial literacy can be measured in essentially every possible way and would still lead to the same general conclusions.
In this paper, we aim to investigate the following question: Do different scales of financial literacy correlate, or do they measure entirely different variables that are not much related? Hung et al. (2009) have already studied this issue for a small set of measures, but we want to consider more (and also newer) measures and take a step forward to illustrate which of these scales best represent the overall effect and whether an appropriate combination of scales can improve on this. Knoll and Houts (2012) used Item Response Theory (IRT) as a tool for selecting the relevant items from a list of existing questions of a larger survey.
Our approach differs from the previous ones in that we start with a number of scales that have been applied in the literature to measure financial literacy and conduct a survey that includes these financial literacy scales. None of these scales has been developed by the author of this article, so the analysis can be conducted freely and without any danger of conflicting interests.
We will see that the relationship between these different measures is weaker than one might expect, which raises concerns with regard to their application. Based on this analysis, we will derive a general recommendation for the measurement of financial literacy. Then, we will test whether the proposed scale can be used as a predictive instrument for investment decisions and for beliefs about whether stocks yield good long-term returns. This scale will also be compared to a scale of self-assessed financial literacy, which proved to be an appropriate instrument neither for measuring financial literacy nor for predicting outcomes of financial decisions, while the suggested scale performs very well in this context.
The novelty of our study is that we measure financial literacy using multiple scales within one subject population (which, so far, has been done only for a small number of measures, by Hung et al. 2009). This allows us to test the various existing scales against each other. Moreover, we derive practical recommendations from this analysis and test the ability of the resulting scale to predict actual financial decisions.
The remainder of this paper is structured as follows: In Section 2, we present the methodology of our research, in particular the survey design and the subject pool. In Section 3, we discuss the results, describing the financial literacy questions by topic and then following the research agenda outlined above. Conclusions and recommendations are presented in Section 4.

Survey Design
Our survey consists of three parts:

• In the first part, we introduce various financial literacy scales that have been proposed in the literature. Of course, it would be unfeasible to include every single question that has been raised with regard to this topic in the past, but we identified eight typical and important articles in this field (Alexeev et al. 2014; Anderson et al. 2017; Bianchi 2018; Burke and Manz 2014; Ćumurović and Hyll 2019; Gathergood and Weber 2017; Lusardi and Mitchell 2011; Van Ooijen and van Rooij 2016).
• In the second part, we measure a number of control variables, in particular self-assessed financial literacy, attitudes towards money (Fünfgeld and Wang 2009), and demographics.
• Finally, we pose hypothetical (incentivized) investment questions and elicit general ideas on long-term investment (see Appendix A for details).
The survey was programmed in Unipark and administered online. As announced in the advertisement for the survey, one of the subjects was randomly selected to receive a monetary prize after the survey had closed. The exact amount was determined by chance and by the subject's answer to the (risky) investment decision, as specified below.

Subjects
The survey was conducted with the participation of two subject samples, with the first sample containing 129 participants and the second sample containing 149 participants. The only difference between the samples was that the questionnaire for the second sample included several additional questions. For the purposes of this paper, we will pool both groups.
The survey was advertised at a university, which is why the majority of the participants (89%) are students. However, there were also subjects (21%) who were already employed (at least part-time). In the advertisement, we did not reveal that the survey was about financial literacy. Instead, we mentioned that the questions included "knowledge questions" but, in order to reduce self-selection effects, also made it clear that the chances of winning were not influenced by the given answers. The percentage of women in the survey (62%) exceeded that of men. Very few subjects had work experience in the financial sector (7%), and only a fifth studied economics (or a related subject), so it was no surprise that the self-assessed financial literacy level was rather low (around 3 on a scale from 1 to 7); see Table 1 for details.

Analysis of Financial Literacy Questions
To get a better understanding of the financial literacy scales proposed in the literature, we first classified the questions into five categories: specific financial knowledge, financial mathematics, inflation rates, mathematics, and cognitive reflection.¹ These topics were not evenly distributed, and none of the scales covered all of these topics; see Table 2 for an overview. The table makes it clear that the scales might be too narrow to cover every aspect of financial literacy. On the other hand, there is the possibility that these topics are so closely intertwined that measuring one of them will already be enough to capture the full spectrum. And of course, our characterization is not comprehensive, and the items of one scale might be broader than our overview suggests.
¹ We were, indeed, surprised to find questions that have nothing to do with financial literacy as such, for example questions that are pure math problems or the classical cognitive reflection test (CRT) from Frederick (2005). In fact, the use of CRT in the scale proposed by Ćumurović and Hyll (2019) seemed to us so out of place in the given context that we also used a modified version of this scale which did not include CRT.

Comparison of Measurements
A good scale should be able to distinguish between different levels of financial literacy. A scale on which all participants get all (or no) questions right would not be very useful. Table 3 shows, however, that this is not a major issue, with the probable exception of the scale proposed by Lusardi and Mitchell (2011), which might have been too easy for our subjects. A comparison of the proportion of correct answers in our sample with other studies would be of interest, but, given the inherent differences between subject groups, we refrain from it in this paper.
Table 3. Proportions of correct answers for various measures of financial literacy.

Table 3 columns: Measurement; Correct Answers.
Another point to examine is the reliability of the proposed scales. To this end, we calculated Cronbach's Alpha for all eight scales. Generally, high values are desirable, with 0.5 typically being considered as the smallest acceptable value. Yet, given that some of the scales comprised very few items, we cannot expect Cronbach's Alpha to be too high, so that even a value of 0.4 could be considered sufficient in this case. We found, however, that some of the scales failed to reach this low threshold, which suggests that they might measure other concepts along with financial literacy (compare Table 4). Accordingly, the scale proposed by Ćumurović and Hyll (2019), with or without CRT, was most reliable, while we consider the scale proposed by Lusardi and Mitchell (2011) still to be acceptable, given that it consists of only three items.
Table 4. Reliability of financial literacy measurements and cognitive reflection (CRT). Values of Cronbach's Alpha above 0.4 are in italics; values above 0.6 are in bold.
Reliability, however, is not sufficient for a good scale: A scale can be highly reliable, but, if it measures something entirely different, it cannot successfully serve our purposes. Therefore, we tested the correlations between the various financial literacy measures (see Table 5). The good news is that most measures do significantly correlate with each other. Thus, despite the fact that the questions included in the scales usually focus on different topics, the scales are still related to each other. The correlation coefficients, however, are often relatively small and sometimes not statistically significant. This means that it matters which scale we use.
Next, we calculated the correlations between the scales and the scores in the five categories mentioned above (specific financial knowledge, financial mathematics, inflation rates, mathematics, and cognitive reflection). In most cases, the correlations were statistically significant (see Table 6). There were, however, considerable differences between the correlation coefficients (see Table A1 for a robustness check).
To illustrate the relationship between the existing measures and to find out which of them are the most representative ones, we used multidimensional scaling (PROXSCAL, PROXimity SCALing), which allows us to depict these relationships graphically (Figure 1). The graph suggests that the scales proposed by Lusardi and Mitchell (2011) and Ćumurović and Hyll (2019) might be most representative of the various financial literacy scales.
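The reliability coefficient discussed in this section can be illustrated with a minimal sketch. It assumes that every item has already been scored as 0 (wrong) or 1 (correct); the function name `cronbach_alpha` and the toy responses are hypothetical and are not taken from the survey data. The sketch uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), where k is the number of items.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's Alpha for a list of respondents.

    rows: one list of item scores (e.g. 0/1) per respondent,
          all of the same length k >= 2.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each single item across respondents.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    # Variance of the total score (sum of all items) across respondents.
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: six respondents, three items (illustration only).
toy_rows = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
toy_alpha = cronbach_alpha(toy_rows)
print(round(toy_alpha, 3))
```

Because the same variance estimator is used in the numerator and the denominator, the choice between population and sample variance does not change the result.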

Suggested New Measure
In the previous sections, we came to the conclusion that the scales proposed by Lusardi and Mitchell (2011) and Ćumurović and Hyll (2019) are the ones best suited for measuring financial literacy. We also found that the removal of CRT questions from Ćumurović and Hyll (2019) does not impair the overall performance of the scale (compare Table 4 and Table 5). Thus, we decided to combine the non-CRT part of the scale proposed by Ćumurović and Hyll (2019) with the scale proposed by Lusardi and Mitchell (2011). The resulting scale consists of six items, which is a reasonable quantity of items that enables easy measurements. The reliability of the scale is good with Cronbach's Alpha being 0.62.
The correlation of the suggested scale with other scales is also good, as shown in Table 7. We also find that the correlation of the suggested scale with the total financial literacy score, which is calculated as the total of the questions included in the eight scales, is excellent (above 80%, see Table 7). Multidimensional scaling with these additional variables shows that the combined measure gives an even better representation of financial literacy than the scales individually (Figure 2). The percentages of correct answers for the individual items in the combined scale are between 60% and 90% (see Table 8).
Table 8. The six items of the CL scale and the percentage of correct answers for these items.

Item | Correct Answers
Buying a single share is safer than buying an equity fund. True or false? | 76.3%
You have 100 EUR on your savings account with 2% interest per year. How much will you have after 5 years if you let your money grow? | 90.6%
Your savings account earns 1% interest per year, and inflation amounts to 2% per year. How much can you buy after one year with the money in your savings account? More than today/The same as today/Less than today. | 81.3%
Which investment normally has the largest fluctuations? Savings account/Fixed-interest securities/Shares | 87.8%
Which of the following statements best describes the main task of the stock market? The stock market predicts stock profits./The stock market leads to an increase in stock prices./The stock market brings together potential buyers and sellers./None of the 3 statements. | 62.2%
Which of the following statements is correct? Once you have invested in a mutual fund, you cannot withdraw the money in the first year/Investment funds can invest in several assets, e.g., shares and bonds/Investment funds pay a guaranteed return, which depends on the past performance/None of the 3 statements. |
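The construction of the combined score and its correlation with a total score can be sketched as follows. This is only an illustration under stated assumptions: the two parts are assumed to contribute three already-scored (0/1) items each, and the respondents, the field names `lm`, `ch`, and `total`, and the total scores are hypothetical values, not survey data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sequence with zero variance has no correlation")
    return sxy / sqrt(sxx * syy)


def combined_score(lm_items, ch_items):
    """Number of correct answers over both parts of the combined scale."""
    return sum(lm_items) + sum(ch_items)


# Hypothetical respondents: 'lm' = Lusardi-Mitchell items,
# 'ch' = non-CRT Cumurovic-Hyll items, 'total' = score over all scales.
respondents = [
    {"lm": [1, 1, 1], "ch": [1, 1, 1], "total": 30},
    {"lm": [1, 1, 0], "ch": [1, 0, 0], "total": 18},
    {"lm": [1, 0, 0], "ch": [0, 0, 0], "total": 9},
    {"lm": [1, 1, 1], "ch": [1, 1, 0], "total": 24},
]

cl_scores = [combined_score(p["lm"], p["ch"]) for p in respondents]
total_scores = [p["total"] for p in respondents]
r = pearson_r(cl_scores, total_scores)
print(cl_scores, round(r, 3))
```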

Dependence of Measurement on Other Factors
In the following, we will study several characteristics of the new combined Ćumurović-Lusardi measure ("CL" for short). In particular, we want to find out whether self-assessed financial literacy is related to CL and how demographics (age, gender, education) and cognitive reflection affect CL. Finally, we will also test whether the attitude of individuals towards money is correlated with this financial literacy measure.
As seen in Table 9, CL is higher for males, students, working people, and those having a university degree.² There is no significant correlation between the CL scale and age. Given that the sample is very homogeneous in terms of age, it is no surprise that we do not find age effects. Thus, this finding might be attributable to the composition of our subject pool and should be tested on a more heterogeneous sample.
We also find that CL is not higher for students majoring in economics, but it is higher for students having a university degree. This might appear surprising, but a possible explanation for this is that our undergraduate programs in Business and Economics do not include a finance class in the first year of studies. This means that many students who took our survey had not taken finance classes before. Moreover, the finance class in the second year of studies concentrates mostly on corporate finance and, therefore, might not considerably promote the enhancement of general financial literacy. Subjects with a Bachelor's degree, on the other hand, do exhibit a higher level of financial literacy.
In addition, further analysis reveals a seemingly surprising result: there is no significant correlation between CL and self-assessed financial literacy (with a Pearson correlation coefficient of 6%). Does this signal a potential error of the new scale?
To find this out, we should take a look at other financial literacy measures for comparison: Burke's and Anderson's scales show modest significant correlations with self-assessed financial literacy (21% and 18%, respectively), but all other measures show no significant correlation. It appears that self-assessed financial literacy and measured financial literacy are barely related.
A possible explanation for this interesting observation could be that people tend to compare their knowledge level with that of their peers with similar backgrounds, and, as a result, they report a "relative" level of knowledge. This can also explain why there is no effect of education on self-assessed financial literacy, while we have already observed such an effect for the measured financial literacy.
² The effects of education on the other financial literacy measures are shown in Figure A1 in Appendix A.
To sum up, one should be cautious about trusting self-assessed financial literacy in surveys.
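Whether correlation coefficients of this size are statistically significant can be checked with the usual t-statistic for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)). The following minimal sketch assumes that all n = 278 pooled subjects answered the relevant questions (an assumption, since the number of valid answers per measure is not stated) and approximates the two-sided p-value with the standard normal distribution, which is reasonable only for large n. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def correlation_t_stat(r, n):
    """t-statistic for testing that a Pearson correlation r is zero."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n < 3:
        raise ValueError("need at least three observations")
    return r * sqrt((n - 2) / (1 - r * r))


def approx_two_sided_p(t):
    """Two-sided p-value using a normal approximation (large n only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


N_ASSUMED = 278  # assumption: pooled sample (129 + 149), no missing answers

# Correlations with self-assessed financial literacy reported in the text.
reported = {"CL": 0.06, "Burke": 0.21, "Anderson": 0.18}
p_values = {
    name: approx_two_sided_p(correlation_t_stat(r, N_ASSUMED))
    for name, r in reported.items()
}
for name, p in p_values.items():
    print(name, "significant at 5%" if p < 0.05 else "not significant at 5%")
```

Under these assumptions, the sketch agrees with the pattern described above: the two modest correlations pass the 5% threshold, while the correlation of 6% does not.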
How is financial literacy related to the attitudes towards money (Fünfgeld and Wang 2009)? Of the five dimensions (anxiety, need for saving, interest in financial issues, free-spending, and intuitive decisions), only interest in financial issues, anxiety and intuitive decision-making should be a priori related to financial literacy. (Anxiety about financial issues could discourage subjects from familiarizing themselves with this topic, which is needed to increase financial literacy. Intuitive decision-making would be indicative of a less thoughtful approach towards financial affairs, which would again prevent people from making more effort to enhance their financial literacy.) Indeed, in our correlation analysis, we find that these three factors are significantly related to CL, while the same cannot be said about the other two factors (see Table 10). With the exception of one scale (Burke), none of the other financial literacy measures correlate significantly with these three scales.

Predictive Power of the New Scale: Stock Investment and Belief in Stock Returns
Within the framework of this study, we have defined a more reliable financial literacy scale that seems to perform well. But we have not discussed its predictive power for actual behavior and attitudes.
To test this, we included two relevant items regarding stock market investment in our survey. The first was an incentivized question on asset allocation in which the subjects could select stocks, and the second asked the subjects to state to what extent they agreed with the statement that stocks yield good returns in the long run. Table 11 shows that financial literacy, as measured with the CL scale, does indeed have an impact on these survey items. This is the case even after controlling for a number of other items, including CRT, demographics, and self-assessed financial literacy. In fact, CL is the only variable that has a strong significant impact on all of these items.
Table 11. CL can explain stock market investment decisions and general attitudes towards long-term returns. Self-assessed financial knowledge has an explanatory power for the latter, while gender affects only the investment decisions. The effect is robust when controlling for other demographics and for attitudes towards money.

Table 11 columns: Stock Investment (%); Stocks as a Good Long-Term Investment.
When comparing the external validity of CL with the external validity of other financial literacy measures (see Tables 12 and 13), we see that some of them seem to have no significant external validity with regard to stock investments or the fact of viewing stocks as good long-term investments. In fact, only three scales (Burke, Lusardi, Ooijen) perform as well as CL. We do not claim that this alone proves the superiority of the CL scale because other external validity tests could be conducted that might produce different results. The lack of significance for some of the measures, however, may be indicative of crucial limitations.
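What "controlling for" other variables means in such an analysis can be sketched with a small least-squares regression. The sketch assumes an ordinary least-squares specification, which the text does not confirm, and it uses constructed toy data in which the outcome depends only on the financial literacy score and not on self-assessed literacy. The function `ols` and all values are hypothetical.

```python
def ols(X, y):
    """Least-squares coefficients for y ~ X via the normal equations.

    X: list of rows; each row must already start with 1.0 (intercept).
    """
    p = len(X[0])
    # Build A = X'X and c = X'y.
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    c = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for k in range(col, p):
                A[r][k] -= factor * A[col][k]
            c[r] -= factor * c[col]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        s = sum(A[i][k] * b[k] for k in range(i + 1, p))
        b[i] = (c[i] - s) / A[i][i]
    return b


# Constructed toy data (illustration only, not survey data).
cl_score = [1, 2, 3, 4, 5, 6, 2, 5]        # measured literacy, 0..6
self_assessed = [3, 5, 2, 4, 1, 6, 2, 3]   # self-assessment, 1..7
# Stock share in percent, built to depend on cl_score only.
stock_share = [10 + 5 * c for c in cl_score]

X = [[1.0, c, s] for c, s in zip(cl_score, self_assessed)]
coefs = ols(X, stock_share)  # [intercept, effect of CL, effect of self-assessment]
print([round(b, 6) for b in coefs])
```

In this constructed example, the coefficient on the self-assessed score is (numerically) zero once the measured score is included, which is the kind of pattern a regression with controls is meant to reveal.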

Conclusions
We have seen that typical financial literacy measures are mostly significantly, but not always strongly, correlated. We also found that some of the scales proposed in the literature have a low reliability. By putting together parts of the two measures that were most representative of the financial literacy questions in our study, we have proposed a short (six-item) and powerful scale that has a significant predictive power for actual financial behavior and attitudes.
For now, we would recommend using this combined scale ("CL scale") to measure general financial literacy.
There are obvious limitations to our study: in particular, we studied "only" eight measures proposed in the literature, even though there are many other measures. There are, however, limitations to the feasibility of a study where every subject has to answer many questions regarding financial literacy. Of course, it would be of great value to have direct connections to real-life financial decisions in the data, rather than only hypothetical ones. However, it seems difficult to simultaneously administer a lengthy questionnaire and obtain field data on financial decision-making. Nevertheless, this approach might be promising for future work and could help to identify or propose measures that have a higher predictive power for actual financial decision-making.
Future studies could test whether the CL scale can be improved, e.g., by assigning different weights to its items or by reducing the number of items. We refrained from doing this, since we did not aim to provide an optimal scale but, rather, to point out the general problem of weak correlation between certain scales and to suggest a reasonable alternative, which would be as close as possible to the existing frequently-applied scales.
Another line of future research may study the question of how financial literacy measures vary among different demographic groups, and especially which tests have a higher discriminatory power for which group, e.g., a certain measure might be too easy for more knowledgeable individuals, and other measures might be too difficult for less knowledgeable individuals. Our results point out such differences, but further studies are needed.
We hope that our work is a step in the right direction and that we have been able to illustrate the limitations of currently applied measures and give some practical advice on what scale should be used to measure financial literacy in a more reliable way.
Funding: This research was partially funded by the State of Rhineland-Palatinate through the research initiative at the University of Trier.
One of the subjects was randomly selected at the end and obtained the payoff that their chosen investment would have earned in the scenario described above, where the starting year was decided by chance and the payoff was computed according to historical data. In this way, the question was incentivized.
The general ideas about long-term investment were computed based on the following two items: "In the long run, shares have a good return." and "Investments in shares are only something for gamblers."³ The subjects could indicate to what extent they agreed or disagreed with each of these statements using a 5-point Likert scale from "Strong approval" to "Strongly dismissive". The difference between these two answers was then coded as "stocks as a good long-term investment".
Figure A1. Effects of education on financial literacy measures. For most (but not all) measures in our sample, a university entrance qualification (in Germany, this usually corresponds to an "Abitur") increases financial literacy, but a university degree does not.
³ The original German version used the word "Zocker", which has a much more negative connotation than the English word "gamblers".