Trusting the Trust Game: An External Validity Analysis with a UK Representative Sample

Using a nationally representative sample of 1052 respondents from the United Kingdom, we systematically tested the associations between the experimental trust game and a range of popular self-reported measures for trust, such as the General Social Survey (GSS) and the Rosenberg scale for self-reported trust. We find that, in our UK representative sample, the experimental trust game significantly and positively predicts generalised self-reported trust in the GSS. This association is robust across a number of alternative empirical specifications, which account for multiple hypotheses corrections and control for other social preferences as measured by the dictator game and the public good game, as well as for a broad range of individual characteristics, such as gender, age, education, and personal income. We discuss how these results generalise to nationally representative samples from six other Organisation for Economic Co-operation and Development (OECD) countries (France, Germany, Italy, Korea, Slovenia, and the US).


Introduction
Experimental games such as the trust game (Berg et al. [1]) have been extensively used by behavioral economists to elicit key social preferences, such as trust and trustworthiness. Together with the dictator game (Forsythe et al. [2]) and the public good game (Andreoni [3]), the trust game is one of the most widely used experimental games of strategic interaction, not only by economists but also by researchers in political science, neuroscience, marketing, game theory, sociology, and psychology.
In the classic trust game by Berg, Dickhaut, and McCabe (BDM [1]), two participants are randomly matched, and one acts as sender while the other one acts as receiver. Both are initially endowed with £10, and the sender decides the amount to send to the receiver, T ∈ [0, 10]. This amount is tripled and transferred to the receiver. Finally, the receiver decides to send an amount Y back to the sender. T normally captures the experimental trust of the sender, whilst Y captures the experimental trustworthiness of the receiver.
A number of studies, systematically reviewed and summarized by Galizzi and Navarro-Martinez [4], have looked at the external validity of the trust game by trying to predict field behaviors or self-reported trust or trustworthiness using experimental trust games (Ashraf et al. [5]; Baran et al. [6]; Bellemare and Kröger [7]; Bennett et al. [8]; Bouma et al. [9]; Cardenas et al. [10]; Carter and Castillo [11]; Ermisch et al. [12]; Fehr et al. [13]; Glaeser et al. [14]; H. J. Holm and Danielson [15]; Karlan [16]; Riedl and Smeets [17]). We review these studies in greater detail in the next section, but their main findings are as follows: whilst some studies find that the experimental trust and trustworthiness measured in the trust game can predict some field behaviours, the evidence is much more mixed and contradictory for predicting survey questions on self-reported trust or trustworthiness.
In this article we contribute to the existing literature by systematically testing the associations between the trust game and a range of popular self-reported measures for trust and trustworthiness, such as the General Social Survey (GSS) question on trust and the Rosenberg scale for self-reported trust. We do so by using data collected by the Organisation for Economic Co-operation and Development (OECD) Trustlab in the UK (Murtin et al. [18]), a large nationally representative sample of the UK population (n = 1052). In our estimates, we control for other social preferences as measured by other games, such as the dictator game and the public good game, and we correct for multiple hypothesis testing.
Our main finding is that trust, measured by the experimental trust game, significantly and positively predicts generalised self-reported trust in the GSS question. To the best of our knowledge, this is the first time that such an association is documented. This is also the first time that such an association is documented using a nationally representative sample. We also add a variety of robustness checks in our estimates, including alternative empirical specifications controlling for a broad range of individual characteristics, as well as a replication of our results using nationally representative samples of six other OECD countries (France, Italy, Germany, Korea, Slovenia, and the US).
The rest of the paper is organised as follows. Section 2 briefly discusses the literature. Section 3 describes the data and methodology. Section 4 presents the results, while Section 5 briefly concludes.

Literature Review
Galizzi and Navarro-Martinez [4] conducted a systematic review of the literature and meta-analysis of all the studies that have looked at the external validity of social preferences games, such as the dictator game, the ultimatum game, the trust game, and the public good game. Focusing on the trust game, they found a total of 13 studies that have analysed the external validity of experimental trust and trustworthiness as measured in incentive-compatible trust games. Their review shows that experimental trust and trustworthiness, in particular, have been used to predict a variety of variables outside of the lab, including survey questions on self-reported trust, and a range of field behaviours as diverse as: hours spent volunteering, per-capita household expenditure, frequency of past trustful behaviour, default on loans, dropping out of loans, voluntary savings, household investment in soil and water conservation, household contribution to soil and water maintenance, outcome per worker, earnings, being active in an organization on a regular basis, giving donations to a business school by respondents who previously studied there for an MBA, participating in social organizations and attending their meetings, and holding a socially responsible investment and its amount.
The studies reviewed by Galizzi and Navarro-Martinez [4] typically test the external validity of the trust games by running regression models where the dependent variables are the field behaviours (including the self-reported measures of trust) and the explanatory variables are the experimental trust games, possibly with further control variables. Those studies interpret the estimated coefficients of the regressions, if statistically significant, as indicators of the fact that the experimental trust games are good predictors of the field behaviours and are therefore 'externally valid'. When assessing the external validity of the experimental trust games, that literature does not typically focus on the overall goodness of fit of the conducted regressions. In their meta-analysis, Galizzi and Navarro-Martinez [4] indeed discuss the extent to which the considered studies report, and comment on, issues broader than just the analysis of the estimated coefficients, such as the use of control variables and the overall goodness of fit of the regressions. From the same systematic review and meta-analysis by Galizzi and Navarro-Martinez [4], it also emerges that no study to date has considered large representative samples of the population, and none have used corrections for multiple hypothesis testing.
Looking at the findings of the systematic literature review and meta-analysis by Galizzi and Navarro-Martinez [4], one can conclude that the evidence on the external validity of the experimental trust and trustworthiness measured in the trust games is quite mixed. Whilst some studies find trust and trustworthiness predict some important field behaviours, the evidence is much more mixed and contradictory when predicting self-reported survey questions on trust. In particular, four studies have shown that experimental trust is unable to predict thirteen questions on self-reported trust (Ashraf et al. [5]; Fehr et al. [13]; Glaeser et al. [14]; H. J. Holm and Danielson [15]). Experimental trustworthiness seems instead able to predict some of the questions on self-reported trust and self-reported trustworthiness: it predicts three out of ten self-reported trust questions and one out of three self-reported trustworthiness questions (Ashraf et al. [5]; Fehr et al. [13]; Glaeser et al. [14]; Holm and Danielson [15]).
So, to summarise, by looking at the evidence to date one can conclude that experimental trust does not predict self-reported trust, and that experimental trustworthiness fares somewhat better at predicting self-reported trust and self-reported trustworthiness.
Galizzi and Navarro-Martinez [4] also critically review and discuss some of the possible arguments and reasons that have been proposed in the literature (e.g., Levitt and List, 2007a,b [19,20]) to explain the lack of external validity of the experimental trust games, and, in general, of the social preferences games: (i) the fact that participants in the lab make decisions under the close scrutiny of an experimenter; (ii) the fact that their decisions are unlikely to be anonymous; (iii) the specific lab setting of the decisions; (iv) the low monetary stakes of the decisions; (v) the characteristics of the participants self-selecting into lab experiments; (vi) the artificiality of the choice sets and of the time horizons; (vii) the lack of repeated interactions over longer periods of time; and (viii) the lack of context in the lab games. Another possible reason is that some of the studies reviewed above (e.g., Fehr et al. [13]; Glaeser et al. [14]) use the strategy method to implement the decisions made by second movers (player 2) in the trust games. All these reasons can be used to argue that the trust games typically used in lab experiments may measure individual constructs that are different from, or more nuanced or complex than, just trust and trustworthiness, and therefore might not be expected to be externally valid.

Survey Design
We use data collected by the OECD Trustlab survey on social preferences on a sample of 1052 respondents in the United Kingdom (Murtin et al. [18]). Respondents were sampled and surveyed online by a professional company in July 2018, and the sample was nationally representative by age, gender, and income. Respondents were compensated for participation, and all experimental tasks in the survey were incentive compatible. The survey lasted approximately 35 min, and participants were recommended to complete it in one sitting. The survey comprised three main modules. In module one, respondents were asked to partake in three incentive-compatible experimental tasks using real monetary rewards, namely a trust game (Berg et al. [1]), a public good game (Andreoni [3]), and a dictator game (Forsythe et al. [2]), to elicit their experimental trust (and trustworthiness), cooperation, and altruism, respectively. Following this, respondents participated in the Binswanger, Eckel, and Grossman (Binswanger [21,22]; Eckel and Grossman [23,24]) multiple-lotteries risk elicitation task, including an adapted variant with higher stakes. In module two, respondents' implicit attitudes towards public institutions were measured using a single category Implicit Association Test (Karpinski and Steinman [25]). Finally, in module three, participants answered a battery of survey questions, including various measures of self-reported trust.
To test the generalization of our findings, we extend our analysis to a pool of 6025 additional respondents from nationally representative samples of six other OECD countries collected as part of the OECD Trustlab project (France, Germany, Italy, Korea, Slovenia, and the US). A list of all the survey questions and variables and their codebook is available in Murtin et al. [18].

Outcome Variables
We use five self-reported measures of trust as our primary outcome variables. These include two generalised measures of trust: the first is based on the standard General Social Survey (GSS) question asking respondents 'Generally speaking, would you say that most people can be trusted or that you need to be very careful in dealing with people?', with answers measured on an 11-point Likert scale; the second question is an adapted version of the Rosenberg [26] question, hereby referred to as the OECD Generalised Trust question, asking participants 'In general, how much do you trust most people?', with answers measured on an 11-point Likert scale as well. The third outcome variable is a binary measure of trust using the hypothetical lost wallet question: 'If you lost a wallet or a purse that contained items of great value to you, and it was found by a stranger, do you think it would be returned with its contents, or not?'. The last two outcome variables are two composite measures of trust, one for institutional trust, the other for interpersonal trust, as obtained by a factor analysis on a battery of self-reported questions measuring trust in a variety of institutions and individuals, respectively. Table 1 below reports the Pearson correlation coefficients among these five trust measures. Our five measures of self-reported trust are positively and marginally significantly correlated with each other (for summary statistics, see Table A1).

Explanatory Variables and Covariates
The two main explanatory variables to predict our self-reported measures of trust are the experimental measures of trust and trustworthiness, as elicited by the Berg, Dickhaut, and McCabe (BDM [1]) trust game. Recall that this game has two participants initially endowed with £10. The sender decides how much to send to the receiver, T ∈ [0, 10]. This amount is tripled and transferred to the receiver. The receiver decides to send an amount Y back to the sender. The final payoffs are (10 − T + Y) for the sender and (10 + 3T − Y) for the receiver. T measures our experimental trust (Camerer et al. [27]), while Y measures our experimental trustworthiness. In the Trustlab games, each respondent played both roles, acting first as sender and then as receiver. First, respondents decided on the amount they wished to send to their unknown counterparts. Second, we elicited trustworthiness with the strategy method: respondents indicated the amount Y they would return for every possible amount T offered by the other player. Our experimental measure of trustworthiness is the average amount sent back. For payment, we randomly matched two players and selected one of the roles for each player.
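As a minimal sketch of the payoff rules and the strategy-method measure described above, the following Python snippet computes both players' payoffs and a respondent's average amount returned. The function names, the integer grid of possible transfers (including T = 0), and the example return schedule are illustrative assumptions and are not taken from the Trustlab implementation.

```python
ENDOWMENT = 10   # initial endowment in pounds, as stated in the text
MULTIPLIER = 3   # the amount sent is tripled before reaching the receiver


def trust_game_payoffs(sent, returned):
    """Return (sender_payoff, receiver_payoff) for transfer T=sent and back-transfer Y=returned."""
    if not 0 <= sent <= ENDOWMENT:
        raise ValueError("sent must lie in [0, 10]")
    if returned < 0:
        raise ValueError("returned must be non-negative")
    sender = ENDOWMENT - sent + returned                 # 10 - T + Y
    receiver = ENDOWMENT + MULTIPLIER * sent - returned  # 10 + 3T - Y
    return sender, receiver


def average_returned(schedule):
    """Strategy-method trustworthiness: mean amount returned across all possible transfers.

    `schedule` maps each possible transfer T to the amount Y the respondent
    says they would return (hypothetical data structure).
    """
    return sum(schedule.values()) / len(schedule)


# Hypothetical respondent who returns 1.5 times whatever is sent.
example_schedule = {t: 1.5 * t for t in range(0, 11)}

print(trust_game_payoffs(sent=5, returned=8))
print(average_returned(example_schedule))
```

Under these assumptions, the two payoffs always sum to 20 + 2T, which makes explicit that sending more increases the total surplus while exposing the sender to the receiver's decision.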
Furthermore, we use three covariates, namely altruism, cooperation, and reciprocity, to control for other social preferences of our respondents. Altruism is measured by how much the respondent offers to a matched participant in a dictator game with an initial endowment of £10. Cooperation is measured as the contribution, from an initial endowment of £10, made in a public goods game. Our subjects played two public good games: a standard one and a conditional one. The conditional public good game allows us to measure reciprocity: respondents had to state how much they would contribute to the public good conditional on the average contribution of other players. In this way we were able to control for whether subjects contributed more or less than the average contribution of others, a notion akin to reciprocity. Table 2 below reports the Pearson correlation coefficients among our various experimental measures of social preferences. Our experimental measure of trust is positively and marginally significantly correlated with trustworthiness, and also with other social preferences, in particular altruism and cooperation. Reciprocity is not correlated with other experimental measures (for summary statistics, see Table A2). We also control for individual risk aversion as measured by respondents' lottery choices in the Binswanger, Eckel, and Grossman (BEG) task. The risk taken by a respondent corresponds to the standard deviation of the payoff from their chosen lottery. Figure 1 below shows the frequency of choosing the six lotteries in the BEG task. The wider grey bars correspond to choices made in the first BEG task where the highest risk attainable is 9 pounds. The narrow grey bars correspond to a second BEG task where the riskiest gamble has a standard deviation of 36 pounds (for details, see Table A3).
The proportion of respondents who are considerably risk-averse increases as the risk involved in the tasks increases: for instance, more respondents chose gambles 5-6 in the second BEG task relative to the first task. We also control for socio-demographic characteristics of respondents (for summary statistics, see Table A4).
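To illustrate the risk measure used as a control, the sketch below computes the standard deviation of a two-outcome lottery's payoff. It assumes that each gamble pays a low or a high amount with equal probability, as in the standard Eckel and Grossman design; the payoff pairs are hypothetical and are not the Trustlab lotteries (for those, see Table A3).

```python
import math


def lottery_sd(low, high, p_high=0.5):
    """Standard deviation of a lottery paying `high` with probability p_high, else `low`."""
    mean = p_high * high + (1 - p_high) * low
    variance = p_high * (high - mean) ** 2 + (1 - p_high) * (low - mean) ** 2
    return math.sqrt(variance)


# Hypothetical gambles as (low payoff, high payoff) in pounds, ordered from safe to risky.
hypothetical_gambles = [(10, 10), (8, 14), (1, 19)]

for low, high in hypothetical_gambles:
    print((low, high), lottery_sd(low, high))
```

With equal probabilities the standard deviation reduces to half the gap between the two payoffs, so a sure payment carries zero risk and wider gambles carry proportionally more.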

Empirical Strategy
Our empirical strategy tests the external validity of experimental trust and trustworthiness by using them to predict a variety of self-reported trust measures. We outline our model specifications to test the following main hypotheses in our UK nationally representative sample:
Hypothesis 1 (H1). Experimental trust significantly predicts self-reported trust measures, and this holds true when controlling for other social preferences.
Hypothesis 2 (H2). Experimental trustworthiness significantly predicts self-reported trust measures, and this holds true when controlling for other social preferences.
To test these hypotheses, we use multiple linear regression models with and without a set of control variables. In testing H1, we estimate model specification (3):
$$\text{Self-Reported Trust}_i = \alpha_2 + \beta\,\text{Experimental Trust}_i + \delta_{11}\,\text{Altruism}_i + \delta_{12}\,\text{Cooperation}_i + \delta_{13}\,\text{Reciprocity}_i + \varepsilon_{3i} \tag{3}$$
In testing H2, we estimate model specification (4):
$$\text{Self-Reported Trust}_i = \alpha_2 + \beta\,\text{Experimental Trustworthiness}_i + \delta_{11}\,\text{Altruism}_i + \delta_{12}\,\text{Cooperation}_i + \delta_{13}\,\text{Reciprocity}_i + \varepsilon_{3i} \tag{4}$$
(Alternatively, we used an additional specification following Wilson [28] to create a truncated measure of experimental trust. This involves a two-step process. We first regress our experimental measure of trust on measures of altruism, cooperation, and reciprocity, and predict the residuals. This residual trust measure is our truncated indicator of experimental trust, and it is not confounded with any other-regarding preferences. Next, we use this truncated measure of residual trust, independent of any other-regarding preferences, to predict our self-reported survey trust measures. We adjust the standard errors to reflect this two-step estimation procedure. Model specification (1) is the first-step regression just described, and model specification (2) is the second-step regression:
$$\text{Self-Reported Trust}_i = \alpha_2 + \beta_4\,\text{Residual Experimental Trust}_i + \varepsilon_{2i} \tag{2}$$
Model specification (2) is repeated for our five different outcome variables, namely the GSS, OECD generalised trust, lost wallet, institutional, and interpersonal trust measures.)
As robustness checks, we also run simple linear regressions that omit the additional covariates included in (3) and (4). We also add further controls for risk aversion and socio-demographic characteristics of our respondents. In the results section, we report p-values corrected for multiple hypothesis testing using the Holm correction procedure (Holm [29]). We have also replicated all the analyses and results using the alternative Romano-Wolf correction for multiple hypothesis testing (Clarke et al. [30]) with no substantial difference in the findings (available on request). We present these findings in the next section.
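The Holm correction mentioned above is a step-down procedure: the p-values are sorted in ascending order, the smallest is multiplied by the number of tests m, the next by m − 1, and so on, with each adjusted value forced to be at least as large as the previous one and capped at 1. The sketch below is a generic implementation of that procedure, not the authors' code, and the p-values in the example are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is scaled by the number of
        # hypotheses not yet processed, then capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for five outcome measures.
raw = [0.01, 0.04, 0.03, 0.005, 0.20]
print(holm_adjust(raw))
```

A hypothesis is rejected at level 0.05 when its adjusted p-value falls below 0.05, which controls the family-wise error rate across the five outcome measures while being less conservative than a plain Bonferroni adjustment.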

Descriptive Statistics
Our sample of respondents is representative by age, gender, and income. Table A4 reports the demographic characteristics of the sample. In line with prior experimental evidence, we find that our sample exhibits trusting behaviour, with a mean experimental contribution of £5.89 (σ = 2.95) in the trust game (experimental trust). In the public goods game, we find an unconditional (cooperation) and conditional (reciprocity) contribution of £6.19 (σ = 2.99) and £0.69 (σ = 1.36), respectively. Finally, in the dictator game, respondents send on average £4.29 (σ = 2.39) to their counterparts (altruism). The density plots of these contributions are shown in Figure 2: in the cases of trust and cooperation we observe a bimodal distribution with peaks in the middle point (£5) and the top (£10); in the case of altruism, we observe three typical responses: give nothing, give half the amount, and give the whole amount. Respondents also exhibit high trustworthiness, with an average of £8.9 (σ = 4.74) returned by each respondent. Furthermore, in line with Wilson [28], we find that trust pays off: senders often receive more than they have paid out and, on average, expectations of receiving a certain payoff are matched (for details, see Figure A1). This is contrary to the seminal work by BDM in which first movers had sent 5.16 USD, while second movers had paid back only 4.66 USD. This led to the proposition that trust does not pay back. A plethora of similar findings in the literature often raises the concern that trust is on the decline (i.e., investment does not yield return at a given time). Nonetheless, like Wilson [28], our findings further demonstrate that trust begets trustworthiness.
Experimental trust varies significantly with age: trust measures are highest in the group of 40-60-year-old individuals. Similarly, levels of cooperation increase with age. Furthermore, trust levels are higher for males (p = 0.0083) and lower for those who identify as Catholics (p = 0.0276). Both cooperation and altruism levels are also seen to be lower for Catholics. Experimental trust and cooperation are decreasing in the size of the household (p = 0.02). There are no significant differences in levels of trustworthiness by these demographic characteristics.

Does Experimental Trust Predict Self-Reported Generalised Trust?
Table 3a presents Pearson correlation coefficients for our experimental and self-reported measures of trust and other-regarding preferences, adjusted using the Bonferroni correction. We find positive correlations between the GSS and the lost wallet self-reported measures of trust and the experimental measure of trust. The former is also positively correlated with the experimental measure of trustworthiness. Table 3b presents our findings using model specification (3) to test for H1. The five columns in Table 3 correspond to different estimations for the five different survey measures of trust, namely, the OECD generalised trust, the General Social Survey, the lost wallet, and the interpersonal and institutional trust. When the estimations account for multiple hypothesis testing, we find that experimental trust significantly and positively predicts self-reported trust as measured by the GSS question. Our results are robust across alternative specifications, such as a simple linear regression without any additional covariates, a specification with further control variables such as risk aversion, and a specification using truncated experimental trust (see Table A5).

We further find that altruism, as measured by the contribution in the dictator game, significantly and positively predicts self-reported trust measures, with the only exception of the lost wallet question: for example, a one standard-deviation increase in experimental altruism is associated with an increase of ~0.13 standard deviations in our generalised measures of trust. On the contrary, cooperation, as measured by the unconditional contribution in the public goods game, significantly and negatively predicts the OECD generalised trust measure: a one standard-deviation increase in cooperation is associated with a ~0.15 standard-deviation decrease in this generalised trust measure. Risk preferences, as measured by the BEG game, do not significantly predict any self-reported trust measures.
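The effect sizes above are expressed in standard-deviation units. As a hedged illustration of how such a coefficient relates to a raw regression slope, the sketch below rescales a slope by sd(x)/sd(y). The data are made up, and the single-regressor setup is a simplification of the multiple regressions reported in the paper.

```python
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardized_slope(x, y):
    """Change in y, in standard deviations of y, per one-standard-deviation increase in x."""
    return ols_slope(x, y) * stdev(x) / stdev(y)


# Hypothetical data: amounts sent in a game (pounds) and 0-10 survey trust scores.
sent = [0, 2, 5, 5, 8, 10]
survey = [3, 4, 5, 6, 6, 8]

print(ols_slope(sent, survey))
print(standardized_slope(sent, survey))
```

The standardized slope does not depend on the units of either variable, which is what allows coefficients on games with different stakes and surveys with different scales to be compared.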
Finally, we generalise our findings to an additional 6025 representative responses from the OECD Trustlab surveys in six other OECD countries, namely, France, Germany, Italy, Korea, Slovenia, and the United States. To this aim, we modify model specification (3) to include country fixed effects. Our findings are presented in Table A5d for three outcome measures of survey trust, which were measured in all these countries. Controlling for country fixed effects in addition to demographic characteristics and other-regarding preferences, we find again that experimental trust significantly predicts generalised trust measured by the GSS. (The standard BEG risk game was played only in Germany, Italy, the United Kingdom, and the United States, while the high-stakes BEG game was played only in the United Kingdom. We do not report results controlling for risk aversion because doing so limits our responses to a few countries. Nonetheless, the results in Table 3 are robust to controlling for risk aversion.) Unlike in Table 1, we also find experimental trust to significantly predict the OECD generalised trust and the lost wallet measures: while a one standard-deviation increase in experimental trust is associated with a ~0.04 standard-deviation increase in the GSS generalised trust, the same is associated with ~0.02 and ~0.01 standard-deviation increases in the OECD generalised trust and lost wallet measures, respectively. These results are robust to multiple hypothesis correction, as indicated in Table A5. In line with our prior findings, the predictive power of altruism towards the OECD, GSS, and lost wallet trust measures is robust to our country fixed-effects specification: a one standard-deviation increase in altruism is associated with a ~0.1 standard-deviation increase in survey trust measures, ceteris paribus.
A joint F-test comparing the larger model with country fixed effects to the nested model without such effects returns a p-value below 0.001 for each of the three outcome measures of self-reported trust, indicating that the country fixed-effects model fits better than its nested counterpart. Table 4 presents our findings using model specification (4) to test H2. In this case, in the specification with control variables, we further control for reciprocity, as measured by the conditional contribution in the public good game, which is often found to be correlated with trustworthiness.
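The nested-model comparison reported above rests on the standard F-statistic, F = ((RSS_r − RSS_u) / q) / (RSS_u / (n − k)), where RSS_r and RSS_u are the residual sums of squares of the restricted and unrestricted models, q is the number of restrictions (here, the added country dummies), n is the sample size, and k is the number of parameters in the unrestricted model. The sketch below computes this statistic; all numbers in the example are hypothetical and are not taken from the paper's estimates.

```python
def nested_f_statistic(rss_restricted, rss_unrestricted, n_obs, k_unrestricted, n_restrictions):
    """F-statistic for testing a restricted linear model against a larger one that nests it."""
    if n_restrictions <= 0:
        raise ValueError("n_restrictions must be positive")
    if n_obs <= k_unrestricted:
        raise ValueError("need more observations than parameters")
    if rss_unrestricted > rss_restricted:
        raise ValueError("the unrestricted model cannot fit worse than the restricted one")
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - k_unrestricted)
    return numerator / denominator


# Hypothetical inputs: six country dummies added to a model that then has 12 parameters.
f_stat = nested_f_statistic(
    rss_restricted=1100.0,
    rss_unrestricted=1000.0,
    n_obs=1012,
    k_unrestricted=12,
    n_restrictions=6,
)
print(f_stat)
```

The p-value is the upper-tail probability of this statistic under an F distribution with (q, n − k) degrees of freedom; it is not computed here because that requires a statistics library, for example `scipy.stats.f.sf`.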

Does Experimental Trustworthiness Predict Self-Reported Generalised Trust?
The main result is that experimental trustworthiness does not significantly predict the self-reported trust measures. It should be noted here that our experimental measure of trustworthiness is derived from decisions made by second movers (player 2) in the trust game, elicited with the strategy method. Whilst the use of the strategy method is common in the literature (Fehr et al. [13]; Glaeser et al. [14]), it may introduce some measurement error in our experimental measure of trustworthiness. Our finding that experimental trustworthiness does not significantly predict self-reported trust measures can be explained by an attenuation bias related to the fact that our only explanatory variable (experimental trustworthiness) is indeed affected by such measurement error.
Table notes. Legend: Holm's * p < 0.05; ** p < 0.01; *** p < 0.001. Notes: superscript a. Linear Probability Model; superscript b. Includes age, highest educational qualification, country of nativity, household income, household size, gender, religion, and region of residence in the UK.
In line with our findings in Table 3, altruism significantly predicts survey trust measures, with the exception, again, of the hypothetical lost wallet. We further find that reciprocity, as measured by conditional contribution to the public goods game, significantly and negatively predicts the OECD generalised measure of trust and also institutional trust. Our results are also robust to a simple linear regression specification without any additional covariates (see Table A5b).
Finally, we also generalise our analysis of the relationship between experimental trustworthiness and the survey trust measures using country fixed effects. We find a significant and positive correlation between the two measures, akin to our findings in Table 2. This correlation between experimental trustworthiness and the GSS trust measure is robust to controlling for experimental trust: conditional on controls, a one standard-deviation increase in experimental trustworthiness is, on average, significantly and positively associated with a 0.05 standard-deviation increase in generalised trust.

Conclusions
We have systematically tested the associations between the experimental trust game and a range of popular self-reported measures for trust and trustworthiness, such as the General Social Survey (GSS) question on trust and the Rosenberg scale for self-reported trust. Innovatively, we have done this using a large, nationally representative sample of the UK population (n = 1052), controlling in our estimates for other social preferences as measured by other games such as the dictator game and the public good game, and correcting for multiple hypothesis testing.
Our main finding is that trust, as measured by the experimental trust game, significantly and positively predicts generalised self-reported trust in the GSS question. We have also conducted a variety of robustness checks in our estimates, including alternative empirical specifications controlling for a broad range of individual characteristics, as well as a replication of our results using nationally representative samples of six other OECD countries (France, Italy, Germany, Korea, Slovenia, and the US). To the best of our knowledge, this is the first time that evidence is provided that trust, as measured in an experimental trust game, significantly predicts self-reported trust. This is also the first time that such an association has been documented using a large nationally representative sample. This finding sheds a new, positive light on the debate about the external validity of social preferences games (Campos-Mercade et al. [31]; Charness and Fehr [32]; Galizzi and Navarro-Martinez [4]; Levitt and List, 2007a,b [19,20]).
Data Availability Statement: More information about the OECD Trustlab project, its variables, data and results is publicly available here: https://www.oecd-ilibrary.org/economics/trust-and-its-determinants_869ef2ec-en (accessed on 2 September 2021).

Conflicts of Interest:
The authors declare no conflict of interest.