Biogeographic Ancestry, Cognitive Ability and Socioeconomic Outcomes

The cause(s) of ubiquitous cognitive differences between American self-identified racial/ethnic groups (SIREs) is uncertain. Evolutionary-genetic models posit that ancestral genetic selection pressures are the ultimate source of these differences. Conversely, sociological models posit that these differences result from racial discrimination. To examine predictions based on these models, we conducted a global admixture analysis using data from the Pediatric Imaging, Neurocognition, and Genetics Study (PING; N = 1,369 American children). Specifically, we employed a standard methodology of genetic epidemiology to determine whether genetic ancestry significantly predicts cognitive ability, independent of SIRE. In regression models using four different codings for SIRE as a covariate, we found incremental relationships between genetic ancestry and both general cognitive ability and parental socioeconomic status (SES). The relationships between global ancestry and cognitive ability were partially attenuated when parental SES was added as a predictor and when cognitive ability was the outcome. Moreover, these associations generally held when subgroups were analyzed separately. Our results are congruent with evolutionary-genetic models of group differences and with certain environmental models that mimic the predictions of evolutionary-genetic ones. Implications for research on race/ethnic differences in the Americas are discussed, as are methods for further exploring the matter.


Psych 2019, 1, 1-25; doi:10.3390/Psychology1010001; www.mdpi.com/journal/psych

Introduction
Cognitive ability, whether measured by IQ tests, Piagetian tests, educational/scholastic tests (e.g., PISA, TIMSS, etc.), or other indices of cognitive functioning, differs on average between biogeographic ancestry groups (BGAs). (We use the term "cognitive ability" instead of "general intelligence" or "general cognitive ability" to make clear that we are not committing to a claim about the psychometric nature of the underlying construct. For the purpose of this study, the distinctions between cognitive ability in general and general cognitive ability are not important.) According to Shriver and Kittles [1], biogeographic ancestry is the personal genetic history, indexed by ancestrally informative autosomal markers, which reflects an individual's overall ancestry with respect to "population groups" (also referred to as "clusters," "ancestral groups," and "ancestral populations"). These reference biogeographic ancestry groups differ owing to the effects of evolutionary factors, such as "isolation by distance," and barriers that "have all affected human migration and mating patterns in the past." Biogeographic ancestry groups have commonly been called "races" (see [2,3]). Examples of these groups include East Asians, Europeans, sub-Saharan Africans, and Amerindians (see [4]). Cognitive ability differences appear both between nations (e.g., in predominantly European versus predominantly African national populations), and between self-identified racial and ethnic (SIRE) groups within nations (e.g., Whites versus African Americans in the USA [5][6][7]).
The cause(s) of inter- and intra-national cognitive ability gaps is highly uncertain and hotly debated, with some researchers arguing that genetic differences are substantially or even predominantly responsible for at least some of these gaps (e.g., [8]) and others maintaining that non-genetic environmental factors fully account for the gaps (e.g., [9,10]). Nevertheless, in the most recent survey of intelligence researchers, "genes were rated as the most important cause (17%) [of international differences], followed by educational quality (11.44%), health (10.88%), and educational quantity (10.20%)" [11]. Further, 90% of the surveyed experts believed that international differences were at least in part genetic in origin. Rindermann, Coyle, and Becker [12] also reported that 83% of survey respondents believed that the Black-White cognitive gap in the USA is partially due to genetic differences between the relevant populations. While there certainly is no general consensus among researchers, a substantial number of them believe that both international and certain intra-national SIRE differences in cognitive ability have a partial genetic basis [11,13].
Although racial and ethnic groups within a country can differ behaviorally for a variety of reasons (e.g., selective self-identification or local selective migration), possible genetic bases of these differences are typically analyzed in an evolutionary framework. When discussing evolved human diversity, one can conceptualize groups in a number of different ways, e.g., as subspecies (taxonomically significant subdivisions of a species), ecotypes (environmentally adapted types), clines (character gradients), and morphs (alternative phenotypes in a population). Following the work of Charles Darwin, communities are frequently delineated by propinquity of descent, since descent is understood as inductively potent. The preferable term to describe descent-based groups (e.g., variety, genetic population, race, genetic cluster, ancestry group, etc.) is a matter of ongoing semantic dispute. Here we call them "biogeographic ancestry groups" (again, BGAs), as is frequently done in genetic epidemiology. There are phenotypic and genetic differences between these groups, with evolutionary forces acting over relatively short spans of time likely having caused this divergence [14]. Importantly, these ancestry groups are delimitable using genomic data.
The cause(s) of inter-BGA differences in cognitive ability specifically is the subject of several evolutionary theories, which decompose broadly into two categories: (1) Pleistocene-selection models and (2) Holocene-selection models. The former posit the action of evolutionarily novel factors that differed systematically in the regions to which human groups migrated after leaving Africa around 60 to 100 kya (thousand years ago), and which specifically selected for increased cognitive ability as a mechanism for enhancing survival [15]. Salient evolutionarily novel challenges likely included the presence of seasonality, specifically cold winters during the main Würm glaciation event 60 kya, when temperatures in Europe and (especially) Northeast Asia were considerably lower on average than they are today. Extreme cold, coupled with the challenge of provisioning for the future so as to anticipate seasonality (e.g., food storage), has been proposed as a source of selection for higher cognitive ability and other somatic traits related to innovation and productivity [16,17]. These climatic models predict that Northeast Asian and European BGA groups will have higher levels of cognitive ability than Pacific Islanders, sub-Saharan Africans, and Amerindians (at least from the tropical and subtropical regions of the Americas).
Research in comparative zoology supports climatic models of this kind. A large body of research has investigated the climatic correlates of cognitive ability (usually measured with brain size as a proxy variable) in non-human animals. Non-migratory birds have been a primary choice of study [18][19][20][21] given their range of habitat and the ease of studying them. Studies have revealed the expected patterns, namely that birds that live further north and in more seasonally affected areas have larger brains (controlling for body size), more flexible behavior, and more innovative behavior [18,20,22].
Holocene-selection models stress the observation that the rate of adaptive evolution during the Holocene (beginning roughly 12 kya) was on the order of 100 times that experienced by humans during the preceding Pleistocene epoch; thus, the evolutionary factors crucial to the origin of these differences may have arisen relatively recently [14]. Culture-gene co-evolution theory posits that cognitive ability and cultural complexity arose in tandem via a feedback process. Eurasian populations' transition into sedentarism during the late Pleistocene and early Holocene would have been associated with a substantial increase in cognitive challenges, such as those related to competition over settled land, management of agriculture, and sustainment of higher standing population densities [14,23]. Innovations in modes of production and social organization (such as intensified division of labor) would have led to hierarchical societies involving strong individual-level competition for finite resources, which may have been a major source of social selection favoring higher levels of cognitive ability. Clark [24,25] documents the persistence of downward social mobility among the descendants of elites who competed for limited economic and social resources, with the less competitive sinking into lower-status occupational niches. The end result of this process was a persistent "bootstrapping" of the population, as cognitive ability (and other salient "bourgeois" traits) rose across all levels of the social hierarchy, leading to greater degrees of industriousness and innovativeness. This may in turn have boosted the competitiveness of certain European, and perhaps Northeast Asian, BGAs, permitting group-level expansion (or colonialism), which likely enhanced the biocultural "corporate" fitness of these populations, a process that may only have ceased in the West in the mid-19th century.
Around this time, the advent of milder climates and concomitantly reduced ecological stress (corresponding to the end of the Maunder Minimum), coupled with both social and technological innovations that allowed those with relatively high cognitive ability to control their fertility, potentially reversed the fitness advantage of those with relatively high cognitive ability compared to those with relatively low cognitive ability [26]. Given that these Holocene-selection models posit that Eurasian populations had the greatest exposure to environmental challenges favoring the fitness of those with relatively high cognitive ability, they predict that selection for cognitive ability was stronger in Eurasian populations than in non-Eurasian ones, such as Pacific Islanders, sub-Saharan Africans, and Amerindians.
A further evolutionary theory of inter-BGA cognitive differences is the disease-stress model. This has assumed forms that posit [27] and do not posit [28] genetic differences to explain inter-BGA disparities in cognitive ability. Models of the latter type propose that in the Pleistocene environment of evolutionary adaptedness, humans were repeatedly exposed to periods of high and low parasite stress. They further maintain that this exposure to variable levels of parasite stress favored the evolution of epigenetic adaptations that dynamically regulate tradeoffs of bioenergetic investments into brain/cognitive development or immune system functioning in response to environmental challenges, with greater investments into the latter over the former occurring in high-parasite-stress contexts. Therefore, when humans radiated into novel, more northerly and easterly low-parasite-stress ecologies, they came pre-adapted with the capacity to developmentally trade immune system functioning for higher cognitive ability, which is reflected in very strong inverse ecological correlations between national IQ and parasite-burden indices [28]. Given the global distribution of parasite burdens, this model predicts, as do the other evolutionary theories, that the populations of Europe and North East Asia have higher cognitive ability relative to those of the Pacific Islands and the tropical and subtropical regions of the Americas, as well as the populations of sub-Saharan Africa. But this version of the disease-stress model is clearly inadequate to account for the total global pattern of inter-BGA cognitive ability variation, since it seemingly cannot explain BGA cognitive ability gaps found within nations, where BGA groups do not differ substantially in their exposure to parasites (since they inhabit the same general environments that the nations encompass). Therefore, our statistical analyses do not endeavor to test predictions of this disease-stress model.
Conversely, genetic disease-stress models posit that globally variable geographical factors covary with disease burdens, and that genetic adaptations to local disease burdens have partially generated inter-BGA differences in social outcomes and related phenotypes, including cognitive ability [27].
While our data do not permit any direct test of this, or any other, particular evolutionary account of the emergence of inter-BGA cognitive ability variation, global admixture analytic findings indicative of a genetic etiology of such variability would be consistent with all the evolutionary theories reviewed here, apart from the epigenetic disease-stress model.

Genomic Studies of Selection in Humans
There are several fairly well-documented cases of recent simple selection in humans. Examples include skin tone and high-altitude adaptations. The study of highly polygenic selection via changes in frequencies of many alleles (soft sweeps) in humans has advanced recently due to the availability of large-scale, powerful computer clusters and large genomic datasets. A number of recent studies based on genomic data have found evidence of recent polygenic selection in humans over time [29,30] and space [31][32][33][34]. These findings indicate selection for, e.g., height [31], body mass [32], cognitive ability [33], educational attainment [34,35], and schizophrenia [36].
One significant concern with genomic studies that compare polygenic score (PGS) frequencies between major ancestry groups is the "transethnic validity" of PGS [37]. Most genome-wide association studies (GWASs) are done solely on people of European ancestry (to avoid population-stratification confounding). These studies unfortunately yield predictive models that are less useful in non-Europeans, particularly Africans [38,39]. This problem results from two factors. First, the genetic variants discovered in GWASs are typically not causal (with respect to the trait of interest) but are so-called "tag variants." These are variants that are close enough on the genome to the causal variants such that their presence in the population is statistically linked (known as linkage disequilibrium; LD). Second, the degree to which two variants on the genome are in LD partly depends on the populations studied and their genetic distance; this is because random genetic patterns can arise over time (i.e., genetic drift in LD).
The effect is that a given variant which tags a causal variant in Europeans may not do so, or not very well, in other populations, thus reducing the validity of European-derived estimates of the true PGS. As a result, the predictive validity of associated variants in one major BGA group frequently does not transfer to others. While a number of methods have been employed to control for drift-related effects [33,36], and while robust or partially robust inter-BGA differences in educational attainment and intelligence PGS have been found, until causal variants that are not affected by LD [38] are identified, PGS-based results will continue to carry a degree of uncertainty.

Admixture Analysis
Admixture analysis (the analysis of genetic ancestry in previously isolated but recently interbred populations, which relates genetic ancestry to outcomes) is a potent tool used by medical geneticists for the exploration of the source of trait differences and disease disparities in and among admixed populations. The relationships between genomic ancestry and phenotype are treated as indirect evidence of genetic causation, especially when confounds are controlled in regression analysis (e.g., [40][41][42]). Admixture analysis includes global admixture analysis and admixture mapping. When ancestral BGA groups vary in the frequency of genetic variants underlying a trait, in admixed populations the phenotype of interest will be correlated with BGA in genomic regions near the causal genetic variants. This situation allows for the identification of associated loci, a process called admixture mapping [43,44]. When the trait has a complex genetic architecture, one where thousands of loci are assumed to contribute to the phenotype, an appropriate first step is global admixture analysis. This process seeks to identify associations between global BGA and phenotype, without attempting to identify local regions of a genome associated with a phenotype. One advantage of global admixture analysis is that it requires much smaller sample sizes than admixture mapping.
Templeton [45] has intricately detailed the logic of global admixture analysis as applied to evolutionary-genetic models. Generally, global admixture analysis within ethnic groups can be viewed as a Mendelian "common garden" experiment, since members of the same ethnic groups experience similar cultures and environments. Factors affecting SIRE groups in general (e.g., stereotype threat, race-based discrimination, segregation, cultural norms, dialect, language, etc.) are either controlled for or attenuated. This basic logic has been adopted by genetic epidemiologists (e.g., [40][41][42][46]), who frequently examine the association between global genetic ancestry and traits either within ethnic groups or after controlling for SIRE. Admixture analysis has been applied to study the etiology of inter-BGA differences in, among other traits, height [47], sleep behaviors [48], and brain morphology [49]. The quantitative predictions and theoretical basis of global admixture analysis have often not been explicated. Doing so requires a series of simulations that incrementally increase in model complexity and realism. We do this in Appendix 1 of Supplementary File 1, which contains the simulations and explanations.
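The regression form of this design (a phenotype regressed on a global ancestry fraction with SIRE as a covariate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six data points are invented and noise-free, the coefficients are arbitrary and carry no empirical meaning, and `ols` is a hypothetical helper written here rather than any function used by the authors.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows (intercept column included); y is a list of outcomes.
    The system is solved with Gauss-Jordan elimination and partial pivoting.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical toy data (NOT from PING): (ancestry fraction, SIRE dummy).
people = [(0.95, 0), (0.99, 0), (0.90, 0), (0.20, 1), (0.35, 1), (0.10, 1)]

# Arbitrary coefficients used only so the fit can be checked exactly.
TRUE_B = (1.0, 2.0, -0.5)  # intercept, ancestry, SIRE

X = [[1.0, anc, sire] for anc, sire in people]
y = [TRUE_B[0] + TRUE_B[1] * anc + TRUE_B[2] * sire for anc, sire in people]

intercept, b_ancestry, b_sire = ols(X, y)
print(round(intercept, 6), round(b_ancestry, 6), round(b_sire, 6))
```

Because the SIRE dummy absorbs the difference in group means, the ancestry coefficient is identified only from variation in ancestry among individuals who share a SIRE category, which is the sense in which the design controls for SIRE.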
Before the advent of genomic ancestry testing, attempts were made to test evolutionary predictions with respect to inter-BGA cognitive ability differences using other methods (for reviews, see [50]). Ancestry was estimated using traits such as skin color, blood groups, and reported genealogy. In other words, these studies relied on various proxies or poor measures of genomic ancestry, and further the samples were usually small; moreover, no one has systematically meta-analyzed the data to settle the interpretive dispute. More recent studies have utilized nation-level data on the frequencies of Y-chromosomal haplogroups [23,51], estimates of between-country genetic distance [52], and estimates of regional and national-level ancestry percentages [5] as predictors of national variation in cognitive ability. These recent studies have been restricted to the national level of analysis, which necessarily reduces sample resolution. As a result of these problems and others, there has been disagreement about whether past studies employing these techniques show, in admixed populations, the expected relations between BGA and cognitive ability [8,53,54].
Thus far, no published study has examined genomic ancestry, assessed using genomic era methods, in a large sample and related the findings to cognitive ability, while taking into account the confounding effect of racial/ethnic identity. One study [55], using the same sample presently examined, reported ANOVA results for genetic ancestry and test scores. The authors, however, did not report coefficients for specific ancestries, did not use general factor scores, and did not examine the effect of ancestry within SIRE groups. The last of these is a critical aspect of the so-called "common garden" experiment.
A large meta-analysis of pan-American epidemiological studies found that European genetic ancestry was robustly associated with better socioeconomic outcomes relative to African and Amerindian ancestry in admixed populations (European: r = 0.18, k = 28, N = 35,476.5; [56]). Amerindian and African ancestry were related to poorer socioeconomic outcomes: r = −0.14, k = 31, N = 28,937.5 and r = −0.11, k = 28, N = 32,710.5, respectively. Consistent with evolutionary models, these associations were found within admixed ethnic groups, and when the effect of SIRE was statistically controlled for. Given these associations between genetic ancestry and SES, and the moderate-to-strong relationship between cognitive ability and SES [57], it is likely that BGA is also associated with cognitive ability at least partially independent of SIRE. A related finding is that BGA is a strong predictor of regional cognitive and general socioeconomic outcomes across the Americas [5,6,58], both at the country level and at the level of first-order divisions (e.g., province, state, district). This provides ecological support for the expectation that individual BGA will be robustly associated with cognitive ability.
The purpose of the current study is to test the prediction that BGA is associated with cognitive ability after statistically controlling SIRE (which indexes a number of factors commonly invoked to explain cognitive gaps between BGAs, such as racial discrimination), and to quantify the magnitudes of the individual-level associations. More generally, we sought to apply global admixture analysis, via genomic data, on American samples. We aimed to determine if associations between genetic ancestry and IQ are more consistent with commonly proposed evolutionary-genetic or social environment-based explanations for mean cognitive ability differences observed between European and East Asian descent groups relative to African, Amerindian, and Oceanian ones. Based on convergent results from pre-genomic-era studies, national-level analyses of ancestral markers, genomic PGS research, individual-level SES-admixture results, and regional-level SES/cognitive ability-admixture results, we expect that results similar to those found for SES-admixture will be found for cognitive ability.

Material and Methods
Data used in this study were obtained from the Pediatric Imaging, Neurocognition, and Genetics (PING) database (http://ping.chd.ucsd.edu/). The primary aim of PING was to create a well-standardized and carefully organized resource of magnetic resonance imaging (MRI) data, comprehensive genotyping data, and developmental and neuropsychological assessments for a large cohort of developing children. The PING data were based on a large sample (N = 1,391 with genetic ancestry data) of healthy American children of diverse ancestries [55]. Participants were not nationally representative, but were rather recruited from the greater metropolitan areas of Baltimore, Boston, Honolulu, Los Angeles, New Haven, New York, Sacramento, and San Diego. Individuals who had medical conditions that could affect their development were excluded from the recruitment process.

Cognitive Ability
Participants were given the National Institutes of Health (NIH) Toolbox cognitive tests. These tests have previously been validated [59]. The cognitive sub-domains assessed by the tests, along with the subtests' psychometric characteristics, are detailed by Akshoomoff et al. [55]. The seven cognitive tests were as follows: Dimensional Change Card Sort Test (Card Sort); Flanker Inhibitory Control and Attention Test (Flanker); Picture Sequence Memory Test (Picture Sequence); Pattern Comparison Processing Speed Test (Pattern Recognition); Oral Reading Recognition Test (Oral Reading); List Sorting Working Memory Test (List Sort); and Picture Vocabulary Test (Vocabulary).
Some data were missing (4% of cells; 14% of cases had at least one missing data point). We imputed values for cases that were missing fewer than half of their data points, rounded down (i.e., three or fewer of the seven subtests). Imputation was performed using IRMI [60,61]. After this, there were 1,369 cases with complete cognitive data and genetic ancestry (with 1,370 cases for some subtests).
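The eligibility rule for imputation can be written out as a small sketch. It only encodes the threshold stated in the text (seven subtests, at most three missing); the function name and the use of `None` for a missing score are assumptions for illustration, and the IRMI imputation itself is not reproduced here.

```python
N_TESTS = 7
MAX_MISSING = N_TESTS // 2  # half of seven, rounded down: 3


def retained_for_imputation(scores):
    """Return True if a case has few enough missing subtests to be kept.

    `scores` is a list of N_TESTS values in which None marks a missing score
    (a hypothetical encoding chosen for this sketch).
    """
    missing = sum(1 for s in scores if s is None)
    return missing <= MAX_MISSING


three_missing = [1.2, None, None, None, 0.4, -0.3, 0.9]
four_missing = [1.2, None, None, None, None, -0.3, 0.9]
print(retained_for_imputation(three_missing), retained_for_imputation(four_missing))
```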
Next, the effect of age was regressed out by first fitting a local regression model (LOESS; see [62]) with age as the predictor and then saving the residuals. LOESS was used for age because this approach is able to capture complex non-linear patterns in the data. For this regression, we used age at neuropsychological testing as the primary variable and we filled in missing data (n = 3) with age at MRI (total = 1,369 cases with g scores). The mean age was 11.75 (SD = 4.88; range = 3-21). Supplementary Figure S1 shows the distribution.
These residuals were then residualized again for sex using a standard linear model (OLS). As shown in Table 1, the age/sex corrected scores exhibited a positive manifold. The factor loadings of the subtests are shown in Table 1.
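The two-step adjustment (age removed with a local regression, then sex removed with a linear model) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: `loess_fit` is a basic local linear smoother with tricube weights and no robustness iterations, the span `frac=0.75` is an arbitrary choice, and the ages, sexes, and scores are invented. The sex step subtracts group means, which is what an OLS fit on a single dichotomous predictor reduces to.

```python
def loess_fit(x, y, frac=0.75):
    """Fitted values from a local linear regression with tricube weights."""
    n = len(x)
    k = max(2, int(round(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        # Weighted least-squares line through the neighbourhood.
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def residualize_on_group(values, groups):
    """Residuals from regressing values on a categorical variable (group means)."""
    means = {}
    for g in set(groups):
        vals = [v for v, gi in zip(values, groups) if gi == g]
        means[g] = sum(vals) / len(vals)
    return [v - means[g] for v, g in zip(values, groups)]


# Hypothetical data: a score that rises with age plus a constant sex offset.
ages = [3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
sexes = ["F", "M", "F", "M", "F", "M", "F", "M", "F", "M"]
raw = [2.0 * a + 1.0 + (0.5 if s == "M" else -0.5) for a, s in zip(ages, sexes)]

age_resid = [r - f for r, f in zip(raw, loess_fit(ages, raw))]
adjusted = residualize_on_group(age_resid, sexes)
```

After both steps, the adjusted scores average zero within each sex, and a score that depends on age alone through a straight line leaves no residual at all.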
Finally, we factor analyzed (minimal residual factor analysis/PAF) the adjusted data for the seven tests and extracted a general cognitive ability (g) factor [63], using the psych 1.6.9 package [64]. The pattern of factor loadings looked normal and the g factor explained 26 percent of the variance. Using this method, g scores for 1,369 individuals were derived. Scores on this factor were saved and standardized (M = 0, SD = 1) for further analysis. We were then faced with the choice of whether to standardize the scores to the White subsample or whether to use the entire sample. We chose to use the full sample because this would counterbalance the likely cognitive ability/social status selection of the sample due to urban sampling. Sample sizes and descriptive statistics for the factor analysis are provided in Tables S13-S16 of Supplementary File 2. The intercorrelations and g loadings were low compared to those found with standard IQ batteries; for example, the fourteen Woodcock-Johnson-III subtests showed an average g loading of 0.60 [65], while our seven subtests showed an average of 0.50 (0.37 to 0.57). These results, though, were comparable to those found in some adult datasets, such as the Human Connectome Project, which used similar subtests [66]. As the g scores for this sample predict parental SES no less well than typical IQ scores, and as they exhibit more or less the same magnitudes of ethnic differences as found using standard IQ batteries, there is no strong reason to suspect that these scores are particularly unreliable.
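As a rough plausibility check on the figures above: for standardized subtests and a single factor, the proportion of total variance explained equals the mean of the squared loadings. The sketch below uses invented loadings chosen only to match the reported range (0.37 to 0.57) and average (0.50); they are not the values in Table 1.

```python
def proportion_variance_explained(loadings):
    """Mean squared loading: the share of total variance of standardized
    variables that a single factor accounts for."""
    return sum(l * l for l in loadings) / len(loadings)


# Hypothetical loadings (NOT the published ones); they average 0.50.
loadings = [0.37, 0.45, 0.50, 0.52, 0.54, 0.55, 0.57]

mean_loading = sum(loadings) / len(loadings)
pve = proportion_variance_explained(loadings)
print(round(mean_loading, 2), round(pve, 3))
```

Since the mean of squares is never smaller than the square of the mean, an average loading of 0.50 implies at least 25% of variance explained, which is in line with the reported 26 percent.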
It is possible that there may be measurement invariance (MI) issues between SIRE groups. We do not attempt to explore this issue here, as it is not directly relevant to the hypothesis we test. The question we investigate is whether socioeconomic and cognitive differences between SIRE groups are, to some degree, statistically explainable by genetic ancestry, not whether the differences have the same sociological and psychometric meaning as differences between individuals within SIRE groups. More generally, we conceptualize our g scores as statistical constructs which represent summaries of observed subtest scores. We do not imply that these score differences index differences in a causal latent general factor ("biological g"; [67]). The same consideration applies with respect to our general SES scores. As such, more detailed analyses regarding the relation between observed scores, latent factors, and SIRE/genetic ancestry are unnecessary for testing the hypothesis (Dolan, pers. comm., 1 January 2018).
Nonetheless, to assess factorial invariance, we computed Tucker's congruence coefficients for standard SIRE groups with sample sizes greater than 100. The congruence coefficient matrix is shown in Table 2. The congruence coefficient is a measure of factor similarity, with 0.95 or greater indicating virtually identical to identical factors and 0.85 to 0.94 indicating fair factor similarity [68]. The coefficients were below 0.95 for only two comparisons, Asian-African American and Asian-Hispanic. Note that these results do not directly address the issue of factor invariance with respect to genetic ancestry.
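For reference, Tucker's congruence coefficient between two loading vectors x and y is the sum of their cross-products divided by the square root of the product of their sums of squares. A minimal sketch follows; the two loading vectors are invented for illustration and do not come from Table 2, and `interpret` simply encodes the cut-offs quoted above.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def interpret(phi):
    """Label a coefficient using the thresholds cited in the text [68]."""
    if phi >= 0.95:
        return "virtually identical"
    if phi >= 0.85:
        return "fair similarity"
    return "not similar"


# Hypothetical g loadings for two groups on the same seven subtests.
group_a = [0.40, 0.45, 0.50, 0.52, 0.55, 0.48, 0.57]
group_b = [0.38, 0.50, 0.47, 0.55, 0.51, 0.44, 0.60]

phi = tucker_congruence(group_a, group_b)
print(round(phi, 3), interpret(phi))
```

Unlike a Pearson correlation, the coefficient is not mean-centered, so it is unchanged when one vector is multiplied by a positive constant but is sensitive to the overall level of the loadings.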

Parental Socioeconomic Status
The following SES variables were available: household income, guardian 1 educational level, guardian 2 educational level, guardian 1 occupational level, and guardian 2 occupational level. For 95% of the participants for whom relationship data were available, guardian 1 was either a biological mother or father (missing data: 2%; self: 1%), and for 92% of the participants for whom relationship data were available, guardian 2 was either a biological mother or father (missing data: 32%). Since guardians were most often parents, SES is referred to as parental SES. As before, some data were missing. To maximize the available sample size, cases with at least one data point were imputed. Finally, variables were factor analyzed to extract a general SES factor. Scores (N = 1,380; 1,363 also with g scores) were saved and standardized (M = 0, SD = 1) for further analysis. As before, the standardization was done on the full sample, rather than the White subset. Intercorrelations and descriptive statistics for the factor analysis are provided in Tables S17 and S18 of Supplementary File 2.

SIRE and Genetic Ancestry Percentages
Genetic ancestry percentages were available in the PING data files (see [55] for details). These were calculated using over 15,000 single-nucleotide polymorphisms or SNPs (i.e., single DNA base pair variations). To assess ancestry percentages, supervised clustering analysis with the ADMIXTURE algorithm was used. Ancestry was assigned to six major biogeographic clusters corresponding to indigenous Europeans, Africans, Americans, East Asians, Oceanians, and Central Asians. Ancestry percentages for a given individual summed to one, with each individual being assigned a percentage with respect to each component. A total of 1,391 individuals in the sample had genetic ancestry data. In addition to genetic ancestry percentages, multiple-option, dichotomous SIRE variables were available. These were based on the five major SIRE racial categories in the United States (White, African American, American Indian, Asian, and Pacific Islander) and the one US SIRE ethnic category (Hispanic). Some cases had missing SIRE data (1% of the cells), and these were imputed as "false." A dummy variable, "Other," was also created and set to "true" for participants who selected no SIRE.
Owing to the multiple-response nature of the SIRE categories, there were a number of different methods for defining them (for discussion, see [69]). Since concern has been raised about the impact of researcher "degrees of freedom" [70] and about assigning people to SIRE categories [69], we coded SIRE groups in four different ways. We did this to assess how coding decisions affected outcomes. The codings were: standard coding (for which separate categories are created for Hispanic and non-Hispanic multi-racial individuals along with each of the five racial identities), common combination coding (for which a category is created for each of the unique combinations of SIRE identities), continuous or interval coding (for which individuals are assigned a percentage of each SIRE category based on the number of responses chosen), and dummy coding (for which each of the six ethnic and racial categories is used as a dichotomous categorical predictor). The methods are discussed in more detail in Supplementary File 2. Table 3 shows the sample characteristics for the full sample (which had g scores) and by standard SIRE groups. A one-way ANOVA indicated a significant effect of SIRE on age for the eight SIRE categories [F(7, 1,361) = 7.42, p < 0.01]. A chi-square test also indicated a significant effect of SIRE on sex [χ²(7, N = 1,369) = 21.372, p < 0.01]. As the effects of sex and age were regressed out of cognitive ability, and as sex and age have no direct effect on genomic ancestry or parental SES, no further steps were taken.
Note: N is the number of cases with g scores; the number additionally with socioeconomic status (SES) scores may be smaller. SD = standard deviation.
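Three of the four codings can be illustrated with a short sketch operating on the set of SIRE boxes a participant ticked. It rests on assumptions: continuous coding is taken to split weight equally across the selected categories (1/k each for k selections), the combination label format is invented, and standard coding is omitted because its assignment rules are given only in Supplementary File 2. None of the function names come from the authors' materials.

```python
CATEGORIES = ["White", "African American", "American Indian", "Asian",
              "Pacific Islander", "Hispanic"]


def dummy_coding(selected):
    """One 0/1 indicator per category; multiple indicators may be 1."""
    return {c: int(c in selected) for c in CATEGORIES}


def continuous_coding(selected):
    """Equal share of each selected category (assumed to be 1/k for k picks)."""
    chosen = [c for c in CATEGORIES if c in selected]
    if not chosen:
        return {c: 0.0 for c in CATEGORIES}
    share = 1.0 / len(chosen)
    return {c: (share if c in chosen else 0.0) for c in CATEGORIES}


def combination_coding(selected):
    """A single label for each unique combination of selected categories."""
    chosen = [c for c in CATEGORIES if c in selected]
    return " + ".join(chosen) if chosen else "Other"


participant = {"Asian", "White"}
print(dummy_coding(participant))
print(continuous_coding(participant))
print(combination_coding(participant))
```

A participant who selects two categories therefore contributes two indicators under dummy coding, two half-weights under continuous coding, and one joint category under combination coding, which is why the codings can yield different regression estimates for the same data.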

English as First Language
For 11% of the sample, English was not the participant's first language. As cognitive tests could be biased against non-native speakers, a dichotomous English as First Language (EFL) variable was also included in the models.

SIRE and Genomic Ancestry
It is known that genetic ancestry varies substantially within some US SIRE groups but not others [71]. To examine the distribution of genetic ancestry by SIRE in the current dataset, means and standard deviations of genetic ancestries were calculated by standard SIRE. The results are shown in Table 4 (Table S12 of Supplementary File 2 reports more descriptive statistics for the genetic ancestries). As expected, Whites were almost entirely genomically European (97%), as has been found in previous studies of US citizens (e.g., [71]). Other groups showed more admixture. For these, there were substantial discrepancies between SIRE and genetic ancestry. These discrepancies allow for the admixture analyses. The correlations between continuous SIRE coding [for description, see Section 2.3 above] and genetic ancestry are shown in Table 5. It can be seen that continuous SIRE coding is a reasonably good index of genetic ancestry. Since US SIRE groups are fairly non-admixed, much of the variance is explained by recent patterns of exogamy which can be captured by this type of coding.

Bivariate Relationship between Genetic Ancestry, Cognitive Ability, and Parental SES
The correlations between cognitive ability, parental SES, and each of the genetic ancestries are shown in Table 6. The correlations were roughly as would be expected given the well-documented cognitive and socioeconomic differences between US SIRE groups. European ancestry had a moderate positive correlation with both cognitive ability and SES (r = 0.23 and 0.32, respectively), while negative relationships were seen for African (r = −0.33 and −0.30), Amerindian (r = −0.15 and −0.24), and Oceanian (r = −0.08 and −0.20) ancestries. The genomic ancestry × SES correlations are larger than those previously reported by Kirkegaard et al. [56], presumably because this sample is not decomposed by SIRE group, leading to reduced restriction of range and thus higher correlations. The remaining ancestries had very small or inconsistent relationships. Not much should be made of the bivariate analysis for ancestry and outcomes because it confounds the effects of multiple ancestries, as well as the effects of non-genetic causes. In particular, the ancestry variables are necessarily negatively related in general because every individual's ancestry proportions must sum to one (see Tables S8-S11 of Supplementary File 2 for correlation matrices by subpopulation).
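The point about the sum-to-one constraint can be checked with a small simulation. Because the components of each row add up to a constant, each component's covariances with the remaining components must sum to the negative of its own variance, so negative correlations necessarily appear. The sketch below uses simulated Dirichlet-distributed proportions for three hypothetical components; the parameters are arbitrary and are not estimates from the PING data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated proportions for three hypothetical components; each row sums to one.
# The Dirichlet parameters are arbitrary and are not estimates from PING.
props = rng.dirichlet(alpha=[5.0, 2.0, 1.0], size=5000)

# Correlation matrix between the three components (columns).
corr = np.corrcoef(props, rowvar=False)
off_diag = corr[~np.eye(3, dtype=bool)]

print(np.round(corr, 2))
print("all off-diagonal correlations negative:", bool((off_diag < 0).all()))
```

This is why bivariate ancestry correlations cannot be read independently of one another: a positive association for one component mechanically pushes the others toward negative associations.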
Previous research has found a modest relation between parental SES and cognitive ability [57]. In our sample, the relation (r = 0.42) was somewhat stronger than typically reported. This increased strength of association is probably in part due to the use of a general socioeconomic factor, which is a more reliable measure, as opposed to individual measures of parental socioeconomic status.

Main Analyses
The key question is whether the bivariate associations between genetic ancestry and outcomes follow ancestry or racial/ethnic identification. This question was explored in two ways. First, the full sample was examined with SIRE being statistically controlled (4.1). Second, subgroup analyses were conducted (4.3). The analyses complement one another. The full sample analysis provides more power. Additionally, it allows for a robust exploration of the relation between parental SES, cognitive ability, and genetic ancestry (4.2). The subsample analyses allow focus to be directed to specific groups to see if the full sample associations replicate in subsamples.

Relationship between Genetic Ancestry, Cognitive Ability, and Parental SES
The main regression analysis strategy consisted of fitting models with SIRE (Model 1) and SIRE and genetic ancestry (Model 2). As there were two outcomes (cognitive ability and parental SES) and four methods of coding for SIRE, there were eight tables. The results from the standard and continuous SIRE models are shown in Tables 7-10. The results based on the common SIRE and the dummy SIRE methods were largely redundant with respect to those based on, respectively, the standard SIRE and the continuous SIRE methods, and so are presented in Supplementary File 2 (Tables S1-S4). The unstandardized beta coefficients are shown with both White SIRE and European ancestry as the reference classes (and, thus, unstandardized betas of 0). Since cognitive ability and SES scores are already standardized, these unstandardized betas represent the change in cognitive ability/SES, in standard deviation units, associated with a change in the percentage of a given ancestry. The adjusted coefficient of determination, denoted as R²-adj., is reported to facilitate model comparison. As individuals were assigned ancestry percentages for each genetic ancestry group, the sample sizes are the same for each component. Across genetic ancestries, the standard errors varied substantially as a function of variance in admixture. Since one of the objectives was to determine the relative utility of including genetic ancestries as a predictor, ancestry components with high standard errors were not dropped (we did this, however, for some subgroup analyses below; see Section 4.3 for discussion).
In general, the results of the four models yielded consistent patterns (for computations, see Tables S1-S4 in Supplementary File 2). Across all four models (the two sets reported above and the two reported in the supplementary file), African ancestry is consistently associated with large negative betas (cognitive ability: mean −1.46, range −1. 
English as a first language is associated with consistent but weakly positive betas in the genetic ancestry models (cognitive ability: mean 0.04, range 0.03 to 0.05; SES: mean 0.13, range 0.11 to 0.16). These latter findings might reflect small effects of language bias in the tests, as well as the effects of factors related to parental immigration status. Finally, consistent with other research, continuous SIRE coding was a better predictor of cognitive ability and SES than was standard SIRE coding. And, as expected given the covariance between continuous SIRE and genetic ancestry, the use of continuous SIRE reduced the incremental predictive ability of genetic ancestry.
As the SIRE effects had wide confidence intervals, interpretation warrants caution. That said, the African American SIRE betas are universally positive across codings in the genetic ancestry model (cognitive ability: mean 0.24, range 0.09 to 0.31; SES: mean 0.26, range 0.02 to 0.42), with the exception of the "Hispanic, African American" common combination group (cognitive ability: −0.38; SES: −0.40). For Hispanic SIRE groups, the situation was reversed in that these predictors were uniformly related to worse outcomes (cognitive ability: mean −0.31, range −0.46 to −0.17; SES: mean −0.20, range −0.28 to −0.11). Since these SIRE effects are independent of genetic ancestry, they may reflect the effects of social favoritism/discrimination, or of SIRE-related cultural factors. Alternatively, they could be the product of selective ethnic attrition (e.g., [72]) and genetic differences resulting from ethnic leakage (see, e.g., [73]).
The results were robust to controls for MRI scanner location, a proxy for geographic location, with each site entered as a dummy variable (results not shown).
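The Model 1 versus Model 2 comparison used in this section rests on the adjusted coefficient of determination, which penalizes added predictors. The following is a minimal sketch of that comparison on synthetic data only. The variable names (`group`, `proportion`), the coefficients, and the sample size are hypothetical and carry no empirical meaning; the study's own models include more predictors and are documented in Supplementary File 2.

```python
import numpy as np

def adjusted_r2(y, X):
    """Fit OLS with an intercept and return the adjusted R^2."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    p = design.shape[1] - 1  # number of predictors, excluding the intercept
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Synthetic data only: a binary group label and a proportion that varies
# within groups. Coefficients are arbitrary and carry no empirical meaning.
rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, size=n)
proportion = np.where(group == 1,
                      rng.uniform(0.5, 1.0, n),
                      rng.uniform(0.0, 0.1, n))
y = 2.0 * proportion + rng.normal(0.0, 1.0, n)

# Model 1: categorical label only. Model 2: label plus continuous proportion.
model1 = adjusted_r2(y, group.reshape(-1, 1))
model2 = adjusted_r2(y, np.column_stack([group, proportion]))
print(f"Model 1 adj. R^2 = {model1:.3f}; Model 2 adj. R^2 = {model2:.3f}")
```

In this toy setup the continuous predictor adds explanatory power beyond the categorical label only because it varies within groups, which is the same reason within-group variation in admixture is needed for the analyses reported here.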

Relationship between Genetic Ancestry, Cognitive Ability, and Parental SES
Parental SES is a potential confound in models where children's cognitive ability is an outcome. This is particularly the case given the moderate heritability of cognitive ability in childhood/early adolescence. In this case, it could be argued that associations between genetic ancestry and cognitive ability are a consequence of those between parental genetic ancestry and parental SES. Figure 1 shows a theoretical path model of the variables. (We did not run a path model for this diagram as we lack trans-ethnically valid polygenic scores (PGS) for cognitive ability.) Accordingly, offspring ancestry is a function of parental ancestry and is correlated with offspring trait PGS. Likewise, (1) parental ancestry is correlated with parental trait PGS, (2) offspring PGS is a function of parental PGS, (3) parental trait is a function of parental PGS and social environment, (4) parental socioeconomic status is a function of both parental trait and social environment, and (5) offspring trait is a function of parental provided environment and parental provided genotype.
Without longitudinal data, disentangling causal pathways is impossible, since controlling for parental SES has the effect of controlling for an indeterminate portion of those behavioral traits directly transmitted by parents. It also controls for what is sometimes referred to as "genetic nurture" [74], which refers to genetic effects from parents on offspring that condition offspring traits by way of nurturing (e.g., the socioeconomic environment provided). This caveat noted, the regression results for standard SIRE are reported in Table 11 (Tables S5-S7 show the results for the common combination, dummy SIRE, and continuous method). The betas for the genetic ancestries were reduced in size, especially in the case of Amerindian ancestry, but the directions were consistent with previous results.
For ease of summary, the relative importance of the predictors across ancestries and across the four models was quantified by calculating etas from ANOVA using sum of squares type II (following the methods detailed in Supplementary File 2, Appendix 1). Before the inclusion of parental SES, the mean total eta for the independent effect of genetic ancestry on cognitive ability was 0.18; after the inclusion of parental SES it was reduced to 0.11. However, it still outperformed SIRE in all models (means of 0.26, 0.11, 0.08, and 0.00 for parental SES, genetic ancestry, SIRE, and language, respectively). This is illustrated in Figure 2. (See Tables S19-S21 for complete results.)
Figure 2. The median total eta for the independent effect of genetic ancestry on cognitive ability compared to SIRE for all models after controls for parental SES, genetic ancestry, SIRE, and language.

Subgroup Analyses
A number of subgroup analyses were carried out to see whether the results hold when data are disaggregated by SIRE groups. For this purpose, three subgroups of interest were chosen: (1) non-Whites, (2) African Americans, and (3) Hispanics. Models for other subgroups were not fit as the samples were too small to allow for an analysis with sufficient power to detect differences given the predictions of tenable evolutionary hypotheses or because, in the case of non-Hispanic mono-SIRE Whites, there was little variance in admixture.

Non-Whites
For this analysis, everyone classified as White under standard coding was excluded. This group corresponds to the sociological construct of "people of color." Dummy-coded SIRE predictors were used to avoid having to choose a new reference class, since doing so would make the betas less comparable with the regressions previously noted. A White SIRE predictor is included as some multiracial individuals identify as partially White. Table 12 shows the regression results. For Model 1 and Model 2 the dependent variables are cognitive ability and SES, respectively. The results are similar to those found previously, except that Oceanian ancestry was not a robust predictor here. Supplementary Table S8 shows the correlation matrix.

African Americans
For African Americans, two inclusion criteria were utilized, yielding a narrow and a broad subsample. For the narrow (standard SIRE) subsample, only those who self-identified as African Americans and no other SIRE were included. For this analysis, only one ancestry predictor, African, was included as there was low variance for the other non-European genetic ancestries, and since this model showed a relatively good fit.
For the broad coding (in which everyone who identified as "African American" was included), it was an open question as to which model to use. The general criterion for model selection was adjusted R². In case of similar model fits, European ancestry was left out, the simplest models were used (in order to reduce standard errors of the predictors), and models which showed symmetry between the two outcomes were preferred (to ease comparison). Results for chosen models are shown in Tables 13 and 14 (the remainder can be found in the study notebook).
The betas for African ancestry were consistently negative. Tables S9 and S10 (Supplementary File 2) show the correlation matrices for genetic ancestry, SES, and cognitive ability. Figure 3 below shows the plot of European ancestry × cognitive ability in the standard African American population (in turquoise) along with the distribution of ancestry in the standard White population (in red). The large circles represent population means.

Hispanics
For Hispanics, consistent with US research practices, every participant who self-identified as having Hispanic ethnicity was included. Four ancestry components showed sufficient variation to possibly warrant inclusion (African, Amerindian, East Asian, and European). European ancestry was dropped since it did not improve model fit (inclusion led to multicollinearity).
The regression results are shown in Table 15. The betas for both African and Amerindian ancestry were strong for this subsample. Interestingly, first language had little predictive ability, even though it would be expected to have more for the Hispanic sample, which likely contains some children of recent immigrants. Table S11 shows the correlation matrix. Figure 4 replicates Figure 3 but with Hispanics instead of African Americans.

Discussion and Conclusions
Our results show that independent of SIRE, African, Oceanian, and Amerindian ancestries, relative to Eurasian ones, were associated with lower cognitive ability and parental SES. Relative to European ancestry, there were no clear associations for East Asian ancestry. East Asians in the US are a heterogeneous group, comprised of both South East Asians (e.g., Filipinos and Vietnamese) and North East Asians (e.g., Chinese, Japanese, and Koreans). Since at the national level these groups have different cognitive ability levels [7], it is difficult to interpret these particular results in light of any global theory of cognitive differences.
Genetic ancestry was found to be related to both cognitive ability and parental socioeconomic status independent of SIRE, as predicted by evolutionary theory. These findings strongly disconfirm claims made by various researchers that there are no statistical relationships between genomic ancestry and cognitive ability when controlling for socially identified racial groups [9,53,54,75]. Although these findings are congruent with predictions from evolutionary-genetic models, it should be kept in mind that genomic ancestry may also be associated with a number of non-genetic variables that run in families for environmental reasons, some of which may be causal. As such, the apparent validity of genetic ancestry could be due to confounding with these non-genetic variables, or could reflect ancestry-induced social processes. Still, it is worth noting that our results are especially striking in light of the stronger effect of environmental influences on younger compared to older persons' cognitive ability [76]. Given this fact, it would be reasonable to expect that at least some environmental factors related to SIRE have their largest potential effects on cognitive ability among young individuals. Yet, in our sample of children, genetic ancestry explained a very great deal of inter-BGA cognitive ability variation, net of SIRE, potentially indicating that environmental effects in general have a limited role in intra-national cognitive ability differences between BGAs.
One set of the analyses included parental SES as a predictor of children's cognitive ability, and it was found to be a useful predictor. Inclusion of this in the models reduced the validity of ancestry predictors by 36-48% (cf. Section 4.2). As discussed above, a reduction in the effect size is expected on both genetic and non-genetic models. This is because in genetic models, controlling for parental SES indirectly controls for parental cognitive ability, parental genetic influences on cognitive ability, and ultimately children's genetic influences on cognitive ability, and thus the association between genomic ancestry and the relevant genetic factors is weakened.
Overall, genetic ancestry was a better predictor of outcomes than was SIRE membership. As the US population becomes more ancestrally heterogeneous (owing to admixture), and SIRE and genetic ancestry become less related, genetic ancestry may turn out to be an even better tool for studying race-related social differences. Research suggests this may be the case for very-admixed countries such as Brazil (e.g., [77]).

Non-Evolutionary Explanations
While the results we found are consistent with an evolutionary model, there are some potential alternative explanations, namely: phenotypic discrimination, confounding due to immigration status, confounding due to geographic location, and intergenerational environmental transmission.
Some have posited discrimination based on stereotypical race-phenotype, called colorism [78]. It has been argued that such discrimination could account for covariances between BGA, cognitive ability, and SES (e.g., [79]). Unfortunately, this dataset does not have appearance data (e.g., skin color), so we could not test whether the associations found are statistically mediated by phenotypic differences. With regard to this sample specifically, we find it unlikely that colorism could directly lead to the association between ancestry and cognitive ability given the ages of the participants. This is because most colorism models propose market-based discrimination (e.g., [80]). A theoretical possibility is that such discrimination induces associations between parental SES and BGA and that parental SES differences influence offspring cognitive ability. However, some of our associations were only partially reduced in strength when controlling for parental SES, downgrading the likelihood of this scenario.
More generally, it is not clear that colorism is actually a potent force, at least in the USA. Consider research based on sibling designs, which can distinguish between discriminatory and intergenerational effects. A number of studies in the economics literature have utilized sibling control designs in this fashion [81][82][83][84][85][86]. Unfortunately, they differ somewhat in design (e.g., raw vs. SES-controlled results for between-family regressions), and do not report standardized effect measures, so we were unable to quantitatively meta-analyze them. However, generally speaking, when family characteristics are controlled for, residual associations between racial appearance and social outcomes are small. In the words of one researcher who studied a large dataset from Brazil: "[T]he estimated coefficients are small in magnitude, implying that individual discrimination is not the primary determinant of interracial disparities. Instead, racial differences are largely explained by the family and community that one is born into" [81]. Mill and Stein [83] make statements to the same effect based on an analysis of a large dataset from the USA.
Another possibility is confounding due to immigration status. Some SIRE groups in our study (specifically East Asians and Hispanics) contain substantial numbers of new immigrants. For these, possible interactions between generational status and admixture complicate interpretations of associations between BGA and social outcomes. For example, the US Hispanic population is comprised of ongoing waves of migrants primarily from Central America and the Caribbean. Since there is an association between immigrants' generational status and social outcomes (see [87]), if there is likewise an association between ancestry and generation status, this could lead to biases in parameter estimates. It is difficult to untangle these effects without detailed information about migrant status and specific population histories. That said, Kirkegaard et al. [56] showed that associations between SES and ancestry can be found across the Americas. It seems unlikely that Amerindian ancestry would be related to SES among native Mexicans, and that African ancestry would be related to SES among native Puerto Ricans, but that in the USA the associations within Latin-American-origin populations would only be due to migrant status confounding. And it also seems unlikely that the cause of the association between ancestry and cognitive ability would be radically different from the cause of the association between ancestry and SES. Furthermore, migrant status confounding cannot non-negligibly factor into associations with African ancestry found in our sample, indicating that non-genetic environmental explanations that rely on migrant status confounding would not be parsimonious.
Regarding geography, we were unable to directly control for the specific locations of the participants. However, when we ran the analyses adding controls for MRI scanner location, as a proxy for geographic location, there was no substantive effect. Overall, the three explanations just discussed seem unlikely to account for our findings. However, the potential confounds that they invoke warrant investigation in future research.

Related Research
While few studies have looked at the association between genetic ancestry and cognitive outcomes, a large number of older ones have examined the relationship between genealogical and phenotypic indexes of ancestry and cognitive performance. Summarizing research on the relationship between indexes of Amerindian ancestry and outcomes, Berry [88] notes:
Nevertheless, many other researchers have explored the relationship between academic achievement and certain indices of assimilation and have reached the same conclusion. Coombs (123:6) reports: "Amazingly consistent relationship between the degree of Indian blood and the pre-school language on the one hand and level of achievement on the other. These two characteristics are the best indices of the degree of acculturation."
Atkinson (22) tested students at Union High School, Roosevelt, Utah, and found whites superior, mixed-bloods second, and full-bloods third-a fact frequently encountered in the literature. As many writers have pointed out . . . such terms as full-blood and mixed-blood refer to social rather than biological groups.
Berry [88] and many of the researchers he cites interpret the apparent outcome-by-admixture associations as owing to cultural factors; notably, they also interpret the purported ancestry divisions as being delineated culturally, not genetically. Their view suggests that the factors relevant to performance differences can readily diffuse horizontally across cultural groups. Our results do not exclude this possibility but suggest that the relevant factors are being vertically transmitted along genealogical lines, the most plausible candidate being genes.
Loehlin, Lindzey, and Spuhler [50] summarized research on the relationship between phenotypic indexes of African ancestry and outcomes:
The majority of the studies with persons of mixed African-European ancestry found that groups of subjects judged to be of more African ancestry were on the average slightly inferior on the tests of intellectual functioning employed.
Because of the probability of complex and differential environmental response to physical differences and the likelihood of assortative mating complicating the genetic picture, in addition to the questionable reliability of the racial measures themselves, it is easy to decide that these findings lend themselves to no firm conclusions. One might go beyond this generalization and suggest that the observed findings provide little solace for extremes of either environmentalism or genetical determinism.
Our research design addresses many of the problems noted by Loehlin et al. [50]. First, we use a multilocus index of genetic ancestry, so cross-assortative mating for specific phenotypic characters (such as skin color) and cognitive ability is not a possible confound. Next, our index of ancestry is highly reliable, unlike the blood group studies that Loehlin et al. [50] discuss. Moreover, we show in our supplementary simulation that, owing to range restriction in ancestry, the typically small correlations are consistent with large between-group differences. Since there is disagreement about the interpretation of these pre-genomic studies, a systematic meta-analysis is warranted to confirm that indexes of ancestry are generally correlated with cognitive ability.
A number of studies have addressed the issue of the association between genetic ancestry and cognitive ability in more indirect ways (e.g., [8]). A popular approach involves testing Spearman's hypothesis, which posits that the magnitude of measured cognitive ability gaps between BGAs is a function of the g loadings of the test instruments (studies generally support Spearman's hypothesis; see [89][90][91]). The extent to which g moderates the magnitude of inter-BGA cognitive ability gaps may be relevant insofar as the g loadings of tests potentially correlate perfectly with their heritabilities [92]. Nonetheless, some argue that the truth of Spearman's hypothesis does not exclude the possibility that inter-BGA cognitive ability gaps have non-genetic environmental origins (e.g., [93]). Future admixture analyses could resolve this debate over the implications of the truth of Spearman's hypothesis. For example, if the present model were run for separate subtests, the resultant vector of model beta values could be correlated with the vector of subtest g loadings to determine whether more g-loaded subtests involve more strongly genetic inter-BGA ability gaps.

Evolutionary-Genetic Explanations versus Familial Environmental Models
In the context of admixture analyses of morphological and health-related traits, it is generally accepted (e.g., in genetic epidemiology) that a correlation between outcomes and genetic ancestry constitutes support for evolutionary-genetic models of inter-BGA differences, especially if potential environmental confounds are statistically controlled (see Introduction). Logically, the same inference should apply to behavioral traits and to outcomes under partial genetic influence, such as educational attainment. (Others make the same point: Nisbett [53] suggests that a relation between ancestry and IQ within SIRE groups would constitute "direct evidence" supporting hereditarian models of inter-BGA cognitive differences.) But many researchers are inconsistent in their treatment of these outcomes (see, e.g., [77]), committing the so-called sociologist's fallacy [94,95]. (Specifically, the sociologist's fallacy refers to the hasty inference, made without considering the possible role of genetic factors, that a correlation between a social variable (such as SES) and a phenotype (such as cognitive ability) implies that the social variable is causal with respect to the phenotype.) The support for evolutionary-genetic models provided by global admixture analysis is indirect because there are possible confounds, as discussed above. It is also possible that non-genetic familial models (as noted) make the same predictions. Evolutionary-genetic models, though, can be tested more directly using admixture mapping [46], which in genetic epidemiology is usually the next step taken after consistent findings of an association between outcomes and global genetic ancestry.
The idea behind the method is fairly simple: though we may not know exactly where the causal variants are on the genome, we know roughly where they are located (i.e., in which genes), and they can plausibly be assumed to be the same loci across populations [39]. This means that if we take cases and controls (or high/low trait groups) from a given admixed population where one ancestry (A) has a higher polygenic score for a trait (e.g., higher disease risk), the cases will have an overall higher proportion of ancestry A than the controls. However, this increase in ancestry will not be randomly distributed along the genome but will be concentrated in regions around the causal variants. Thus, the test for evolutionary-genetic models is whether such local ancestry enrichment can be found for variants identified via GWASs for cognitive ability. As cognitive ability is strongly polygenic, this test will likely require a large sample size. This method can also be used to test racial-phenotypic discrimination models, as these models predict that associations between genetic ancestry and both behavioral and social outcomes will be more pronounced in regions of the genome related to visible phenotypes. Such analyses have been conducted in relation to assortative mating [96].
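The case/control logic described above can be sketched as a toy simulation for a generic binary disease trait. Everything here is a simplifying assumption made for illustration: the function name, the sample size, the number of loci, the single causal locus, the risk parameters, and the treatment of loci as independent given an individual's global ancestry (real admixture mapping works with linked ancestry blocks and formal test statistics).

```python
import random

def simulate_admixture_mapping(n_people=4000, n_loci=50, causal=17,
                               base_risk=0.2, risk_per_copy=0.15, seed=1):
    """Return, per locus, mean local ancestry in cases minus controls.

    Local ancestry is coded as 0, 1, or 2 copies inherited from ancestry A.
    Only the locus at index `causal` affects disease risk (hypothetical setup).
    """
    rng = random.Random(seed)
    cases, controls = [], []
    for _ in range(n_people):
        # Individual's global proportion of ancestry A (arbitrary range).
        p = rng.uniform(0.2, 0.8)
        # Two chromosome copies per locus, each from A with probability p.
        local = [(rng.random() < p) + (rng.random() < p) for _ in range(n_loci)]
        risk = base_risk + risk_per_copy * local[causal]
        (cases if rng.random() < risk else controls).append(local)

    def mean_at(group, j):
        return sum(person[j] for person in group) / len(group)

    return [mean_at(cases, j) - mean_at(controls, j) for j in range(n_loci)]

diffs = simulate_admixture_mapping()
peak = max(range(len(diffs)), key=lambda j: diffs[j])
background = sum(d for j, d in enumerate(diffs) if j != peak) / (len(diffs) - 1)

print("locus with largest case-control ancestry difference:", peak)
print(f"difference at that locus: {diffs[peak]:.3f}")
print(f"average difference elsewhere: {background:.3f}")
```

In this toy setup the cases carry slightly more ancestry A at every locus, because global ancestry is correlated with the causal locus, but the excess is largest at the causal locus itself. That pattern of local enrichment is what admixture mapping looks for.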
Following methods in genetic epidemiology, we began with a preliminary global BGA analysis and found that, in this sample, non-White (except East Asian) ancestry in the United States is correlated with lower cognitive ability and parental SES. Before drawing conclusions, however, it is important to replicate these results. A single dataset or study cannot settle a debate [97], owing to potentially unidentified confounds. However, if our results can be replicated, the next warranted step is admixture mapping, which can discriminate between phenotypic discrimination, environmental familial, and evolutionary-genetic models.

Implications for Research on Differences
Insofar as there is interest in ameliorating ethnic and racial differences in important behavioral and social outcomes, it is important to understand their causation. Results in this paper are largely in agreement with a familial model of differences in the Americas. These results suggest that outcome differences are being passed on along family lines, and that they are not due to common ethno-cultural factors. This is consistent with an evolutionary-genetic model, but environmental familial models cannot be ruled out at this stage.
These results suggest that standard categorical SIRE classifications fail to capture the full extent of BGA-associated disadvantages. As for other general models of American race/ethnic differences, contra [98], these results suggest the possibility of non-trivial omitted variable bias in discrimination-based models. Proponents of such models should try to show that their associations are independent of genetic ancestry. It is not clear that this is generally the case. (For example, see the 1982 Pelotas birth cohort results reported in Supplementary File 2 of [56], which show no consistent association between interviewer-reported color and social outcomes after genetic ancestry is statistically controlled.) Proponents of cultural identification models are also encouraged to take genetic ancestry into account.