Global Ancestry and Cognitive Ability

Using data from the Philadelphia Neurodevelopmental Cohort, we examined whether European ancestry predicted cognitive ability over and above both parental socioeconomic status (SES) and measures of eye, hair, and skin color. First, using multi-group confirmatory factor analysis, we verified that strict factorial invariance held between self-identified African and European-Americans. The differences between these groups, which were equivalent to 14.72 IQ points, were primarily (75.59%) due to difference in general cognitive ability (g), consistent with Spearman's hypothesis. We found a relationship between European admixture and g. This relationship existed in samples of (a) self-identified monoracial African-Americans (B = 0.78, n = 2,179), (b) monoracial African and biracial African-European-Americans, with controls added for self-identified biracial status (B = 0.85, n = 2407), and (c) combined European, African-European, and African-American participants, with controls for self-identified race/ethnicity (B = 0.75, N = 7,273). Controlling for parental SES modestly attenuated these relationships whereas controlling for measures of skin, hair, and eye color did not. Next, we validated four sets of polygenic scores for educational attainment (eduPGS). MTAG, the multi-trait analysis of genome-wide association study (GWAS) eduPGS (based on 8442 overlapping variants) predicted g in both the monoracial African-American (r = 0.111, n = 2179, p < 0.001), and the European-American (r = 0.227, n = 4914, p < 0.001) subsamples. We also found large race differences for the means of eduPGS (d = 1.89). Using the ancestry-adjusted association between MTAG eduPGS and g from the monoracial African-American sample as an estimate of the transracially unbiased validity of eduPGS (B = 0.124), the results suggest that as much as 20%–25% of the race difference in g can be naïvely explained by known cognitive ability-related variants. Moreover, path analysis showed that the eduPGS substantially mediated associations between cognitive ability and European ancestry in the African-American sample. Subtest differences, together with the effects of both ancestry and eduPGS, had near-identity with subtest g-loadings. This finding confirmed a Jensen effect acting on ancestry-related differences. Finally, we confirmed measurement invariance along the full range of European ancestry in the combined sample using local structural equation modeling. Results converge on genetics as a potential partial explanation for group mean differences in intelligence.


Introduction
The European-American (EA) and African-American (AA) populations have been found to differ in mean general cognitive ability (or general intelligence, g) by about one standard deviation [1-3]. This difference exists even though the heritability of intelligence is both high and virtually identical in the European and African-American populations [4]. Likewise, the shared and unshared environmental components of variance in g are similar in these groups. These findings, in conjunction with others
The three-factor baseline model was found to fit well in both groups and the addition of a bifactor g considerably improved model fit. Strict factorial invariance (SFI) held for this model. The strong and contra forms of Spearman's hypothesis [2] were rejected but the weak model fit the data well. Our constraint for testing the weak model was to set the episodic memory factor mean to 0 while the other group factor means were freed. The resulting mean differences appear in Table 1. These results accord with previous analyses (e.g., [2,73]) in that there is a nearly one d difference in cognitive ability between African and European-Americans and that this difference is concentrated on the g factor. The model without constraints for Spearman's hypothesis shows that (net of the g factor) there is an African-American advantage in executive functioning and episodic memory. This replicates patterns found in prior studies (e.g., [18,73]). The MGCFA results are discussed more fully in the Supplementary Materials. For the admixture analysis, we used g scores since differences were primarily concentrated on this factor.

Parental Socioeconomic Status
Only paternal and maternal education were available to us as measures of SES. We coded parental education as the average of each, both z-scored first. When data were only available from one parent this value was used. The final score was then z-scored again. This variable was used as a control for SES in our admixture regressions but because we were unable to factor analyze a variety of SES measures and extract a more reliable indicator [74] the control can only be regarded as partial. With this noted, parental education has been found to be more important than wealth in shaping children's intelligence [75]. The effect of SES as a background variable in our MGCFA is investigated in our Supplementary Materials.
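The coding of the parental education composite described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the use of NaN for a missing parent, and the population (ddof = 0) standard deviation are all assumptions made for the example.

```python
import numpy as np

def zscore(values):
    """Standardize to mean 0 and SD 1, ignoring missing values (NaN)."""
    values = np.asarray(values, dtype=float)
    return (values - np.nanmean(values)) / np.nanstd(values)

def parental_education_score(maternal, paternal):
    """Average the z-scored parental education values, then z-score again.

    If only one parent's value is present, that value is used on its own.
    If both are missing, the composite stays missing.
    """
    z = np.vstack([zscore(maternal), zscore(paternal)])
    n_observed = (~np.isnan(z)).sum(axis=0)
    total = np.nansum(z, axis=0)
    combined = np.where(n_observed > 0, total / np.maximum(n_observed, 1), np.nan)
    return zscore(combined)

# Hypothetical years of education; NaN marks a missing parent.
maternal_years = [12, 16, np.nan, 14, np.nan]
paternal_years = [12, np.nan, 10, 18, np.nan]
ses = parental_education_score(maternal_years, paternal_years)
```

Z-scoring each parent before averaging keeps one parent's scale from dominating the composite, and the final z-scoring restores unit variance after averaging.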

Self-Identified Race/Ethnicity (SIRE)
Subjects chose their self-identified race/ethnicity (SIRE) based on yes/no questions wherein it was possible to select multiple races or ethnicities. The choices were: Black or African-American; American Indian or Alaskan Native; Asian; European-American; Hispanic/Latino; Native Hawaiian/Pacific Islander; Other; and Not Available/Pending Validation. We coded European-American following standard procedures: subjects who selected the European-American option only and did not report being Hispanic. African-American was coded the same way. Finally, we coded subjects who selected both European- and African-American as biracial African-European-American (AA|EA). We then excluded subjects with other identifications, reducing the sample size from an initial N of 9496 to 8580. Of the included participants, there were 5183 European-Americans, 3155 African-Americans, and 242 biracial African-European-Americans.

Genetic Ancestry Percentages
Satterthwaite et al. [76] described the genotyping of the PNC dataset. Briefly, the participants were genotyped on six different arrays from two different manufacturers. The different arrays covered different variants. As such, it was necessary to impute them to obtain large, overlapping sets of single nucleotide polymorphisms (SNPs). We imputed the dataset using the Michigan Imputation Server (https://imputationserver.sph.umich.edu/index.html) with the Minimac3 imputation algorithm, 1000G Phase 3 v5 as reference panel, and Eagle v2.3 phasing. In total, 15.5M variants were available for analysis. To reduce computation time, we filtered these to variants with a minor allele frequency (MAF) of at least 0.05 in the subset of interest (i.e., the combined sample of European and African-Americans). That is, we selected only variants for which the second-most common alleles occurred in the population at a frequency of 0.05 or higher. This screening left 6.5M variants to compute genetic ancestry percentages. Note, this was only done in the context of computing admixture scores. We did not filter for MAF when computing color scores, as we did not want to lose variants that affected them. These operations were done using PLINK v1.90b6.8 [77]. To estimate individual global ancestry we used ADMIXTURE version 1.3.0 [78]. Since we were only concerned with European and African-Americans, we ran ADMIXTURE with k = 2 genetic clusters. Some subjects either had no genotypes available or their data failed quality control. Thus, the final sample size was reduced to 7399. Consistent with previous research (e.g., [59]), SIRE is strongly associated with genetic ancestry. This can be seen in Figure 1, which shows the probability of identifying with a particular SIRE group as a function of European admixture.
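The minor allele frequency screen described above can be illustrated with a small sketch. The filtering in the study was done in PLINK (PLINK 1.9 exposes this kind of filter through its `--maf` option); the code below is not that command but a toy re-implementation, assuming biallelic variants coded as 0/1/2 alternate-allele counts, with hypothetical function names and data.

```python
import numpy as np

def minor_allele_frequency(dosages):
    """Per-variant MAF from an (n_samples, n_variants) array of 0/1/2 counts.

    Assumes biallelic variants; NaN marks a missing genotype.
    """
    alt_freq = np.nanmean(dosages, axis=0) / 2.0
    return np.minimum(alt_freq, 1.0 - alt_freq)

def maf_filter(dosages, threshold=0.05):
    """Boolean mask of variants whose MAF is at least the threshold."""
    return minor_allele_frequency(dosages) >= threshold

# Toy genotype matrix: 20 samples, 3 variants.
toy = np.zeros((20, 3))
toy[:10, 0] = 1   # 10 heterozygotes: alt frequency 10/40 = 0.25
toy[0, 1] = 1     # 1 heterozygote: alt frequency 1/40 = 0.025
toy[:, 2] = 2     # mostly alt homozygotes ...
toy[:4, 2] = 1    # ... with 4 heterozygotes: alt frequency 36/40 = 0.90
keep = maf_filter(toy)  # second variant falls below 0.05 and is dropped
```

Taking the minimum of the alternate-allele frequency and its complement makes the filter symmetric, so a variant is kept or dropped regardless of which allele is labeled as the alternate.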

Skin, Hair, and Eye Color
The data did not include measures of appearance, so we opted to impute these based on genotypes. We used the HIrisPlex-S web application to do this (https://hirisplex.erasmusmc.nl/). This application was developed by the U.S. Department of Justice for use in forensic investigations. It imputes skin, hair, and eye color probabilities with a high degree of accuracy based on 41 SNPs (with overlapping variants; 6 for eye color, 22 for hair color, and 36 for skin color). This tool has been validated on thousands of people from diverse regions of the world [79].
We focus on skin color since this trait is given primacy by colorist theorists (e.g., [80,81]) and because we were able to calculate skin color scores for a larger subsample than for hair or eye color. HIrisPlex-S gives probabilities of the Fitzpatrick Scale skin type (i.e., Type I, scores 0-6, "palest, freckles"; Type II, scores 7-13; Type III-IV (combined), scores 14-27; Type V, scores 28-34; Type VI, scores 35-36, "deeply pigmented dark brown to darkest brown"). To create a single color measure, we weighted the midpoint score of each Type (e.g., Type I = 3) by the probability of that Type. On this scale, higher scores indicate darker colors. Unfortunately, because of the poor tagging of the relevant variants in some of the arrays, the imputed phenotypes were only available for a subset of the data (N = 7321 with cognitive ability and ancestry, versus N = 5533 with cognitive ability, ancestry, and skin color).
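The probability-weighted color score can be sketched as below. Only the Type I value of 3 is given in the text; the other midpoints are assumptions derived here from the stated score ranges, and the function name and example probabilities are hypothetical.

```python
# Midpoints of the score ranges listed above. Only Type I = 3 is stated
# in the text; the remaining values are assumed from the ranges.
TYPE_MIDPOINTS = {
    "I": 3.0,        # scores 0-6
    "II": 10.0,      # scores 7-13
    "III-IV": 20.5,  # scores 14-27
    "V": 31.0,       # scores 28-34
    "VI": 35.5,      # scores 35-36
}

def color_score(type_probabilities):
    """Expected score: each Type's midpoint weighted by its probability."""
    total = sum(type_probabilities.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("type probabilities should sum to 1")
    return sum(TYPE_MIDPOINTS[t] * p for t, p in type_probabilities.items())

# Hypothetical predicted probabilities for one participant.
example = {"I": 0.1, "II": 0.3, "III-IV": 0.5, "V": 0.1, "VI": 0.0}
score = color_score(example)  # 0.3 + 3.0 + 10.25 + 3.1 + 0.0
```

The result is an expected value on the 0-36 scale, so a participant whose probability mass is spread over several Types receives an intermediate score rather than being forced into a single category.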
The correlation between our imputed color score and European ancestry was −0.87 for the combined sample (N = 5585) and −0.39 for the African-Americans only sample (N = 1557). Figure 2 shows the regression plot for skin color and European ancestry, with SIRE groups marked in red (African-American), green (biracial African-European-American), and blue (European-American). The correlation found among African-Americans is similar to those previously reported (r's 0.25 to 0.71; median:~0.50) for African-Americans and other groups of African descent [56,61,82,83]. Furthermore, the predicted skin color values matched the expected values on the Fitzpatrick scale. The mean score for European-Americans was 14.70, placing them in category III (i.e., sometimes mild burn, tans uniformly) bordering on category II (i.e., usually burns, tans minimally), and for African-Americans, 30.96, placing them in category V (very rarely burns, tans very easily).

Cognitive Ability-Related Polygenic Scores
We scored the genomes using PLINK v1.90b6.8 [77] based on the genome-wide association study (GWAS) results for educational attainment reported in Lee et al. [84]. We constructed four different PGS based on the relative trade-off between differential linkage disequilibrium (LD) decay bias and the within-population signal. For background, a PGS "represents an aggregation, usually in the form of a sum, of the estimated effect sizes of the genetic variants in the genome" [85]. LD is the non-random association of alleles at different loci in a particular population. LD decay is the breakdown of this association between divergent populations. LD decay bias is bias in PGS (which are calculated from both causal variants and variants in linkage with causal and non-causal ones) that arises because populations differ in their patterns of association between variants and nearby (or linked) sites.
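The definition of a PGS as a sum of estimated effect sizes can be made concrete with a short sketch. This is an illustration of the general idea only, not PLINK's scoring routine or the authors' pipeline; the function name, dosage coding, and effect sizes are hypothetical.

```python
import numpy as np

def polygenic_score(dosages, effect_sizes):
    """Sum over variants of (effect-allele count) x (estimated effect size).

    dosages: (n_samples, n_variants) array of 0/1/2 effect-allele counts.
    effect_sizes: length n_variants array of per-allele GWAS estimates.
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    return dosages @ effect_sizes

# Two hypothetical individuals scored on three hypothetical variants.
dosages = [[0, 1, 2],
           [2, 0, 1]]
betas = [0.02, -0.01, 0.03]
scores = polygenic_score(dosages, betas)  # [0.05, 0.07]
```

Because many scored variants are tags rather than causal variants, the same weights can index the underlying signal less well in a population with different LD patterns, which is the bias discussed above.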
First, we used the eduPGS with all variants trained without the 23andMe cohort (n = 7,762,369 SNPs overlapping in the present dataset) to construct a predictor with good within-population validity but very high LD decay bias [86,87]. Second, we used the multi-trait analysis of genome-wide association study (MTAG) eduPGS 10k SNPs (n = 8442 overlapping variants) to construct a predictor with good within-population validity but high LD decay bias. The MTAG eduPGS were PGS computed using MTAG. This is a method for analyzing statistics from genome-wide association study (GWAS) on different but genetically correlated traits (e.g., education and intelligence). Third, we used the MTAG eduPGS lead SNPs to construct an eduPGS with moderate within-population validity and LD decay bias (n = 1558 overlapping variants). The lead SNPs are the ones which had the most-significant association with education/intelligence within each cluster of LD-associated SNPs. The reasoning is that because SNPs with lower p-values (i.e., greater statistical significance) are more likely to be causal [88], and because causal SNPs are more likely to be transethnically valid [86,89], using SNPs with a higher significance may increase the transethnic validity of eduPGS [90]. Fourth, we used Lee et al.'s [84] putatively causal variants to construct an eduPGS with minimal LD decay bias and low within-population validity (n = 111 overlapping variants).
Table 2 shows descriptive statistics for all groups. AA|EAs were primarily European (80%) in ancestry. Correspondingly, they were closer to the EA means on SES, g, and color. The AA/EA g difference was d = 0.984 (EA IQ = 100 versus AA IQ = 85.28, with the biracial group scoring in-between at 97.96), while the SES difference was d = 0.99 and the difference in color was d = 3.60.
Note: Standard deviations and sample sizes appear in parentheses. SES = socioeconomic status.

Bivariate Relationships among the Variables
Correlations between g, parental SES, genetic ancestry, and SIRE group appear in Tables 3 and 4. Correlations are as expected based on the cognitive and SES differences between different SIRE groups and the strong correlation between genetic ancestry and SIRE. However, this bivariate analysis is not very informative as outcomes are confounded with non-genetic causes and ancestry variables are negatively correlated by design since individual ancestry must sum to one. The relationship between parental SES and cognitive ability is likely stronger here than is typically reported (see [91]; r = 0.29) because we used both parents' education and a latent cognitive ability measure which reduces error that would normally attenuate the strength of this relationship.
Note: All values significant at p < 0.0001. Pairwise N in parentheses. SES = socioeconomic status, SIRE = self-identified race/ethnicity, eduPGS = education polygenic score.
Table 4. Pairwise correlations among self-identified monoracial African-American (below the diagonal) and self-identified monoracial and biracial African-European-American (above the diagonal).
The correlations between cognitive ability and both color and African ancestry are highest in the combined African-, African-European-, and European-American sample, intermediate in the combined mono- and biracial African-American sample, and lowest in the monoracial African-American sample. The difference in the magnitudes of the correlations is a result of the first sample having the most variability in ancestry and the last sample having the least. This difference is mostly a statistical artifact of restriction of range, which attenuates correlations [92].
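The attenuating effect of range restriction mentioned above can be demonstrated with a short simulation. This is a generic statistical sketch on synthetic data: the variables, the population correlation of 0.5, and the cut-off are arbitrary assumptions and do not correspond to any quantity in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Synthetic predictor and outcome with a population correlation of 0.5.
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=np.sqrt(1 - 0.5**2), size=n)

r_full = np.corrcoef(x, y)[0, 1]

# Keep only cases from a narrow band of the predictor.
restricted = np.abs(x) < 0.5
r_restricted = np.corrcoef(x[restricted], y[restricted])[0, 1]

print(f"full range r = {r_full:.3f}, restricted range r = {r_restricted:.3f}")
```

The slope linking the two variables is the same in both subsets; only the predictor's variance differs, which is why the correlation shrinks in the restricted subset while an unstandardized regression coefficient would not.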

Regression Analyses
Our primary focus is whether genetic ancestry predicts cognitive ability. The rationale for the analysis has been explained in detail by Kirkegaard et al. [24]. In order to determine if this is the case, we constructed a series of regression models. These models were applied to the African-American and the combined African-, European-, and biracial African-European-American sample. Restricting the first analysis to SIRE African-Americans approximates the methodology used by Scarr et al. [52]. Our initial analysis used European ancestry (EUR) as the independent variable. We then added skin color to test discrimination (i.e., colorism) and other appearance-based hypotheses. Finally, we added SES to assess the effect of familial environmental influences on the observed gap, noting again that these measures are genetically confounded.
To provide results that can be easily compared across groups and studies, we standardize all values except European ancestry and then report the unstandardized beta coefficients. Doing so leaves only ancestry unstandardized. This allows one to interpret the beta coefficient for ancestry as a change in a standard deviation of cognitive ability over a change of 100% European ancestry. Following Kirkegaard et al. [24], we do not report standardized betas (except in Table 5 for illustration), because, in the context of the effect of ancestry on outcomes, these are difficult to compare across groups. This difficulty arises because groups differ substantially in their variability in ancestry and its effects. As such, a change of one standard deviation of ancestry from one group has a substantially different meaning than the same from another group.
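The scaling convention described above can be illustrated on synthetic data. The sketch below is not the authors' model: the variable names, the simulated coefficients, and the plain least-squares fit are assumptions chosen only to show how the coefficient on a 0-1 proportion behaves under rescaling.

```python
import numpy as np

def zscore(x):
    """Standardize to mean 0 and SD 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def ols_coefficients(y, *predictors):
    """Least-squares coefficients; index 0 is the intercept."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 5000
proportion = rng.uniform(0.0, 1.0, size=n)  # synthetic 0-1 predictor
covariate = rng.normal(size=n)              # synthetic control variable
outcome = 0.8 * proportion + 0.3 * covariate + rng.normal(size=n)

y = zscore(outcome)  # outcome and covariate are standardized
b_proportion = ols_coefficients(y, proportion, zscore(covariate))[1]
b_percent = ols_coefficients(y, proportion * 100.0, zscore(covariate))[1]
b_standardized = ols_coefficients(y, zscore(proportion), zscore(covariate))[1]

# b_proportion: change in outcome SDs across the full 0-to-1 range.
# b_percent:    the same effect per percentage point (b_proportion / 100).
# b_standardized: b_proportion times the predictor's SD in this sample.
```

The last relationship is the reason given in the text for avoiding standardized betas: the standardized coefficient scales with the predictor's standard deviation, so it changes across groups that differ in variability even when the unstandardized coefficient is identical.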
In models that only included monoracial African-Americans, we found that European ancestry was always strongly and significantly related to cognitive ability. Skin color (assessed genetically with the highly accurate predictor [79,93]) was associated with cognitive ability (Model 1b, Table 5), but made no significant incremental contribution when ancestry was also in the model (Model 2, Table 5). Results could still be due to phenotypic confounding from other appearance variables. To test this possibility, we fitted a number of models including skin, hair, and eye color. We found that none of these features had significant effects on their own, except for brown eye color, which was positively related to cognitive ability, but with a large standard error. These results are shown in the R notebook. The last monoracial African-American model (Model 3, Table 5) included SES, which had a considerable effect on cognitive ability. However, the so-called sociologist's fallacy [2,5] may be at play here. That is, controlling for parental SES also controls for genetic effects on SES which may be shared with cognitive ability.
As an alternative to the model with only monoracial African-Americans, we next included both individuals who identified as monoracial African- and biracial African-European-American. This is equivalent to the "broadly-defined" African-American group used by Kirkegaard et al. [24]. These results are shown in Table 6. In Model 2, with self-identified European-American race (which, in this case, indicates biracial status) as a covariate, the beta for European ancestry (B EUR = 0.85; n = 2407) was comparable although slightly higher than that for the monoracial African-American sample (B EUR = 0.78; n = 2179).
Table 5. Regression analysis for European ancestry as a predictor of g among monoracial African-Americans with controls for skin color (Model 2), and SES (Model 3) added.
Comparing Model 2 and Model 3b, biracial status had a significant effect independent of European ancestry. This could be due to selective patterns of self-identification, whereby choosing to report that one is a European-American, regardless of one's genetic ancestry, is associated with greater cognitive ability. This appears to have been the case as evidenced by Figure 3, which shows the relationship between European ancestry and cognitive ability. Being above the regression line was associated with a greater likelihood of biracial African-European self-identification regardless of one's actual ancestry. Whatever the case, unlike with self-reported European-American race, skin color did not have significant predictive validity independent of European ancestry. The comparison between Model 3a and Model 3b shows that the inclusion of ancestry reduced the effect of color from highly significant to non-significant (B color = −0.25 in 3a and B color = −0.06 in 3b, n = 1,690). As in the models for monoracial AAs, the inclusion of parental SES reduced, but did not eliminate, the effect of European ancestry (Model 4).
The next set of models utilized the whole sample of African, European, and biracial African-European-Americans. These results appear in Table 7. The first model revealed a substantial impact of European ancestry on cognitive ability. The next model found the same, but also produced a significant effect of SIRE European-American (but not SIRE African-American) self-identification. That is, some incremental validity existed which was associated with self-identification as European-American and this did not occur for self-identification as African-American. The next model included physical appearance in the form of skin color. As with the African-American-only and the combined African- and biracial African-European-American sample, this variable had no significant effect on cognitive ability in the full sample with ancestry included (Model 3b). Our final model included SES. The inclusion of SES reduced the effect of SIRE European-American, although it still remained significant. Nonetheless, in this model, European ancestry was the dominant determinant of cognitive ability. The effect of SES on cognitive ability was essentially the same in the full sample as it was in the monoracial African-American model. The fact that the effect of SIRE European-American weakened with the inclusion of SES is perhaps notable, as it may imply that selective self-identification as European-American is driven in part by social processes. It is clear here though that self-identification and other-identifiable physical characteristics are much less important as determinants of cognitive ability than is genetic ancestry.
Note: * p < 0.05, ** p < 0.01, *** p < 0.001. EUR = European ancestry. SES = socioeconomic status, SIRE = self-identified race/ethnicity.
Cumulatively, European ancestry was consistently positively related to cognitive ability in all models. SES was similarly related to cognitive ability. European-American SIRE also had a significant effect independently of genetic ancestry, indicating that differences were not wholly explained by ancestry. Skin color was not significantly related to cognitive ability in any of our models which included genetic ancestry. However, it was related to cognitive ability in models not including ancestry, as found in previous studies [17,94-96] (for a review of older studies, see [1]).
Recent immigrants from Africa in the sample may have affected the scores reported here as migrants may have been selected for cognitive ability when they emigrated from their home countries. As such, we conducted a robustness test by rerunning the analyses excluding individuals with one percent or less European ancestry. This exclusion did not substantially affect the results (reported in the R notebook, Model 17b). Thus, confounding due to migration likely did not influence our findings.

Cognitive Ability and Education-Related Polygenic Scores (eduPGS)
We examined the validity of four cognitive ability and education-related PGSs reported by Lee et al. [84]. The eduPGS were derived from European populations. First, we used the eduPGS based on all variants trained without the 23andMe cohort (n = 7,762,369 variants overlapping in the present dataset). The goal was to construct a predictor with good within-discovery population (European, in this case) validity, but very high LD decay bias, and thus low transethnic validity. Second, we used MTAG eduPGS with 10k variants (n = 8442 overlapping variants) to construct a predictor with good within-discovery population validity and reduced LD decay bias. Third, we used the MTAG eduPGS lead SNPs (n = 1558 overlapping variants) to see if we could reduce LD decay bias and increase transethnic validity while retaining within-discovery population validity. Finally, we used the putatively causal variants (n = 111 overlapping variants in our dataset) to construct an eduPGS with minimal LD decay bias, but low within-discovery population validity. At some point, screening SNPs based on p values may lead to lower validity in both the discovery population and other-ethnic populations, despite having relatively high transethnic validity [90]. The goal is to identify PGS which have both high within-discovery population and transethnic validities.
The correlations with g are shown in Table 8, with results for European-Americans above the diagonal and results for (monoracial) African-Americans below. Since the PNC is independent from the samples analyzed by Lee et al. [84], the results for European-Americans constitute an independent replication of those reported by Lee et al. [84]. This may be of interest to some because our dependent variable was a good measure of g.
Table 8. Pairwise correlations between cognitive ability and education/intelligence-related polygenic scores (European-American above the diagonal, African-American below).
Of the PGS, the MTAG_10k PGS had the highest validity in both the European-American and African-American samples, although the validity for African-Americans was reduced by 51%. The results for MTAG_10k and MTAG eduPGS lead SNPs correlated highly with one another (r > 0.80) in both the African- and European-American samples. These eduPGS also had similar relationships with cognitive ability.
As expected, the GWAS eduPGS had high within-source population validity but low transethnic validity. While significant in the African-American sample, the validity was only 0.044/0.225 = 20% of that in the European-American sample. This low transethnic validity occurred because the 7,762,369 variants in conjunction with high LD decay produced considerable noise. Due to being based on only 111 overlapping variants, the putatively causal eduPGS had low validity within both samples but had the highest relative transethnic validity: 0.031/0.058 = 54%.
Consistent with theory [90], the eduPGS with the highest validity in both the African- and European-American samples was the MTAG_EA_10K (r = 0.112 and r = 0.227, respectively). As such, in the following analyses, we restricted consideration to these scores. While retaining some validity in the African-American sample, the MTAG_10k eduPGS showed predictive bias. Indeed, a Fisher r-to-z test showed that the difference in the correlations for African and European-Americans was highly significant (z = −4.6, p < 0.01). This difference is illustrated in Figure 4, which shows the regression plot for eduPGS and g for European (red) and African-Americans (blue). The slope of the regression line and the intercept for the African-American sample are both lower. As Zanetti & Weale [86] note, differences in SNP predictivity could be due to (1) lack of statistical power, (2) allele frequency differences, (3) linkage disequilibrium, or (4) differences in true causal variant effect sizes/directions. These authors suggest that the last explanation is generally not necessary.
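The Fisher r-to-z comparison reported above can be reproduced approximately from the published correlations and sample sizes. The sketch below uses the standard formula for comparing two independent correlations; the function name is hypothetical, and pairing these correlations with the subsample sizes given in the abstract (n = 2179 and n = 4914) is an assumption about which values entered the authors' test.

```python
import math

def fisher_r_to_z_test(r1, n1, r2, n2):
    """Two-sided z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p value
    return z, p

# Correlations from the text; sample sizes assumed from the abstract.
z, p = fisher_r_to_z_test(0.112, 2179, 0.227, 4914)
print(f"z = {z:.2f}, p = {p:.2g}")  # z is about -4.6, in line with the text
```

The test treats the two subsamples as independent, which holds here because the groups do not overlap.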
Regardless, a combination of these factors could lead to PGS bias in favor of Europeans (or Africans). There are a number of possible ways in which such bias could have arisen. For example, imagine the case for more recent polymorphisms. These more recent polymorphisms will probably have a higher frequency in the discovery population. If recent selection favored IQ-promoting variants and purged IQ-reducing variants, the more recent polymorphisms will be IQ-promoting, which will result in bias in favor of the discovery population. There is some evidence of an increase in PGS over time [97]. Of course, the reverse would be true, if recent selection disfavored high IQ alleles. However, given that most mutations are deleterious [98], if the PGS contain a large number of recent variants it is likely that they will be biased against their discovery populations. For disease phenotypes, Kim, Patel, Teng, Berens, & Lachance [99] showed how this bias can function and found that correcting for differences in derived vs. ancestral polymorphisms can improve the transracial predictivity of PGS. In a supplementary analysis, conducted to address a reviewer's point, we did not find evidence of a substantive (or even statistically significant) effect of young vs. old polymorphisms. Thus, the concern raised regarding bias due to polymorphism age (e.g., [99]) does not seem to be a problem in this case and so we did not create polymorphism age-corrected PGS.
Table 9 shows eduPGS means and standard deviations for African, biracial African-European, and European-Americans. The eduPGS difference is large at d = 1.89 and it is larger than the phenotypic gap in g. This would be expected even if the PGS were unbiased given the imperfect relationship between PGS and g.
Beyond this and possible bias in the PGS, another possibility is that the PGS differences between groups are primarily in common variants, while the differences between individuals within groups are in both common variants and rare variants that arose recently by mutations and which are under mutation-selection balance. Given this scenario, PGS which indexed mostly common variants between populations would manifest as reduced phenotypic differences even assuming consistent validity.
The association between eduPGS (MTAG_10k) and g in the African-American sample could, in principle, result from confounding with environmental factors. This could be an indirect effect of the association between g and European ancestry or, alternatively, color. Alternatively, the variants in the eduPGS could be "chopstick genes" [100] which have a statistical but no causal effect. If eduPGS were merely indexing European ancestry, one would expect the GWAS eduPGS (which is based on more than 7 million additional variants) to exhibit an association as high as the one we found with MTAG_EA_10k eduPGS. This is because the GWAS_eduPGS, with its greater number of variants, is likely to be a better index of ancestry. This does not appear to be the case. Regardless, to explore this possibility further, we ran regression models with European ancestry (Model 1b) and color scores (Model 2b) for the monoracial African-American and combined African-, European-, and biracial African-European-American samples. The results are shown in Tables 10 and 11 respectively.
Note: * p < 0.05, ** p < 0.01, *** p < 0.001. EUR = European ancestry. eduPGS = education polygenic score.
Note: * p < 0.05, ** p < 0.01, *** p < 0.001. EUR = European ancestry. SES = socioeconomic status, SIRE = self-identified race/ethnicity, eduPGS = education polygenic score.
Adjusting for European ancestry does not reduce the validity of eduPGS in the African-American sample (Table 10; Model 1b), nor does controlling for skin color. In fact, the validity here increases slightly relative to that found among European-Americans. Table 12 shows the regression results for European-Americans (Model 2). Comparing the monoracial African-American SIRE beta (Model 1b, Table 10) to the European-American beta (Model 2, Table 12) gives an African-American validity 52% the size of the European-American validity.
Note: * p < 0.05, ** p < 0.01, *** p < 0.001. EUR = European ancestry. SES = socioeconomic status, SIRE = self-identified race/ethnicity, eduPGS = education polygenic score.
If we take the beta of 0.124 in the AA-only sample (Model 1b, Table 10) as an estimate of the LD-unbiased effect of eduPGS on g, the g-score differences due to eduPGS would be 0.124 × 1.89 (i.e., the Cohen's d for the African-/European-American eduPGS difference) = 0.23. This implies that 20%-25% of the difference in g can be accounted for by presently known cognitive ability-and education-related PGS.
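The arithmetic behind this estimate can be laid out step by step. The sketch below only restates the calculation; it inherits the premise stated in the text (that the beta of 0.124 is an unbiased estimate of the eduPGS effect), and the choice of d = 0.984 from the descriptive statistics as the denominator is an assumption, since the text does not say which gap estimate was used.

```python
beta_edupgs_on_g = 0.124  # Model 1b, Table 10 (SD of g per SD of eduPGS)
d_edupgs = 1.89           # group difference in eduPGS, in SD units
d_g = 0.984               # group difference in g (assumed denominator)

# Difference in g implied by the eduPGS difference, in SD units.
implied_g_difference = beta_edupgs_on_g * d_edupgs

# Share of the observed difference in g that this represents.
share_of_gap = implied_g_difference / d_g

print(f"implied difference = {implied_g_difference:.3f} SD")
print(f"share of gap = {share_of_gap:.1%}")
```

The resulting share falls inside the 20%-25% range quoted above. Any bias in the beta or in the eduPGS difference carries straight through to the result, because the estimate is a simple product of the two.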
Concerns have been raised that the eduPGS computed by Lee et al. [84] may be biased against non-Europeans due to the inclusion of population-specific variants [101] (for a more general discussion see [99]). These population-specific variants would have very low frequency in non-European populations. Following Piffer's suggestion [101], as a robustness check, we computed MTAG eduPGS excluding variants with MAF < 0.01 (leaving 7636 overlapping variants) and < 0.05 (leaving 7172 overlapping variants) among African lineages. The 1000 Genomes reference samples were used to determine the African MAF. This exclusion had no substantive effect on the eduPGS validity with respect to g for either African or European-Americans. For SIRE monoracial African-Americans, r = 0.112 with MAF ≥ 0.01 and r = 0.114 with MAF ≥ 0.05.
Theory predicts that, in the absence of countervailing confounds, variants which show concordant as opposed to discordant effects across racial groups, will exhibit higher validity in non-discovery populations (Piffer, personal communication, February 3, 2019; [90]). This is because variants which show transracially concordant effects are more likely to be causal and also because causal variants, as opposed to variants in LD with causal ones, are more likely to have transethnic predictivity [89,102]. These predictions are borne out in this sample. Among European-Americans, the predictivity for concordant MTAG_10k eduPGS was higher (r concordant = 0.223 vs. r discordant = 0.145), and among African-Americans, it was even more so (r concordant = 0.223 vs. r discordant = -0.128). Similarly, when we grouped variants by the p value for the interaction between race and the effect on g (with higher p values signifying less discordance in effect), among African-Americans, the variants with higher p values were substantially more predictive of g (r high_P = 0.141 vs. r low_P = 0.035), while among European-Americans, this was not the case (r high_P = 0.265 vs. r low_P = 0.263). Cross-validation confirmed that the concordant PGS had higher predictive validity among the non-discovery population. However, the concordant PGS subset did not seem to have higher predictivity, based on cross-validation, than the combined set. The identical validity for concordant PGS could not be sustained in cross-validation. Thus, we do not use these disaggregated PGS in further analysis.

Path Analysis
Finally, we depict our results in path model form. As seen in Figure 4, substantial differences exist in the transethnic validity of the eduPGS. We therefore limited our path analysis to monoracial African-Americans. The model was run with the lavaan R package with missing data handled using listwise deletion. (Missing color scores were not MCAR, so we did not impute data.) Results are shown in Figure 5. We leave European ancestry unstandardized, as in previous analyses, while the other variables were standardized. Thus, the total effect of European ancestry on cognitive ability is calculated as (1) the unstandardized path coefficient between European ancestry and cognitive ability (B = 0.512) plus (2) the unstandardized path between European ancestry and color times the standardized path coefficient between color and cognitive ability (−2.209 × −0.066 = 0.146) plus (3) the unstandardized path between European ancestry and eduPGS multiplied by the standardized path coefficient between eduPGS and cognitive ability (2.515 × 0.131 = 0.329).
Table 13 shows the estimates for the different paths. From the table, while European ancestry and eduPGS had significant effects, color did not (although it was trending toward significance, p = 0.12). Additionally, the covariance between color and eduPGS was not significant. The association between European ancestry and g is more substantially mediated by the cognition-related eduPGS than by color scores. An analysis of the effect of color within sibling pairs is included in the supplement but the results are extremely tentative due to a lack of statistical power and the unconfirmed validity of our measure of color.
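The total-effect decomposition described for the path model can be summed explicitly. The following is a minimal sketch using the three path products quoted in the text; the function name and the share calculation are illustrative additions, and the resulting total is simple arithmetic on the reported coefficients rather than a figure stated in the paper.

```python
def total_effect(direct, indirect_paths):
    """Direct path plus the sum of products along each two-step indirect path."""
    return direct + sum(a * b for a, b in indirect_paths)

# Coefficients quoted in the text for the monoracial African-American path model.
direct = 0.512               # European ancestry -> cognitive ability
via_color = (-2.209, -0.066)  # ancestry -> color, color -> cognitive ability
via_edupgs = (2.515, 0.131)   # ancestry -> eduPGS, eduPGS -> cognitive ability

total = total_effect(direct, [via_color, via_edupgs])

# Illustrative breakdown of how much of the total runs through each route.
share_direct = direct / total
share_color = (via_color[0] * via_color[1]) / total
share_edupgs = (via_edupgs[0] * via_edupgs[1]) / total

print(f"total effect: {total:.3f}")
print(f"direct: {share_direct:.1%}, via color: {share_color:.1%}, via eduPGS: {share_edupgs:.1%}")
```

Under these figures the eduPGS route carries a larger indirect effect than the color route, in line with the mediation statement in the text.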

The Spearman-Jensen Hypothesis
Spearman's hypothesis holds that racial differences in cognitive ability are largely the result of differences in g. This theory predicts a strong positive correlation between subtest g loadings and the vector of group differences. We test this theory using Jensen's method of correlated vectors (MCV; [2]) by examining the correlation between the vector of g loadings, the vector of subtest heritability, and four vectors of group differences. The vector of g loadings consists of the g loadings for the 10 subtests found to be measurement invariant (MI) for African- and European-Americans. The vector of heritability is the average of the African- and European-American heritabilities reported by Mollon et al. [65]. The vectors of group differences were: (1) the correlation with European ancestry (for the combined sample), (2) the correlation with MTAG 10k eduPGS for the European-American sample, (3) the correlation with MTAG 10k eduPGS for the monoracial African-American sample, and (4) the African-/European-American mean group differences. These results are shown in Table 14. Spearman's hypothesis is strongly confirmed in our data. Not only is there a large Jensen effect on European ancestry, there is an equally large Jensen effect for the subtest correlations with the MTAG eduPGS. The correlations with the heritabilities are also positive, although these were estimated based on small samples (as far as behavior genetics goes) and thus this vector correlation is downwardly biased. If one includes the remaining 5 subtests, the correlations are about the same magnitude.
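In its simplest form, the method of correlated vectors is a Pearson correlation computed across subtests between two vectors. The sketch below illustrates the mechanics only: the ten values in each vector are hypothetical placeholders, not the entries of Table 14, and the helper function is written out by hand to keep the example self-contained.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# HYPOTHETICAL values for 10 subtests (not the paper's data).
g_loadings = [0.45, 0.52, 0.58, 0.61, 0.66, 0.70, 0.73, 0.78, 0.81, 0.85]
group_diffs = [0.40, 0.48, 0.55, 0.52, 0.63, 0.71, 0.69, 0.80, 0.84, 0.90]

# A positive vector correlation is what is termed a Jensen effect.
mcv_r = pearson(g_loadings, group_diffs)
print(f"vector correlation: {mcv_r:.3f}")
```

The same function would be applied to each of the four group-difference vectors in turn. Because the unit of analysis is the subtest, the correlation rests on only ten data points here.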

Assessment of Measurement Invariance with Respect to Ancestry
Initially, we confirmed that MI held between self-identified African- and European-Americans. However, it is also possible to examine whether MI holds with respect to genetic ancestry using LSEM (see [103]). A finding of MI along the whole range of admixture would preclude many group-specific factors which are commonly proposed to explain potential admixture-ability or other admixture-outcome relationships.
We fit an LSEM on the full African- and European-American sample (n = 7399) using 21 focal points (i.e., 5% ancestry increments ranging from 0% to 100%). From 0% to 100% European ancestry there was no deviation from MI. While the mean levels in g and complex reasoning increased linearly, the mean levels in episodic memory and executive functioning decreased linearly. MI appears to be tenable across the whole range of ancestry and, as well, the relationship between latent abilities and genetic ancestry comports with Spearman's hypothesis.
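The focal-point grid can be made concrete with a short sketch. LSEM is commonly described as fitting the model repeatedly, once per focal point, with each observation weighted by a kernel centered on that point. The code below generates the 21 focal points and Gaussian kernel weights. The bandwidth value and the ancestry values are hypothetical, and nothing here reflects the authors' actual software settings.

```python
from math import exp

def focal_points(step=0.05):
    """Evenly spaced focal points from 0 to 1 inclusive."""
    n = round(1 / step)
    return [i / n for i in range(n + 1)]

def gaussian_weights(values, focal, bandwidth):
    """Gaussian kernel weight of each observation relative to one focal point."""
    return [exp(-0.5 * ((v - focal) / bandwidth) ** 2) for v in values]

def effective_n(weights):
    """Kish effective sample size implied by a set of weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

points = focal_points()  # 0.00, 0.05, ..., 1.00

# HYPOTHETICAL ancestry proportions and bandwidth, for illustration only.
ancestry = [0.05, 0.12, 0.19, 0.25, 0.40, 0.97, 0.99, 1.00]
weights_at_20 = gaussian_weights(ancestry, focal=0.20, bandwidth=0.10)

print(len(points), "focal points")
print("effective n near 20% ancestry:", round(effective_n(weights_at_20), 2))
```

Observations close to a focal point dominate the fit at that point, so parameter estimates can be traced as a smooth function of ancestry and checked for drift in loadings or intercepts.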
We additionally fitted LSEMs with SES. When standardized SES was modeled from its lowest to its greatest value, it showed measurement invariance but failed to account for the differences in the latent means between racial groups despite having a similar, albeit smaller, effect compared to European admixture when modeled as a background variable in the MGCFA model. The important takeaways from the LSEM results are that MI with respect to both SES and admixture is tenable, admixture can fully account for group mean differences in latent abilities while our present measure of SES cannot, and, as MI implies that the differences between groups on tests are due to differences in means and variances alone, this appears to be the case with admixture and race generally.

Discussion
We conducted an admixture study to assess the association between cognitive ability and European ancestry in the African- and European-American populations. Earlier researchers (e.g., [51,52]) were unable to detect significant relationships between ancestry and cognitive ability. We, however, found a strong, significant association between European genetic ancestry and cognitive ability, confirming the recent results of both Kirkegaard et al. [24] and Warne [25].

Global Admixture Analysis and Past Research Findings
The betas we calculated for European ancestry were very similar across models: B = 0.78 for the self-identified monoracial African-American-only sample (Model 1, Table 5) and B = 0.75 for the combined African- and European-American sample (with controls for SIRE; Model 2, Table 6). The former beta corresponds to a bivariate correlation of only r = 0.086 (N = 2179; Table 4). This small correlation is misleading as to its importance. The correlation is small in magnitude simply because the variation in European ancestry in the sample is low (or restricted). As Jensen [104] noted, constrained variability results in attenuated admixture correlations. Because of this, a small correlation within a population with restricted ancestry variance can be consistent with a large overall effect of admixture. For example, our African-American sample averaged 18.7% (SD = 11.7%) European ancestry. Compared with European-Americans (98.6%), this is a difference of (98.6% − 18.7%)/11.7% = 6.83 standard deviations of European ancestry. Thus, the predicted difference in cognitive ability between African- and European-Americans (or, rather, African-Americans with the same amount of European ancestry as European-Americans in this sample), based on our observed bivariate association, would be 0.086 × 6.83 = 0.587 SD. This is relative to the African-/European-American difference of d = 0.984 in cognitive ability (from Table 1). Thus, this small correlation of r = 0.086 is consistent with about 60% of the mean difference being explained by genetic ancestry.
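The range-restriction argument above reduces to three steps, reproduced in the sketch below with the figures quoted in the text. Variable names are illustrative. The calculation treats the within-sample correlation as a standardized slope and extrapolates it linearly well beyond the observed ancestry range, which is an assumption of the argument rather than something the data establish.

```python
# Figures quoted in the text; variable names are illustrative only.
mean_aa = 18.7   # mean % European ancestry, African-American sample
sd_aa = 11.7     # SD of % European ancestry, African-American sample
mean_ea = 98.6   # mean % European ancestry, European-American sample
r = 0.086        # bivariate correlation between ancestry and cognitive ability
d_obs = 0.984    # observed African-/European-American difference (Table 1)

# Step 1: ancestry gap expressed in within-sample standard deviations.
gap_in_sd = (mean_ea - mean_aa) / sd_aa

# Step 2: linear extrapolation of the standardized slope across that gap.
predicted_diff = r * gap_in_sd

# Step 3: predicted difference as a share of the observed difference.
share = predicted_diff / d_obs

print(f"ancestry gap: {gap_in_sd:.2f} SD")
print(f"predicted difference: {predicted_diff:.3f} SD")
print(f"share of observed difference: {share:.1%}")
```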
Note that the correlation Scarr et al. [52] found was 0.05/0.086 = 58% of the correlation we found. The attenuation in correlation is likely attributable to the reduced validity of their admixture index and the lower heritability of their measure of cognitive ability. Thus, our study largely vindicates Jensen's [104] reanalysis of the Scarr et al. [52] data.
We found very large Jensen effects for both the African-/European-American difference (r = 0.929) and the association between ancestry and g loadings (r = 0.941). The finding of a Jensen effect on ancestry is also consistent with the results presented by Scarr et al. [52]. As Loehlin [105] seems to have implied, the association between admixture and ability in the Scarr et al. [52] sample comported with the typical weak form of Spearman's hypothesis in that European ancestry was positively related (r = 0.1) with Raven's Progressive Matrices scores and negatively related (r = −0.12) with a memory measure. Indeed, using average g loadings provided by Jensen [19] (cf. [106]), the correlation between the African-/European-American score differences and g loadings is r = 0.64, while the correlation between the African-American-only g loadings and the score differences between the most- and least-African in this group is r = 0.84.
The only other study which has examined Spearman's hypothesis in the context of intra-SIRE ancestry-related differences is Hu et al. [18]. The authors studied participants from the National Longitudinal Survey of Youth 1997. Among African-Americans, Hu et al. [17] found that darker color was associated with lower g scores while parent-reported European ancestry was associated with higher g scores. Jensen effects were found for both indices of ancestry. It seems that Spearman's hypothesis generalizes to ancestry-related differences within self-identified racial groups.
Our study goes beyond those of Kirkegaard et al. [24] and Warne [25] in that we (as with Hu et al. [17]) established both SFI and the weak form of Spearman's hypothesis regarding the African-/European-American difference. On the basis of this result we conclude that the SIRE differences in this battery are typically sized and consistent with prior investigations of ability patterns in African- and European-American samples. Because the test battery is unbiased and it assesses the same traits in both populations, the difference observed here is a real difference in the cognitive abilities being measured.
We go beyond Hu et al. [17] in that we included a measure of both color and genetic ancestry in our regression analyses. As predicted by the human capital model (Hu et al. [17], Prediction 7), we found that color had no incremental validity beyond ancestry. This lack of incremental validity for color is possibly due to power. Color scores were available for only a subset of individuals since several of the arrays did not include all the SNPs needed by the algorithm to compute the scores. Thus, we cannot completely rule out an incremental effect of color in the present dataset. We can, however, conclude that the association between European ancestry and g is not secondary to that between color and g.
It is worth noting that in the path model we did not find a significant association between skin color and eduPGS. This could be due to a lack of statistical power. This may, however, suggest a limited role for assortative mating (or, for that matter, pleiotropy), which has been suggested by some (e.g., [2]) in explaining the color-IQ association among African-Americans.
Beyond color, we also controlled for SES. Although not as strong a predictor as ancestry, SES was a useful predictor of cognitive ability which modestly attenuated the effect of ancestry on cognitive ability. These results, however, are expected in both genetic and non-genetic models because environmental variables and familial characteristics like SES also reflect genetic influences [2,107,108]. Controlling for these variables may therefore spuriously weaken the overall association between ancestry and the relevant genetic factors that actually explain their influence. Conditions in which cross-sectional data reflect real mediation are exceptional [109], so future investigations should attempt to tease out causality with longitudinal designs and local admixture mapping. A particularly strong research design would be to investigate sibling variation in admixture just as Visscher et al. [110] did with relatedness. Very large samples are needed for this design if using full siblings; if data can be found using half- or adopted siblings, then the required sample sizes would be more reasonable.

eduPGS Findings and Past Research
We also evaluated the transethnic validity of eduPGS. We found that the eduPGS with the highest validity for g in both the African- and European-American samples was the MTAG_EA_10K set. Although the validity in the African-American sample was approximately half of that in the European-American sample (r AA = 0.1115; r EA = 0.2269), the relations were statistically significant in both populations (p < 0.0001). As with Piffer [111], we found large African-/European-American differences in these eduPGS (d = 1.89). Using the beta in the African-American sample and controlling for the effect of European ancestry (B = 0.124; Model 1b; Table 10), we estimate that the known eduPGS can naïvely explain as much as 20%-25% of the African-/European-American intelligence gap.
Removing variants with low MAF in African 1000 Genomes lineages had little effect on the validity of MTAG_10K eduPGS among African and European-Americans. Thus, contrary to some arguments, European-specific alleles do not seem to be biasing prediction with eduPGS. Our results corroborate those of Piffer [111] who found a strong ecological correlation between MTAG-derived SNPs and population IQ (r = 0.86). We also found, using both regression and path analysis that, while the eduPGS mediates the association between European ancestry and cognitive ability, skin color scores do not.

Alternative Explanations and Methods to Further Reduce Uncertainty
At least one researcher has advanced the possibility that admixture designs are hopelessly confounded due to historical assortative mating [112]. This theory supposes that the European ancestors of African-Americans were genetically disposed toward either lower or higher cognitive ability in a systematic way. Since Piffer [111] found that other African descent groups have similarly low eduPGS, this explanation would necessitate that the eduPGS are not indexing genetic differences related to cognitive ability in African groups. Determining if this is the case is important for future research. We can directly investigate this hypothesis and further test the transethnic validity of the eduPGS with local admixture mapping. If the theory implied here is correct, European ancestry segments should have systematically higher or lower eduPGS in African-American samples. Relatedly, if eduPGS lack transethnic validity in Africans, they will not be associated with specific regions of the brain, biological variables, or consistent patterns of gene expression as they are in Europeans.
Our data are compatible with a between-group heritability (variance explained by European ancestry) of between 50% and 70% depending on the model chosen (see Scarr et al. [52], p. 85). This estimate of between-group heritability is consistent with Rushton and Jensen's [9] hereditarian model, according to which 50%-80% of the African-/European-American cognitive difference is due to genetic differences.
While the statistical mediation by PGS scores suggests that genetic factors may be involved, as discussed in detail by Kirkegaard et al. [24], we cannot rule out many types of confounding environmental variables with this research design. Global admixture analysis results are suggestive and should only be considered a first step for investigating the effects of admixture on a trait. We suggest attempting to replicate the current results using a nationally representative sample and then, if these findings are confirmed, proceeding to admixture mapping (local admixture analysis). This is the standard approach taken in medical epidemiology.
We suggest two approaches to further reduce the uncertainty regarding the causes of the African-/European-American cognitive ability gap. First, attempts should be made to replicate the current results using other samples (e.g., the Add Health study; [64]). Second, local ancestry analysis/admixture mapping to examine the regions of the genome where the association with ancestry is most pronounced is a natural follow-on project. The rationale of such an analysis has been explained by others (e.g., [17,24,113,114]). It would also be worthwhile to attempt to replicate these results in admixed American populations outside of the United States (e.g., Brazil, Colombia, etc.). This research project can and should be expanded to other ethnic groups both in and outside of the United States. Examples include Aborigines in Australia, Cape Coloureds in South Africa, and the Mestizo population in Mexico. For some groups, it may be useful to examine differences in broad abilities (e.g., spatial or mathematical ability) instead of general ability as done here, as differences may not be general.

General Conclusion
Rushton and Jensen [9] called for modern genetic studies to test the hereditarian model. They predicted that "for those Black individuals who possess more White genes, their physical, behavioral, and other characteristics will approach those of Whites" ([9], p. 262). In the present study, we confirmed that this was the case for general cognitive ability. Moreover, we showed that the association between European ancestry and g was substantially mediated by eduPGS rather than skin color PGS. These results provide support for a hereditarian model.
We conducted several analyses aimed at testing a genetic hypothesis for the African-/European-American difference in cognitive ability. We found that European ancestry was a consistent predictor of cognitive ability, even after entering various controls into our models. The large observed eduPGS differences were capable of predicting substantial proportions of the cognitive ability gap, which suggests the possibility of mediation. Future assessments with cross-racially valid PGS should attempt to assess this possibility more clearly. Our global admixture-based results suggest a contribution of admixture to the African-/European-American cognitive ability gap and should encourage future investigations at the level of local admixture.