Association of a Total Cholesterol Polygenic Score with Cholesterol Levels and Pathological Biomarkers across the Alzheimer’s Disease Spectrum

Midlife hypercholesterolemia is a well-known risk factor for sporadic Alzheimer’s disease (AD), and like AD, it is highly influenced by genetics, with heritability estimates of 32–63%. We thus hypothesized that genetics underlying peripheral blood total cholesterol (TC) levels could influence the risk of developing AD. We created a weighted polygenic score (TC-PGS) using summary data from a meta-analysis of TC genome-wide association studies for evaluation in three independent AD-related cohorts spanning pre-clinical, clinical, and pathophysiologically proven AD. The APOE-ε4 variant was purposely included in the analysis as it represents an already well-established genetic risk factor for both AD and circulating TC. We could vastly improve the performance of the score when considering p-value thresholds for inclusion in the score, sex, and statin use. This optimized score (p-value threshold of 1 × 10⁻⁶ for inclusion in the score) explained 18.2% of the variance in TC levels in statin-free females compared to 6.9% in the entire sample and improved prediction of hypercholesterolemia (receiver operator characteristics analysis revealed an increase in area under the curve from 70.8% to 80.5%). The TC-PGS was further evaluated for association with AD risk and pathology. We found no association between the TC-PGS and either of the AD hallmark pathologies, assessed by cerebrospinal fluid levels of Aβ-42, p-Tau, and t-Tau, and by 18F-NAV4694 and 18F-AV-1451 positron emission tomography. Similarly, we found no association with the risk of developing amyloid pathology or of becoming cognitively impaired in individuals with amyloid pathology.


Introduction
Alzheimer's disease (AD), dementia, and cognitive impairment are multifactorial in nature; their cause and progression are typically influenced by a combination of risk factors such as old age, genetics, and lifestyle factors [1]. Approximately half of the AD phenotypic variance is explained by genetics; however, most of the genetic variants remain unidentified [2], as do the mechanisms by which they act.
One of the suggested lifestyle factors involved in AD is the level of blood total cholesterol (TC) [1], as well as LDL content. For example, several studies have shown that high TC levels, specifically in midlife, are associated with an increased risk of developing AD [3][4][5][6][7][8], yet others have shown little or no association [9][10][11]. Moreover, higher circulating cholesterol levels (TC or LDL cholesterol) have been associated with increased amyloid load [12][13][14] and hypometabolism in brain regions affected by AD [3]. Late-life TC levels have also been examined, but again with contradictory results. In a study with nursing home residents, levels of blood TC were found to be significantly increased in pathologically defined AD patients, compared to individuals free from AD pathology [15,16]. Similarly, when compared to non-demented subjects with atherosclerotic heart disease, TC levels were found to be increased in individuals with possible clinical or probable AD [17]. Contrary to these findings, one study found that TC levels were decreased in AD individuals compared to controls [18].
A factor that could partly explain these variable results is the fact that AD is a clinicopathological construct [19], and a clinical diagnosis of probable AD has a sensitivity of 81% and a specificity of 70% to predict definite AD (pathophysiologically proven) [20]. This has very recently led to a proposal of new guidelines for the definition of AD in research settings by the NIA-AA Research Framework [21]. These guidelines propose that, for research purposes, AD should be defined as a biological construct determined by the presence of pathology as assessed with the *A/T/N* classification system, depending on levels of amyloid-β (Aβ, A), phosphorylated TAU (p-Tau, T) and neurodegeneration (N) [22]. When pathology data is not available, proxies of the aforementioned pathologies have been developed using cerebrospinal fluid (CSF) or positron emission tomography (PET) measurements. Of note, this biological definition was proposed to also work with current clinical diagnoses of AD; e.g., AD neuropathological change with or without accompanying cognitive decline. In this study, we have used the *A/T/N* framework to define participants according to their neuropathological status and to refine their clinical statuses (healthy or AD).
Similar to AD, TC levels are also markedly influenced by genetics [23][24][25]. For example, heritability is estimated to be 58–79% for AD [26] and 32–63% for TC [27]. Considering the genetic background of both conditions and the fact that they are linked in terms of risk, it is possible that some of the genetic variance seen in AD can be explained by variants influencing blood cholesterol levels. The best example is the APOE-ε4 allele, which is both a highly significant risk factor for sporadic AD and a potent modulator of TC in the blood.
Along these lines, an early study did investigate the effect of a TC polygenic score (TC-PGS) in AD but failed to reveal any significant effects [28]. However, only patients with clinically defined AD were investigated in this paper. In addition, scores were based only on genome-wide significant single nucleotide polymorphisms (SNPs) compared to performing an evaluation of the best p-value cut-off, and the polygenic score only explained a small portion of the variance (3.6%) in cholesterol levels. It is thus possible that the inclusion of low effect loci in the score and using the new classification system for AD could reveal important associations.
The aims of this study were to first examine multiple TC-PGSs to determine the effect of inclusion of low effect loci and, at the same time, evaluate the influence of factors such as sex and statin use. Secondly, after determining the score with the best prediction, we aimed at investigating the TC-PGS in the context of AD as a biological construct, examining associations with the *A/T/N* pathologies and cognition in individuals with AD pathology.
We show, using three different AD-related cohorts that cover the pre-symptomatic to the symptomatic end-stage of the disease, that despite creating an improved TC-PGS, no associations with either pathology or cognition could be detected.

The Meta-Analysis Summary Data
Summary statistics from the Global Lipids Genetics Consortium (GLGC) meta-analysis of TC GWASs [24] were downloaded from csg.sph.umich.edu/willer/public/lipids2013/ (downloaded on 21 June 2018). Results from the joint analysis of Metabochip and GWAS data were used. Before being used for scoring, ambiguous SNPs were excluded, and only SNPs present in all three target data sets were kept. The details of this data set are described elsewhere [24]. Briefly, these data are based on 63 blood total cholesterol genome-wide association studies (GWASs) comprising a total of 114,230 individuals [24]. Of these studies, 48 were of European and 15 of non-European ancestry. The proportion of women in these studies ranged from 0 to 76.8%, and the mean age ranged from 16 to 75 years. Most studies investigated individuals free of lipid-lowering drugs (44/63), and the majority had a fasting regime before cholesterol measurements (51/63). The raw data contained 2,446,981 SNPs, of which 15.4% were ambiguous. These were excluded, resulting in a data set of 2,069,037 SNPs.
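The SNP filtering described above can be illustrated with a minimal Python sketch. It assumes that "ambiguous" means strand-ambiguous (palindromic) A/T and C/G allele pairs, which is the usual meaning but is not spelled out in the text; the function names and the toy SNP identifiers are hypothetical, and this is not the authors' own code (their analyses were run in R and PLINK).

```python
# Illustrative sketch only: data layout and helper names are assumptions.
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}


def is_ambiguous(allele1, allele2):
    """True for strand-ambiguous (palindromic) allele pairs A/T and C/G."""
    return frozenset((allele1.upper(), allele2.upper())) in AMBIGUOUS_PAIRS


def filter_summary_snps(summary, target_snp_sets):
    """Keep non-ambiguous SNPs that are present in every target cohort.

    summary: dict mapping SNP id -> (effect allele, other allele)
    target_snp_sets: list of sets of SNP ids, one per target cohort
    """
    shared = set.intersection(*map(set, target_snp_sets))
    return {
        snp: alleles
        for snp, alleles in summary.items()
        if snp in shared and not is_ambiguous(*alleles)
    }


# Toy data: snp_2 and snp_4 are palindromic and are dropped.
summary = {
    "snp_1": ("A", "G"),
    "snp_2": ("A", "T"),
    "snp_3": ("C", "T"),
    "snp_4": ("G", "C"),
}
cohorts = [
    {"snp_1", "snp_2", "snp_3"},
    {"snp_1", "snp_3", "snp_4"},
    {"snp_1", "snp_2", "snp_3", "snp_4"},
]
kept = filter_summary_snps(summary, cohorts)
print(sorted(kept))
```

Ambiguous pairs are removed because their strand cannot be resolved from the alleles alone, so effect directions could be flipped when summary and target data are matched.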

PREVENT-AD
The Pre-symptomatic Evaluation of Novel or Experimental Treatments for Alzheimer's Disease (PREVENT-AD, openpreventad.loris.ca/, accessed on 26 January 2018) cohort, based at the Centre for Studies on the Prevention of AD in Montreal, Canada (StoP-AD, douglas.research.mcgill.ca/stop-ad-centre, accessed on 26 January 2018), is a longitudinal study of older, healthy individuals (55+) with a parental or multiple-sibling history of AD [29]. Data for all variables were obtained from data release 5.0 (30 November 2017) except for APOE genotype, PET, and genetic data. For these variables, the latest data available at the center were used; these data are to be included in future data releases. Each participant and study partner provided written informed consent. All procedures were approved by the McGill University Faculty of Medicine Institutional Review Board and complied with the ethical principles of the Declaration of Helsinki. In this cohort, 382 individuals were genotyped and selected for evaluation. Of these, 41 were excluded during quality control procedures and 35 were excluded due to lack of data for covariates and target phenotypes, resulting in a final data set of 306 individuals.

ADNI
Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu, accessed on 3 December 2015). The ADNI was launched in 2003 as a public-private partnership led by Principal Investigator Michael W. Weiner, MD in the US. The primary goal of ADNI has been to test whether serial magnetic resonance imaging, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. For up-to-date information, see www.adni-info.org (accessed on 3 December 2015). For this study, a subset of ADNI consisting of individuals with genetic data and a family history of AD (first-degree relative affected) was used. All data were downloaded on 3 December 2015, except the CSF data, which were downloaded on 22 June 2018. The final data set, after quality control and merging of data (see below; Genetic data-Merging of ADNI data), contained 1065 individuals. These were further filtered for having a family history of AD, resulting in a data set of 401 individuals.

ROSMAP
ROSMAP consists of two longitudinal clinical-pathologic cohort studies of aging and AD from the Rush Alzheimer's Disease Center in the US (www.radc.rush.edu/, accessed on 28 May 2019) [30]. In this study, a subset containing individuals with genetic and pathology data was used. Briefly, 1081 individuals passed genomic quality control steps.

Quality control [31] of the genetic data was done similarly for all cohorts and was performed in PLINK v1.9 [32,33] as follows: heterozygous haploid genotypes excluded, sex check, relatives excluded (identity by descent > 0.1875), sample and genotyping call rate > 0.95, minor allele frequency > 0.05, and Hardy-Weinberg equilibrium < 1 × 10⁻⁶. SNPs were then matched with the GRCh37 genome (www.ncbi.nlm.nih.gov/assembly/GCF_000001405.13/#/st, accessed on 28 May 2019). A principal component analysis with 1000 Genomes phase 3 data as a reference [34,35] was performed to determine ancestry and filter for individuals with European ancestry. Briefly, long-range linkage disequilibrium regions and ambiguous SNPs were first excluded from 1000 Genomes data, which were then merged with target cohort genetic data. Any ambiguous SNPs from target data were then excluded. Merged data were pruned with a sliding window of 2000 bp with a step size of 200 bp, excluding SNPs with an R² > 0.2 (--indep-pairwise 2000 200 0.2), and PCs were calculated (--pca) in PLINK [32,33]. Averages and standard deviations of PC1 and PC2 for Europeans in 1000 Genomes were determined, and target cohort individuals were determined to be of European ancestry if their PC1 and PC2 fell within ± 3 SD of the 1000 Genomes cohort means. PCs were then calculated again within the European target cohorts to include as covariates in subsequent analyses.
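The ancestry filter based on PC1 and PC2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name, the data layout, and the use of the sample standard deviation are all assumptions, and the PC values are invented toy numbers.

```python
from statistics import mean, stdev


def european_ancestry_filter(reference_pcs, target_pcs, n_sd=3.0):
    """Return ids of target samples whose PC1 and PC2 both lie within
    n_sd standard deviations of the reference (European) means.

    reference_pcs: list of (pc1, pc2) tuples for the reference Europeans
    target_pcs: dict mapping sample id -> (pc1, pc2)
    """
    bounds = []
    for axis in (0, 1):
        values = [pcs[axis] for pcs in reference_pcs]
        centre, spread = mean(values), stdev(values)  # sample SD (assumption)
        bounds.append((centre - n_sd * spread, centre + n_sd * spread))
    return [
        sample
        for sample, pcs in target_pcs.items()
        if all(low <= pc <= high for pc, (low, high) in zip(pcs, bounds))
    ]


# Toy values: s2 is an outlier on PC1 and s3 is an outlier on PC2.
reference = [(0.010, 0.020), (0.012, 0.018), (0.008, 0.022), (0.011, 0.019)]
targets = {
    "s1": (0.0105, 0.0200),
    "s2": (0.0500, 0.0200),
    "s3": (0.0100, -0.0300),
}
print(european_ancestry_filter(reference, targets))
```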

Imputation
PREVENT-AD, a subset of ADNI ("ADNI 1 GWAS" data set, see below), and ROSMAP data were imputed using the Sanger Imputation Service [36] (imputation.sanger.ac.uk/, accessed on 3 December 2015). Briefly, quality-controlled genetic data were uploaded, pre-phased with SHAPEIT2 [37], and imputed with the positional Burrows-Wheeler transform [38] using the 1000 Genomes cohort [34,35] as a reference panel. Only imputed SNPs with an info score greater than 0.7 were kept (similar to [39]) to balance the quantity of excluded data (14% in [39]) with data quality.

Merging ADNI Data
Two genomic data sets were used for ADNI; the "ADNI 1 GWAS" data set genotyped using the Illumina Human610-Quad BeadChip, and the "ADNI WGS" data set genotyped using a whole-genome sequencing platform. Before merging, both data sets were quality controlled and "ADNI 1 GWAS" data was imputed. Some individuals were present in both data sets, in which case data from the "ADNI WGS" data set was used. The merged genetic data set contained 6,164,853 SNPs and 1065 individuals.

TC and Hypercholesterolemia Measurements
In PREVENT-AD, TC levels were assessed in plasma drawn from non-fasting individuals at the eligibility visit (i.e., before baseline measurements). In ADNI, TC levels were assessed in whole blood drawn at the screening visit from fasting individuals. ADNI TC measurements were converted from mg/dL to mM by dividing values by 38.67 to match the PREVENT-AD data. ROSMAP was not used for blood TC analyses. A hypercholesterolemia variable was created for PREVENT-AD and ADNI by assuming that all individuals on statins and all non-treated individuals with TC levels > 6.2 mM were hypercholesterolemic [4].

Genes 2021, 12, 1805
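The unit conversion and the hypercholesterolemia rule described in this section amount to the following small Python sketch. The constants are the ones stated in the text; the function names are hypothetical and chosen for illustration only.

```python
MGDL_PER_MM = 38.67   # conversion factor stated in the text
TC_CUTOFF_MM = 6.2    # hypercholesterolemia threshold in mM


def tc_mgdl_to_mm(tc_mgdl):
    """Convert a total cholesterol value from mg/dL to mM."""
    return tc_mgdl / MGDL_PER_MM


def is_hypercholesterolemic(tc_mm, on_statin):
    """Statin users are always counted as hypercholesterolemic;
    untreated individuals only if TC exceeds the cut-off."""
    return on_statin or tc_mm > TC_CUTOFF_MM


# A TC of 240 mg/dL is just above the 6.2 mM cut-off.
print(round(tc_mgdl_to_mm(240.0), 2))
print(is_hypercholesterolemic(tc_mgdl_to_mm(240.0), on_statin=False))
```

Note that under this rule a statin user with a low measured TC is still classified as hypercholesterolemic, since treatment is taken as evidence of the underlying condition.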

CSF Measurements
In both PREVENT-AD and ADNI, CSF was obtained by lumbar puncture following an overnight fast. Levels of Aβ-42, p-Tau, and t-Tau were measured by the Innotest® ELISA (enzyme-linked immunosorbent assays, Fujirebio) [40] and the Roche Elecsys CSF immunoassays (data file UPENNBIOMK9_04_19_17.csv) [41,42], for PREVENT-AD and ADNI, respectively. Of note, the Elecsys Aβ-42 CSF immunoassay is currently under development for investigational use only and has an upper technical limit of 1700 pg/mL. Values above this limit are based on extrapolation of the calibration curve, and the performance of these values has not been formally established. These are still included in this study. In PREVENT-AD, Aβ-40 levels were further assessed by the MSD® MULTI-SPOT Assay System (V-PLEX Plus Aβ Peptide Panel 1 (6E10) Kit, MesoScale, Rockville, USA).

PET Imaging
PET scans were performed in PREVENT-AD using fluorine 18-labeled NAV4694 and AV-1451 (Flortaucipir, Montréal, Canada) to estimate the deposition of Aβ and TAU in the brain, respectively. Standardized uptake value ratios (SUVR) were computed by dividing tracer uptake by cerebellar gray matter uptake (Aβ) or by inferior cerebellar gray matter uptake (TAU). For details on PET procedures, see [13].

Amyloid Positivity Status
According to the recently proposed biological definition of AD, we categorized individuals as on or off the AD spectrum by the presence of amyloid pathology in the brain [21,22]. In PREVENT-AD, individuals were split into amyloid-negative (Aβ(−)) and amyloid-positive (Aβ(+)) groups based on Aβ PET values (Aβ(+) defined as SUVR > 1.37), similar to McSweeney and colleagues [43]. In ADNI, we used the CSF p-Tau/Aβ-42 ratio as a proxy for brain amyloid pathology, as described by Hansson et al. [44]. Briefly, the CSF values were extracted from the last available visit for each individual, and a ratio ≥ 0.028 was considered Aβ(+) and thus on the AD spectrum, whereas a lower ratio was considered Aβ(−). In ROSMAP, semiquantitative estimates of post-mortem neuritic plaque density, as recommended by the Consortium to Establish a Registry for Alzheimer's Disease (CERAD score), were used to define Aβ(+) individuals. This is a four-point scale, and individuals with scores of three or four were considered Aβ(+).
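The three cohort-specific rules for amyloid positivity can be summarized in a short Python sketch. The cut-offs are those given in the text; the function names are hypothetical, and the CERAD rule simply mirrors the statement above without interpreting the direction of the scale.

```python
PET_SUVR_CUTOFF = 1.37     # PREVENT-AD: positive if SUVR > 1.37
CSF_RATIO_CUTOFF = 0.028   # ADNI: positive if p-Tau/Abeta-42 >= 0.028
CERAD_POSITIVE = {3, 4}    # ROSMAP: scores treated as positive in the text


def amyloid_positive_pet(suvr):
    """PREVENT-AD rule based on the amyloid PET SUVR."""
    return suvr > PET_SUVR_CUTOFF


def amyloid_positive_csf(p_tau, abeta42):
    """ADNI rule based on the CSF p-Tau/Abeta-42 ratio."""
    return p_tau / abeta42 >= CSF_RATIO_CUTOFF


def amyloid_positive_cerad(score):
    """ROSMAP rule based on the four-point CERAD score."""
    return score in CERAD_POSITIVE


print(amyloid_positive_pet(1.52))          # above the SUVR cut-off
print(amyloid_positive_csf(30.0, 1000.0))  # ratio 0.030
print(amyloid_positive_cerad(2))           # not in the positive set
```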

Statistical Analyses
All statistical analyses were performed in R [45]. Data were handled with the "data.frame" [46] and "tidyverse" [47] packages and plotted with the "cowplot" [48] package. For a full list of data sets, software, and R packages used, and their respective links, see Supplementary Table S1. Values are reported as mean ± standard error of the mean (SE) if not otherwise stated.

Descriptive
Differences in cohort characteristics, such as age, sex, and TC levels, were analyzed with either a Welch two-sample t-test (comparing two cohorts) or an ANOVA (comparing the three cohorts) for continuous variables and with Pearson's chi-square test for categorical variables. Post-hoc analysis was performed if primary analyses were significant, and comparisons were between all three cohorts. Here, Tukey HSD was used for continuous variables and post-hoc chi-square test was used for categorical variables. R package "psych" [49] was used to compute summary statistics.

TC Levels and p-Value Thresholding
The relationship between each score and blood TC levels was evaluated by linear regression with genetic PCs 1-10, age, age², APOE-ε4 status, sex, and statin use as covariates. The additional R² explained was calculated as the difference in R² between a model containing only the covariates and a model containing the covariates and the TC-PGS. Standard deviations at each cut-off were determined by bootstrapping (n iterations = 5000) using the R package "boot" [50,51], and R² values were calculated using the "rcompanion" package [52]. Effects of statin use and sex on the relationship between the TC-PGS and TC levels were assessed by stratification, first by statin use and then by sex (in statin-free individuals). The TC-PGS that explained the most variance was selected for further analyses in all three cohorts, with interaction terms for sex and statin use when sample sizes allowed.
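Written out, the incremental variance explained compares two nested linear models. The notation below is introduced here for illustration and is not taken from the original: C_k denotes the covariates listed above (genetic PCs 1-10, age, age², APOE-ε4 status, sex, and statin use), and γ is the coefficient of the score.

```latex
\begin{align}
\text{Covariate model:}\quad TC_i &= \beta_0 + \sum_{k} \beta_k C_{ik} + \varepsilon_i \\
\text{Full model:}\quad TC_i &= \beta_0 + \sum_{k} \beta_k C_{ik} + \gamma\,\mathrm{PGS}_i + \varepsilon_i \\
\Delta R^2 &= R^2_{\text{full}} - R^2_{\text{covariates}}
\end{align}
```

In the stratified analyses, the stratifying variable (statin use or sex) is constant within each stratum and would therefore drop out of the covariate set for that stratum.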

Hypercholesterolemia ROC Analyses
Discrimination of hypercholesterolemic from healthy individuals was evaluated by ROC curve analysis and quantified by the AUC using the "pROC" package [53]. Data were stratified for sex, and the difference between a model containing the covariates (PCs 1-10, age, and age²) and a model containing the covariates plus the TC-PGS was evaluated with DeLong's test.
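As a reminder of what the AUC measures, the following Python sketch computes it directly as the probability that a randomly chosen case receives a higher model score than a randomly chosen control (the Mann-Whitney formulation). It is a minimal illustration with invented scores, not a substitute for the "pROC" package, and it does not implement DeLong's test.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of case-control pairs in which the case has
    the higher score, counting ties as one half.

    scores: predicted probabilities or linear predictors
    labels: 1 for cases (hypercholesterolemic), 0 for controls
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case in cases
        for control in controls
    )
    return wins / (len(cases) * len(controls))


# Three of the four case-control pairs are ranked correctly.
auc = roc_auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(auc)
```

Comparing the AUC of the covariate-only model with that of the model including the TC-PGS, as described above, then quantifies how much the score adds to discrimination.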

CSF and PET Linear Regression Analyses
Each dependent variable was examined for its distribution, transformed if not normally distributed, and analyzed with multiple linear regression. In PREVENT-AD, models were corrected for genetic PCs 1-10, age, APOE-ε4 status, and statin use, and were run with a sex*TC-PGS interaction term. In ADNI, the same covariates were used except for statin use, which was included in the interaction term (statin*sex*TC-PGS). Aβ PET data were not normally distributed, even after transformation, and were therefore analyzed both with a robust regression (same model as above) and by linear regressions after stratifying for Aβ(+) status. The Aβ(−) group had a sufficient sample size to be analyzed with the aforementioned model (n = 80), whereas the sample size of the Aβ(+) group was too small to run the same regression (n = 18). Thus, in the Aβ(+) group, the regression was run with age, APOE-ε4 status, statin use, and sex as covariates, and only the main effect of the TC-PGS was investigated.

Risk of AD and Cognitive Impairment Logistic Regression Analyses
We evaluated by logistic regression whether the TC-PGS was associated with the risk of being on the AD spectrum, defined as being Aβ(+). Cognition was analyzed in ADNI and ROSMAP, and these analyses were limited to Aβ(+) individuals to investigate individuals on the AD spectrum only. In both ROSMAP and ADNI, cognitive impairment (CI) was defined as having a clinical diagnosis of either MCI, AD, or other dementia. Associations of the TC-PGS with both Aβ(+) status and CI were evaluated by multiple logistic regressions. All models were corrected for genetic PCs 1-10, age, and APOE-ε4 status. In PREVENT-AD, statin use was further included as a covariate, and the model was run with a sex*TC-PGS interaction, whereas in ADNI statin use was included in the interaction term. Similar models were used in ROSMAP but without the statin factor, as these data were not available.

Conversion Rate
The effect of the TC-PGS on conversion rate in ADNI and age of onset in ROSMAP was evaluated with Kaplan-Meier survival analysis [54]. Before filtering, the TC-PGS was categorized into tertiles (i.e., low, medium, and high TC-PGS). In ADNI, individuals that were Aβ(+) with either no CI or an MCI diagnosis at baseline were selected. Follow-up time ranged from three to 120 months. Conversion was defined as developing a clinical diagnosis of AD. In ROSMAP, Aβ(+) individuals were selected, and conversion was defined as receiving a clinical diagnosis of either possible or probable AD. A larger sample size in ROSMAP allowed for stratification on sex. Analyses were done using the "survival" package [55,56]; the "ggfortify" package [57,58] was used for plotting, and the "survminer" package [59] was used for creating survival tables.
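The two steps of this analysis, splitting the score into tertiles and estimating a Kaplan-Meier curve per group, can be sketched in plain Python as follows. The function names and toy data are hypothetical, ranks rather than quantile cut-points are used to form the tertiles (an assumption), and the between-group test is not implemented here.

```python
def tertile_labels(scores):
    """Label each score low/medium/high by rank so the groups are near-equal."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[3 * rank // n]
    return labels


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per individual (e.g., months)
    events: 1 if conversion was observed at that time, 0 if censored
    """
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        converted = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - converted / at_risk
        curve.append((t, survival))
    return curve


groups = tertile_labels([0.3, -1.2, 0.9, 0.0, 2.1, -0.4])
print(groups)

# Five individuals: conversions at 6, 12 and 24 months, two censored.
km = kaplan_meier([6, 12, 12, 24, 36], [1, 0, 1, 1, 0])
for time, prob in km:
    print(time, round(prob, 3))
```

In practice, `kaplan_meier` would be called once per tertile, and the resulting curves compared between the low, medium, and high TC-PGS groups.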

Global Lipids Genetics Consortium
Summary data were matched with the target cohorts. After matching, the proportion of non-ambiguous SNPs present in each cohort was 86.4, 89.7 and 91.3% for PREVENT-AD, ADNI and ROSMAP, respectively. After filtering SNPs not present in all the data sets, 1,653,356 SNPs remained, representing 67.6% of the original number of summary data SNPs (see Supplementary Figure S1 for Manhattan plots of included and excluded SNPs).

Amount of Variance Explained in TC Blood Levels by TC-PGS
To establish a TC-PGS that best associates with blood TC levels, various p-value cut-offs were investigated in the PREVENT-AD and ADNI cohorts (Figure 1). The different scores were first evaluated in all individuals, correcting for covariates as well as statin use and sex (Figure 1, left-hand panel, circles). At best, the TC-PGS explained 6.9% of the variance in PREVENT-AD (p = 2.93 × 10⁻⁸, p-value cut-off 1 × 10⁻⁶) and 4.1% in ADNI (p = 7.1 × 10⁻⁶, p-value cut-off 0.01). Stratification on statin use (Figure 1, left-hand panel, triangles) revealed strong associations in statin-free individuals in both cohorts, increasing the variance explained to 13.5% in PREVENT-AD (p = 2.83 × 10⁻⁹, p-value cut-off 1 × 10⁻⁶) and 7.1% in ADNI (p = 1.4 × 10⁻⁴, p-value cut-off 1 × 10⁻⁷). In contrast, the scores in general performed poorly in statin users, with none of the scores significantly associated with TC levels in PREVENT-AD (p's ≥ 0.412) and the best score in ADNI explaining 5.2% of the variance (p = 7.2 × 10⁻⁴, p-value cut-off 1 × 10⁻³⁰).
In either cohort, no association between the TC-PGSs and TC levels could be found in statin-free males (p's > 0.05). Based on its performance in the younger, combined PREVENT-AD cohort, the TC-PGS with a p-value cut-off of 1 × 10⁻⁶ was selected for further analyses and will from here on be referred to solely as the "TC-PGS".
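To make the role of the p-value cut-off concrete, the following Python sketch computes a weighted score as the sum of effect-allele counts multiplied by GWAS effect sizes, restricted to SNPs below a chosen p-value threshold. It is a simplified illustration under stated assumptions: the function name, data layout, SNP identifiers, and numbers are hypothetical, a strict inequality is assumed for the threshold, and the clumping step used in the study is omitted.

```python
def polygenic_score(dosages, gwas, p_threshold):
    """Weighted sum of effect-allele dosages over SNPs with p < p_threshold.

    dosages: dict SNP id -> effect-allele count (0-2) for one individual
    gwas: dict SNP id -> (effect size beta, GWAS p-value)
    """
    score = 0.0
    for snp, dosage in dosages.items():
        if snp not in gwas:
            continue  # SNP absent from the summary statistics
        beta, p_value = gwas[snp]
        if p_value < p_threshold:
            score += beta * dosage
    return score


# Hypothetical summary statistics and one individual's genotypes.
gwas = {
    "snp_a": (0.30, 1e-40),
    "snp_b": (0.05, 1e-7),
    "snp_c": (-0.02, 3e-3),
}
dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 1, "snp_d": 2}

strict = polygenic_score(dosages, gwas, p_threshold=1e-6)
lenient = polygenic_score(dosages, gwas, p_threshold=0.01)
print(round(strict, 3), round(lenient, 3))
```

A stricter threshold keeps only the strongest signals, whereas a more lenient one admits many low-effect loci; which setting explains the most variance is an empirical question, as the cohort differences reported above show.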

TC-PGS Predicts Hypercholesterolemia
Next, we examined the TC-PGS's ability to predict hypercholesterolemia in PREVENT-AD and ADNI (Figure 2). Receiver operator characteristics (ROC) curve analysis in PREVENT-AD revealed a significant improvement in hypercholesterolemia prediction in females with the addition of the TC-PGS to the model (area under the curve (AUC) increase from 70.8 to 80.5%, p = 0.0042) but no effect in males (AUC 74.0 vs. 74.1% in model without and with TC-PGS, respectively, p = 0.91). In ADNI, although adding the TC-PGS increased the AUC values for both females (65.2 vs. 71.3%, p = 0.14) and males (65.3 vs. 70.7%, p = 0.087), these increases did not reach significance.

TC-PGS Does Not Associate with Amyloid Pathology
The effect of TC-PGS on Aβ pathology was assessed in PREVENT-AD and ADNI (Figure 3). In PREVENT-AD, linear regressions correcting for covariates and with a sex*TC-PGS interaction term revealed no effect of the TC-PGS, either as part of the interaction term or as a main effect, on CSF Aβ …
Due to not being normally distributed, the Aβ PET data were analyzed with a robust regression and by linear regression after stratifying for Aβ(+) status. We found no effect of the TC-PGS in the combined cohort or after stratification on Aβ(+) status (combined … = 0.898, p_main = 0.510, t_int(16, …

TC-PGS Does Not Associate with TAU Pathology
The TC-PGS was evaluated for associations with biomarkers of TAU pathology in PREVENT-AD (CSF and PET) and ADNI (CSF, Figure 4). In PREVENT-AD, linear regressions corrected for covariates and with a sex*TC-PGS interaction revealed no associations between TC-PGS and biomarkers of TAU pathology as assessed by CSF …

TC-PGS Does Not Associate with Markers of Neurodegeneration
The TC-PGS was evaluated for associations with biomarkers of neurodegeneration in PREVENT-AD and ADNI by measuring levels of CSF t-Tau (Figure 5). We found no evidence for an association of the TC-PGS with CSF t-Tau …

TC-PGS Does Not Associate with Increased Risk of Becoming Aβ(+)
The association between TC-PGS and risk of AD, defined as being Aβ(+), was evaluated in all three target cohorts (Table 2).
Individuals in ADNI and ROSMAP were categorized based on the presence of Aβ pathology in the brain as either Aβ(−) or Aβ(+) (see Methods section for classification). We did not find any significant effect of TC-PGS on the risk of becoming Aβ(+) in PREVENT-AD, ADNI, or ROSMAP. Stratification by statin use and sex did not lead to any significant association either.

TC-PGS Does Not Associate with Cognition in Aβ(+) Individuals
Finally, we evaluated whether the TC-PGS is associated with the risk of becoming cognitively impaired in ADNI and ROSMAP (Table 3). For this analysis, we used the subset of individuals that were Aβ(+) and defined cognitive impairment (CI) as having any diagnosis of CI (i.e., mild cognitive impairment (MCI), AD, or other dementias) at the last recorded visit. In neither ADNI nor ROSMAP could we detect any significant association between the TC-PGS and the risk of becoming cognitively impaired. Stratification by sex did not lead to any significant associations. We also evaluated whether the TC-PGS had any effect on the conversion rate (ADNI, Figure 6A) or the age of onset (ROSMAP, Figure 6B). Aβ(+) individuals, either non-CI or with an MCI diagnosis at baseline, were selected as a subset of ADNI. The "event" was defined as receiving a clinical diagnosis of AD. Survival analysis revealed no difference between TC-PGS tertiles in conversion rate in ADNI (χ²(2) = 1.1, p = 0.6). In ROSMAP, we examined the association between TC-PGS tertiles and age at onset of a clinical diagnosis of possible or probable AD; however, we found no difference between the tertile groups (χ²(2) = 0.2, p = 0.9).

TC-PGS and TC Levels
In this study, we have created a TC-PGS that associates with blood TC levels in two AD-related cohorts. We show that the variability explained by the score depends on the cohort, selection of SNPs to include in the score, statin use, and sex. We used clumping and p-value thresholding as a method for pruning SNPs to include in the scores, and thus evaluated a number of p-value thresholds in both PREVENT-AD and ADNI. We found that the score that explained most of the variance varied between the two cohorts and depended on statin use and sex stratification. For instance, a p-value threshold of 1 × 10⁻⁶ performed best in PREVENT-AD, while a threshold of 0.01 performed best in ADNI in the non-stratified analyses. In addition, the scores in general performed better in PREVENT-AD than in ADNI (Figure 1). Further, stratification on both statin use and sex had a remarkable effect on the scores' performance in PREVENT-AD, and less so in ADNI. For example, in PREVENT-AD, the TC-PGSs had significant associations in statin-free females, with no significant associations in statin-treated individuals and males.
Similarly, examining the predictive ability of the TC-PGS on hypercholesterolemia revealed a significant improvement in PREVENT-AD females after the addition of the TC-PGS to the model, increasing the AUC from 0.708 to 0.805, but not in males (Figure 2). In ADNI, we detected similar trends for improved AUCs in females and males, but these did not reach significance (p's > 0.08).
The discrepancies between cohorts could possibly be due to the differences between PREVENT-AD and ADNI (Table 1), which differ in the proportion of females, APOE-ε4 carriers, and statin users, as well as in age. With sex and statin stratifications, we see that results do become more similar, further supporting the importance of taking these factors into account. Another factor that could affect the associations is that cholesterol measurements were taken after fasting in ADNI, whereas in PREVENT-AD, non-fasted samples were used, although studies have shown that TC levels are little influenced by fasting conditions [60,61].
The APOE gene locus is one of the most important for TC levels. The top SNP in the results from Willer et al. [24] is indeed rs7412, the SNP that, together with rs429358, determines the APOE-ε4 genotype. Its C allele results in either the ε3 or the ε4 allele (as opposed to the ε2 allele) and associates with increased TC levels (β = 0.374, p = 1.560 × 10⁻²⁸³). Rs429358 further determines the ε4 allele and is not present in the summary data. Nevertheless, rs429358 has been shown to associate with TC levels in other large GWASs [62,63], such that the C allele, which results in the ε4 allele, associates with increased TC levels.
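How the two SNPs combine into the common APOE alleles can be illustrated with a short Python lookup. The C alleles follow the description above; the T alleles as the alternative at both positions, and the function names, are assumptions added for illustration (the rare rs429358-C/rs7412-T combination is simply reported as "other").

```python
# (rs429358 base, rs7412 base) on one haplotype -> APOE allele
APOE_HAPLOTYPES = {
    ("T", "T"): "ε2",
    ("T", "C"): "ε3",
    ("C", "C"): "ε4",
}


def apoe_allele(rs429358, rs7412):
    """Map one haplotype (rs429358 base, rs7412 base) to an APOE allele."""
    return APOE_HAPLOTYPES.get((rs429358.upper(), rs7412.upper()), "other")


def carries_e4(haplotype_a, haplotype_b):
    """True if at least one of the two haplotypes encodes the ε4 allele."""
    return "ε4" in (apoe_allele(*haplotype_a), apoe_allele(*haplotype_b))


print(apoe_allele("T", "C"))                 # ε3
print(carries_e4(("T", "C"), ("C", "C")))    # ε3/ε4 individual
```

Because rs429358 is absent from the summary data, a score built from those data can capture only the rs7412 part of this lookup, that is, the contrast between ε2 and the other alleles.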
It should be noted that the two cohorts also differ in terms of age; ADNI being on average 10 years older than PREVENT-AD. TC levels increase from early life over midlife to late-life [64], however, it appears to be decreasing with age in older adults above 70 years [18,65]. This altered metabolism of cholesterol with age possibly involves different sets of genes and could thus explain why the TC-PGS behave differently in the two differently aged cohorts. This hypothesis, however, need to be further investigated using either longitudinal studies or cross-sectional studies covering a bigger range of ages. Considering that increased midlife levels of TC [3][4][5]8] are associated with increased risk of AD, it is interesting that our TC-PGS performs better in PREVENT-AD, which is closer to midlife than ADNI, thus suggesting it is better capturing midlife than late-life cholesterol levels.
The interaction between age and sex is of interest. For example, menopause in women is associated with increased TC levels and risk of cardiovascular disease [66,67], and hormone replacement therapy has been shown to decrease TC levels [68]. PREVENT-AD participants are younger and include a higher percentage of females compared to ADNI participants, and one could thus hypothesize that discrepancies in TC metabolism are also influenced by differences in the proportions of individuals who underwent menopause and treatment thereof.
Finally, compared to the study by Proitsi [28], our results show that the variance explained in blood TC levels by a TC-PGS can be vastly improved (3.6% in [28] vs. 18.2% for p-value cut-off 1 × 10−6 in statin-free females in PREVENT-AD) by considering statin use and sex in the model.

TC-PGS and AD
In contrast to the significant associations between the TC-PGS and blood TC levels, the TC-PGS showed no associations with any biomarkers of AD pathologies (Figures 3 and 4), neurodegeneration (Figure 5), or cognition (Figure 6, Table 3). Similarly, the TC-PGS did not associate with the risk of becoming Aβ(+) (Table 2), whether stratified by sex or statin use.
The relationship between vascular factors and AD biomarkers was recently assessed in PREVENT-AD and showed that vascular factors, including TC levels, associate with increased Aβ pathology, but only in individuals free of vascular medication, which includes statins [13]. In contrast to the current study, where individuals were grouped based on statin use only, Köbe et al. included other medications relevant to cardiovascular disease (drugs against hypercholesterolemia and hypertension). Although sample size was an issue in PREVENT-AD, in ADNI we had a sufficient sample size to include statin use and sex as interaction terms. Nevertheless, we could not find evidence for any association between the TC-PGS and AD biomarkers in either cohort, possibly indicating that further vascular medications, rather than just statin use, need to be formally considered.
It is also possible that there is an additive effect of vascular risk factors such that the TC-PGS alone is not sufficient to have an effect on AD. Kivipelto et al. showed in multiple studies that there is an additive effect of TC levels, blood pressure, and APOE-ε4 [6][7][8], leading to the development of the cardiovascular risk factors, aging, and dementia (CAIDE) score [69]. This score takes into account age, sex, education, systolic blood pressure, body mass index, cholesterol, physical activity, and APOE-ε4 status and has been validated as a predictor for AD [70]. Similarly, vascular burden scores, taking into account factors such as hyperlipidemia, diabetes, and hypertension, are associated with impaired executive function and lower the threshold of amyloid burden needed to result in cognitive impairment [71]. This raises the important issue that concomitant vascular pathology may have severely confounded previous studies that established the link between midlife total cholesterol and late-life AD risk. Of note, low education is associated with worse lipid profiles in women and better lipid profiles in men, the subgroup most susceptible to developing AD with aging. The percentages of intra-individual biological variability of total cholesterol, LDL, and HDL do not exceed 9% in the normal population [72]. Complementary studies are now required to better understand the possible interplay between genetics and education pathways, as both may modulate AD risk in the elderly population, where socioeconomic inequalities are quite common.
As mentioned above, APOE is important both for TC levels and AD risk. In this study design, we decided to keep the APOE gene locus in the TC-PGS but to correct for APOE-ε4 status in each regression model. Thus, the associations between the TC-PGS and TC levels are in addition to any effect of APOE-ε4 status. Similarly, the lack of association between the TC-PGS and AD is after correcting for APOE-ε4 status. It is thus possible that the increased risk of AD seen in APOE-ε4 carriers is actually mediated in large part by independent processes found both in the periphery and in the CNS. For example, our group reported a surprisingly strong association between CSF concentrations of apolipoprotein B (apoB) and phospho(181)-tau in the pre-symptomatic phase of the disease in elderly subjects who are "at-risk" of AD because of a parental history. ApoB-containing lipoproteins such as LDL and VLDL have been associated with vascular or mixed dementia [73], in contrast to total cholesterol, which is the one clearly associated with AD risk. The observed apoB/phospho-tau association in pre-symptomatic AD is markedly modulated by the presence of the APOE-ε4 allele but not by the passage of peripheral apoB into the CNS [74], supporting the notion that total cholesterol could act more as a surrogate biomarker for APOE-ε4-mediated effects than as a direct player in the pathophysiological process. This would be consistent with the above results showing that genetic variants, other than those resulting in the APOE-ε4 isoform, strongly correlate with TC levels but fail to associate with AD pathology.
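As an illustration of the adjustment described above, the following Python sketch computes a weighted polygenic score and compares the variance in TC explained by a model containing APOE-ε4 status alone with that of a model that also contains the score. It is a minimal sketch under stated assumptions and not the analysis pipeline used in this study: the function names, the toy dosages, all SNP weights other than the reported β = 0.374, and the noise-free simulated TC values are hypothetical, and other covariates used in practice are omitted for brevity.

```python
def weighted_pgs(dosages, weights):
    """Weighted polygenic score: effect-allele dosages times GWAS effect sizes, summed."""
    return sum(d * w for d, w in zip(dosages, weights))


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(covariates, y):
    """R^2 of an ordinary least squares fit of y on the covariates plus an intercept."""
    X = [[1.0] + list(row) for row in covariates]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: three SNPs, six individuals.
# Only the first weight (0.374) is taken from the text; the others are made up.
weights = [0.374, 0.10, -0.05]
dosages = [[2, 1, 0], [2, 0, 1], [1, 2, 0], [2, 2, 2], [1, 0, 1], [2, 1, 1]]
apoe4 = [0, 1, 0, 1, 0, 1]  # APOE-e4 carrier status (assumed coding: 1 = carrier)

pgs = [weighted_pgs(d, weights) for d in dosages]
# Noise-free simulated TC so that the example is deterministic (assumption).
tc = [5.0 + 1.0 * s + 0.3 * a for s, a in zip(pgs, apoe4)]

r2_base = r_squared([[a] for a in apoe4], tc)                   # APOE-e4 only
r2_full = r_squared([[a, s] for a, s in zip(apoe4, pgs)], tc)   # APOE-e4 + score
incremental_r2 = r2_full - r2_base  # variance explained by the score beyond APOE-e4
print(f"base R2 = {r2_base:.3f}, full R2 = {r2_full:.3f}, "
      f"incremental R2 = {incremental_r2:.3f}")
```

Reporting the incremental R² rather than the full-model R² reflects the design choice in the text: the contribution of the score is measured in addition to any effect of APOE-ε4 status.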

Conclusions
In summary, we have created a TC-PGS that associates with TC levels and significantly improves the prediction of hypercholesterolemia, specifically in statin-free females with European ancestry. We could not, however, demonstrate any significant association with AD, either with its neuropathological underpinnings or with cognition. It is possible that explaining ~18% of the variance in blood TC levels is still not enough to find significant associations with AD. For example, while it has previously been shown that TC levels are associated with Aβ pathology in PREVENT-AD [13], the TC-PGS was not associated with Aβ pathology in the same cohort, which suggests that a bigger sample size may be needed. Furthermore, considering that there is an additive effect of vascular risk factors on AD, it is still possible that the TC-PGS could have an effect on AD in individuals at higher cardiovascular risk (e.g., APOE-ε4 carriers). Further research is warranted to establish the role of a TC-PGS in AD.
Author Contributions: N.I.V.N. was involved in the study design, analysis, results interpretation, and writing of the manuscript. J.P., P.-F.M. and C.P. contributed to data analysis. A.L., D.A., P.-F.M., T.K., S.V. and J.P. were involved in the data acquisition and aspects of the study design and critically revised the manuscript. All authors have read and agreed to the published version of the manuscript.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study and written informed consent has been obtained from the patient(s) to publish this article.
Data Availability Statement: PREVENT-AD data is available upon request to the authors via the research centre's website: StoP-AD Centre (prevent-alzheimer.net, accessed on 26 January 2018). For all other datasets, please refer to the Material and Methods Section. Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu, accessed on 3 December 2015). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at: adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf (accessed on 22 June 2018).