Mendelian Randomisation Confirms the Role of Y-Chromosome Loss in Alzheimer’s Disease Aetiopathogenesis in Men

Mosaic loss of chromosome Y (mLOY) is a common ageing-related somatic event and has been previously associated with Alzheimer’s disease (AD). However, mLOY estimation from genotype microarray data only reflects the mLOY degree of subjects at the moment of DNA sampling. Therefore, mLOY phenotype associations with AD can be severely age-confounded in the context of genome-wide association studies. Here, we applied Mendelian randomisation to construct an age-independent mLOY polygenic risk score (mloy-PRS) using 114 autosomal variants. The mloy-PRS instrument was associated with an 80% increase in mLOY risk per standard deviation unit (p = 4.22 × 10−20) and was orthogonal with age. We found that a higher genetic risk for mLOY was associated with faster progression to AD in men with mild cognitive impairment (hazard ratio (HR) = 1.23, p = 0.01). Importantly, mloy-PRS had no effect on AD conversion or risk in the female group, suggesting that these associations are caused by the inherent loss of the Y chromosome. Additionally, the blood mLOY phenotype in men was associated with increased cerebrospinal fluid levels of total tau and phosphorylated tau181 in subjects with mild cognitive impairment and dementia. Our results strongly suggest that mLOY is involved in AD pathogenesis.


Introduction
Alzheimer's disease (AD) is the leading cause of dementia worldwide, accounting for 60-80% of total cases [1]. While Mendelian inheritance is suspected to cause early-onset AD (<65 years) [2], late-onset AD (LOAD, >65 years) is a complex, multifactorial disease influenced by both genetic factors and life exposures. The genetic contribution to LOAD is estimated to be 60-80% [3], with APOE being the most prominent locus discovered to date [4]. However, demographic features also play a predominant role in AD. Notably, age is considered the most important risk factor for LOAD [5], and women represent nearly two-thirds of the global population with AD [1], showing higher rates of cognitive decline [6,7] than men. However, whether sex should be considered a risk factor for AD or rather a source of disease heterogeneity is a matter of intense debate [8]. Recent reviews have highlighted the importance of reporting results for sex interactions and sex-stratified AD data instead of the more widely used approach of adjusting data by sex [9]. These approaches may help elucidate differences in sex-specific AD risk profiles, which will be of great value in the coming age of precision medicine.
The male-specific region of chromosome Y is one of the most unexplored regions of the human genome, and it has long been considered a genetic wasteland. Mosaic loss of chromosome Y (mLOY) in blood cells is the most common known form of somatic mosaicism in humans [10-12]. Genetic factors together with age, smoking, and other environmental stressors are well-known risk factors for mLOY [13]. Genetic variants associated with mLOY risk are mainly related to mitotic processes, cell cycle regulation, DNA damage sensing and response, and apoptotic processes [14]. mLOY was initially considered a phenotypically innocuous, age-related trait [15-18]. However, there is increasing evidence that mLOY in blood cells has a direct effect on the aetiopathogenesis of several diseases affecting different tissues. Specifically, blood cell mLOY has been associated with susceptibility to multiple ageing-related diseases, including AD [19], non-haematological cancer [10,20], and cardiovascular diseases [21,22], as well as with all-cause mortality risk [10]. The main proposed mechanism to explain blood mLOY pathogenesis is impairment of immune functions caused by the loss of the Y chromosome in leucocytes [23-25]. However, autosomal genetic predisposition for mLOY has also been associated with breast cancer in women, indicating that the underlying genomic instability can also explain the associations between mLOY and disease risk [14].
Here, we aimed to study the impact of mLOY on AD risk in the GR@ACE and Dementia Genetics Spanish Consortium (DEGESCO) cohorts [26,27]. First, we checked for blood mLOY associations with AD in a case-control setting and in the phenoconversion process from mild cognitive impairment (MCI) to all-cause dementia and AD. Subsequently, to remove age-confounding effects, we generated an autosomal, age-independent mLOY polygenic risk score (mloy-PRS) and analysed its effect on AD status and progression in both sexes. Finally, we analysed the impact of mLOY in different AD-related biomarkers in cerebrospinal fluid (CSF).

Results
We calculated the mean log R ratios of probes in the X (LRR-X) and Y (LRR-Y) chromosomes to check the sex chromosome dosages of 7954 clinically reported male samples in the GR@ACE-DEGESCO cohort (Figure S1). We detected one individual with a gain of chromosome Y (GOY, XYY), compatible with a supermale syndrome, and three individuals with Klinefelter syndrome (XXY). We removed women (XX), GOY, Klinefelter individuals, and outliers prior to mLOY computation. For the 7843 remaining XY individuals, we removed second-degree or closer relatives as well as samples with a low genotype call rate (≤0.97) or excess heterozygosity (>3 standard deviations [SD] over the mean heterozygosity of the cohort). We ran principal component analysis to identify the population structure and removed 72 individuals of non-European ancestry (>6 SD from the 1000 Genomes European population mean). We also excluded subjects with detectable autosomal chromosomopathies (e.g., Down's syndrome). After applying these exclusion criteria, we split the remaining 6955 male samples into two randomised batches and calculated the mean LRR of probes found at the male-specific region of chromosome Y (mLRR-Y). We used the mLRR-Y thres method of the MADloy R package [28] to call the mLOY status. We did not detect a batch effect due to cohort splitting (Figure S2). We excluded 12 additional samples with LRR SD > 0.46. Finally, we plotted mLRR-Y and pseudoautosomal region 1 B-deviation (PAR1-Bdev) values to identify and remove individuals with detectable anomalies in chromosome Y (i.e., partial loss of chromosome Y) or loss of heterozygosity in the PAR1 region (Figure S1). Quality control (QC) and filtering steps for analysis of mLOY phenotypes and mloy-PRS are summarised in Figure S3.
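As an illustration of the sample-level filters described above, the following is a minimal sketch of the call-rate and heterozygosity exclusions. The thresholds (call rate ≤0.97, heterozygosity more than 3 SD over the cohort mean) come from the text; the function name, the record layout, and the choice to compute the heterozygosity mean and SD over all input samples are assumptions made for this example and do not reproduce the actual pipeline.

```python
from statistics import mean, stdev


def filter_samples(samples, min_call_rate=0.97, het_sd_limit=3.0):
    """Return the IDs of samples that pass call-rate and heterozygosity QC.

    `samples` is a list of dicts with the hypothetical keys
    "id", "call_rate" and "het" (heterozygosity rate).
    """
    het_values = [s["het"] for s in samples]
    # Assumption: mean and SD are taken over all input samples.
    het_upper = mean(het_values) + het_sd_limit * stdev(het_values)

    kept = []
    for s in samples:
        low_call_rate = s["call_rate"] <= min_call_rate  # excluded if <= 0.97
        excess_het = s["het"] > het_upper                # excluded if > mean + 3 SD
        if not (low_call_rate or excess_het):
            kept.append(s["id"])
    return kept
```

Note that with very small groups no sample can exceed the mean by 3 SD, so a filter of this kind is only meaningful when applied to a full cohort.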
For an initial glimpse at our data, we plotted mLRR-Y values of all AD cases and controls with respect to age (Figure 1). The first thing that became apparent was that our control population is significantly younger than the AD population. Additionally, mLOY occurrence before 65 years was a very rare event in our cohort, indicating that our control population below this age threshold may not be representative for assessing the effect of mLOY on AD. Moreover, our control population mostly lacked individuals older than 85 years. Consequently, we decided to establish a 65-85-year age window to analyse the effect of mLOY on AD. This matches the usual age at onset range for preclinical, prodromal, and mild dementia stages for LOAD in our population [29] and helped reduce the age gap between our case and control groups (Table S1). Concordant with previous reports, we observed a clear age-related increase in mLOY events in the older individuals (Figure 1). Age was associated with mLOY occurrence in men aged 65-85 years, with an estimated 1% increase in the chance of developing LOY every year (p = 3.50 × 10−11). Then, we assessed whether continuous mLRR-Y values were differentially distributed among the cases and controls. Both the unadjusted Kolmogorov-Smirnov test (D = 0.18124; p < 2.2 × 10−16) and analysis of covariance (ANCOVA) adjusted by age at DNA sampling and APOE genotype (F = 68.0, p = 2.54 × 10−16, Table S2) yielded highly significant results in the models including all available men in our cohort.
Next, we fitted logistic regressions for AD status using mLOY calls and mLRR-Y, defining three experimental setups: (a) all men with available age at DNA sampling, (b) 65-85-year-old men, and (c) 65-85-year-old men divided into age groups (65-70, 70-75, 75-80, and 80-85 years). We adjusted logistic regressions by age at blood sampling, APOE genotype, and relevant principal components (PCs) (Table S3).
We found that the continuous mLRR-Y variable was associated with AD in the group including all men (N = 2697, odds ratio (OR) = 2.74, p = 0.01), indicating that AD cases had an increased degree of LOY mosaicism compared with controls. We also observed increased mLOY levels in AD men in the 65-85 (N = 1944, OR = 2.19, p = 0.09) and age-stratified groups with respect to controls, but these differences were not statistically significant (Table 1). Importantly, we noticed that the significant effect observed in the model including all available men could, at least partially, be driven by the dramatic age differences between AD cases and controls in our cohort (Figure 1 and Table S1) even after adjusting by age. mLOY calls were not significantly associated with AD in the group including all men (N = 2697, OR = 1.14, p = 0.35), the 65-85-year-old group (N = 1944, OR = 1.04, p = 0.81), or in the age-stratified groups (Table 1).

Table 1. Logistic regression results using mLRR-Y and mLOY calls as predictors for AD status. We defined three experimental setups: (a) all men with available age at DNA sampling, (b) 65-85-year-old men, and (c) a stratification of 65-85-year-old men into age groups (65-70, 70-75, 75-80, and 80-85 years old). We adjusted models by age at blood sampling, APOE genotype, and PCs. In the age-stratified models, only the effect of mLRR-Y or mLOY is displayed. The 95% confidence interval is presented as the 2.5% quantile (CI2.5) and the 97.5% quantile (CI97.5). OR, odds ratio; SE, standard error.

To check for an effect of mLOY on risk of conversion to all-cause dementia and AD, we fitted Cox proportional-hazards models adjusted by age at sampling and APOE genotype in our prospective cohort of men with MCI (N = 400). The continuous mLRR-Y variable had a non-significant risk effect in MCI conversion to all-cause dementia (hazard ratio (HR) = 1.93; p = 0.10). The effect size increased when we calculated the model exclusively using conversion to AD but did not reach statistical significance (HR = 2.05, p = 0.19). mLOY calls also showed similar but smaller non-significant positive effects for conversion to dementia (HR = 1.17, p = 0.40) and AD (HR = 1.38, p = 0.20, Figure 2). The Cox model results are summarised in Table 2.

Because the impact of age on mLOY and AD might obscure genuine associations between both phenotypes (mLOY and AD), we decided to construct an mLOY polygenic risk score (mloy-PRS) to evaluate the impact of the genetic variance associated with the mLOY phenotype in AD risk. Our rationale was to implement a Mendelian randomisation strategy reasoning that, if blood cell mLOY is genuinely associated with AD, the genetic factors linked to blood mLOY risk should also be associated with AD and its related endophenotypes. To this end, we generated the mloy-PRS instrument based on a list of autosomal genome-wide significant single-nucleotide polymorphisms (SNPs) associated with the mLOY phenotype identified in a recent genome-wide association study (Table S4) [14].
To benchmark the constructed PRS, we first validated the effect of mloy-PRS on the mLOY phenotype in our cohort (65-85 years old). We fitted a logistic regression for mLOY calls with PRS, age at DNA sampling, and APOE genotype as predictors. mloy-PRS (OR = 1.80, p = 4.22 × 10−20) and age at sampling (OR = 1.08, p = 5.07 × 10−11) but not APOE (OR = 0.88, p = 0.15) were significantly associated with mLOY in our population (Table S5). Importantly, mloy-PRS was orthogonal with age and evenly distributed across the age spectrum (Figure S4). Our results corroborate the validity of mloy-PRS as a Mendelian randomisation instrument for investigating the causal role of mLOY in AD and its endophenotypes and independently confirm the combined risk effect of previously reported loci in the mLOY phenotype [14].
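A weighted polygenic score of this kind is commonly computed as the sum of risk-allele dosages multiplied by per-variant effect sizes, then standardised so that effects can be read per SD unit. The sketch below illustrates that generic construction under stated assumptions: the function names, the variant identifiers, and the use of the sample SD for standardisation are hypothetical and are not taken from the study's actual implementation.

```python
from statistics import mean, stdev


def weighted_prs(dosages, weights):
    """Sum of effect-allele dosages (0-2) weighted by per-variant effect sizes.

    `weights` maps variant ID -> effect size (e.g. log odds ratio);
    `dosages` maps variant ID -> effect-allele dosage for one individual.
    Every weighted variant must have a dosage (no imputation is done here).
    """
    missing = [snp for snp in weights if snp not in dosages]
    if missing:
        raise KeyError(f"missing dosages for: {missing}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


def standardise(scores):
    """Convert raw scores to z-scores so effects are expressed per SD unit."""
    m, sd = mean(scores), stdev(scores)
    return [(x - m) / sd for x in scores]


# Toy usage with two made-up variants (identifiers are placeholders).
toy_weights = {"variant_a": 0.1, "variant_b": 0.2}
toy_cohort = [
    {"variant_a": 0, "variant_b": 1},
    {"variant_a": 2, "variant_b": 1},
    {"variant_a": 1, "variant_b": 2},
]
raw = [weighted_prs(person, toy_weights) for person in toy_cohort]
z_scores = standardise(raw)
```

With a standardised score, an odds ratio from a logistic regression is interpreted per SD increase in genetic risk, which is how the mloy-PRS effect is reported here.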
In the case-control setup, we checked the effect of mloy-PRS on AD risk by fitting logistic regressions adjusted by APOE genotype, age, and relevant PCs (Table S6). Interestingly, the effect of mloy-PRS on AD could also be measured in the female samples. Therefore, we established three analysis groups: all (men + women), men only, and women only. We found no association between mloy-PRS and AD in the group including both sexes (Table 3). However, after sex stratification, we found a weak, non-significant, positive effect of mloy-PRS with respect to AD in the male subgroup (N = 2471, OR = 1.07, p = 0.12), while the effect was mostly neutral in the female subset (N = 4978, OR = 1.00, p = 0.93). Next, we assessed the effect of mloy-PRS in disease progression. We adjusted Cox models by age, APOE genotype, and cohort ascertainment. We found a male-specific positive effect of mloy-PRS in the disease progression models (N = 682) (Table 3 and Figure S5), with a suggestive signal for MCI-to-dementia progression (HR = 1.11, p = 0.08) and a significant risk effect for MCI-to-AD progression (HR = 1.23, p = 0.01). Of note, we found no association between mloy-PRS and conversion to all-cause dementia (HR = 0.99, p = 0.81) or AD (HR = 0.99, p = 0.85) in the female group (N = 1082).

Table 3. Association results for mloy-PRS. (a) Results of logistic regressions for case-control AD. We only included individuals aged 65-85 years in the models. We adjusted the models by APOE genotype, age, and principal components. (b) Results of joint analysis of prospective MCIs in the GR@ACE-DEGESCO and EADB-DEGESCO cohorts using Cox proportional-hazards models for progression from MCI to all-cause dementia or AD. We adjusted the models by APOE genotype, age, and cohort ascertainment. The 95% confidence interval is presented as the 2.5% quantile (CI2.5) and the 97.5% quantile (CI97.5). OR, odds ratio; SE, standard error.
Following these results, we proceeded to examine the existence of associations between mLRR-Y and the levels of core AD biomarkers in CSF: Abeta-42, phosphorylated tau 181 (p-tau), and total tau. We only kept individuals aged 65-85 years at the moment of the lumbar puncture (LP), and we excluded those with a gap of >5 years between DNA sampling and the LP (N = 214). We adjusted linear regressions by APOE genotype, age at LP, and the time window between blood sampling and LP. To account for the effect of syndromic status on the levels of Abeta-42, p-tau, and total tau, we calculated the effect of mLRR-Y in two groups (MCI N = 148; dementia N = 66) and then performed an inverse-variance weighted fixed-effect meta-analysis. We found that both p-tau (β = 41.92, p = 0.01) and total tau (β = 396.69; p = 0.004) levels were increased in individuals with a higher degree of mLOY (Figure 3).

Figure 3 (caption only partially recovered). (d-g) Association of CSF proteins in the Olink inflammation and neurology panels with (d) mLRR-Y, (e) mLRR-Y adjusted by total tau, (f) APOE genotype, and (g) total tau. (h-k) QQ plots obtained in the models for (h) mLRR-Y, (i) mLRR-Y adjusted by total tau, (j) APOE genotype, and (k) total tau. We adjusted the models by age, the time window between CSF and DNA sampling, and APOE genotype.
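For readers unfamiliar with the inverse-variance weighted fixed-effect meta-analysis used to combine the MCI and dementia group estimates for the CSF biomarkers, the following is a minimal sketch of the standard estimator. Each group estimate is weighted by the inverse of its squared standard error, and a two-sided p-value is obtained from a normal approximation. The function name and the input values in the usage line are illustrative assumptions and are not the study's data.

```python
from math import sqrt
from statistics import NormalDist


def ivw_fixed_effect(betas, ses):
    """Inverse-variance weighted fixed-effect pooling of per-group estimates.

    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]           # w_i = 1 / SE_i^2
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))              # SE = sqrt(1 / sum(w_i))
    z = pooled_beta / pooled_se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # normal approximation
    return pooled_beta, pooled_se, p_value


# Hypothetical per-group estimates (not taken from the study).
beta, se, p = ivw_fixed_effect(betas=[1.0, 3.0], ses=[1.0, 1.0])
```

Because the weights depend only on precision, the larger group contributes more to the pooled estimate whenever its standard error is smaller.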
Next, we checked for mLRR-Y associations with proteomics data obtained with the Olink ProSeek® multiplex immunoassay for paired plasma and CSF samples in 135 men with MCI. Because mLOY is known to affect the immune system [10,24,30], and inflammation is involved in many processes related to AD pathogenesis [31], we analysed Olink neurology and inflammation panels. We detected inflation in our models (λ = 1.86), with most proteins showing increased levels in the CSF of individuals with a higher degree of blood mLOY (Figure 3). We observed a similar pattern when we analysed the effect of APOE genotype and total tau levels, with a large fraction of the proteins showing increased CSF levels in individuals carrying APOE risk alleles or displaying higher tau levels, respectively (Figure 3). Moreover, after adjusting our models by total tau, we lost most CSF associations, and the inflation factor was drastically reduced to λ = 0.86 (Figure 3). After adjusting for total tau, we found seven nominally significant markers in plasma and one nominally significant marker in CSF. Nevertheless, no proteins passed false-discovery rate (FDR) correction, suggesting that most mLOY associations can be explained by the previously observed correlation between mLOY and tau levels. Summary statistics for association of mLRR-Y with the CSF and plasma proteins are available (Tables S7-S10).
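The text does not state how the inflation factor λ was computed. One common definition, sketched below as an assumption, converts each two-sided p-value into a 1-degree-of-freedom chi-square statistic and divides the median observed statistic by the median expected under the null. Under that definition, λ close to 1 indicates no inflation, and values well above 1 indicate that many tests show stronger association than expected by chance.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square distribution: the squared 75th percentile
# of the standard normal (approximately 0.455).
_CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def inflation_factor(p_values):
    """Inflation factor (lambda) from two-sided p-values in (0, 1]."""
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)  # |z| for a two-sided p-value
        chi2_stats.append(z ** 2)               # 1-df chi-square statistic
    return median(chi2_stats) / _CHI2_1DF_MEDIAN
```

A set of p-values whose median is 0.5 gives λ = 1 under this definition, whereas a set dominated by small p-values gives λ > 1.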

Discussion
In the present study, we found that MCI men with high genetic risk of developing mLOY have increased chances of progressing to AD over time. The autosomal loci used to construct mloy-PRS had no effect on AD progression in the female subset of our cohort, strongly suggesting that the observed effect is produced via loss of the Y chromosome among men. Importantly, modelling mLOY through its associated genetic variance allowed us to observe mLOY-induced alterations in AD pathogenesis in an age-independent manner, an approach that is unparalleled in previous studies. These results add to previous evidence reporting mLOY as a male-specific AD pathogenic factor. mLOY is the most common known form of somatic mosaicism among men [10]. Concordantly, we detected mLOY in 18.9% of men aged 65-85 years in our cohort. Although classically considered to be a harmless age-related trait, recent studies have revealed that mLOY increases risk of all-cause mortality and several diseases [10,20-22]. With such a high prevalence in the older population, interest in determining the effect of mLOY in age-related diseases has increased over the past decade. Previous studies have reported that mLOY is associated with an increased risk and progression rate for AD [19]. A more recent publication claimed that extreme transcriptomic downregulation of chromosome Y decreases AD resilience in men [32]. However, whether mLOY acts as an AD-promoting factor or is just a by-product of ageing needs to be clearly established.
Consistent with previous studies [19], we found a higher degree of mLOY mosaicism (mLRR-Y) in our AD versus control population in the unadjusted Kolmogorov-Smirnov test (D = 0.18124, p < 2.2 × 10−16) and age-adjusted ANCOVA (F = 68.0, p = 2.54 × 10−16). We then performed a case-control logistic regression in all available men in our cohort, obtaining significant results (OR = 2.74; p = 0.01). Even though we adjusted for age, these results should be interpreted cautiously due to the dramatic age differences between the AD and control groups (Figure 1), as age is an important risk factor for both phenotypes. Thus, aiming to reduce age confounding in our models, we restricted analysis to men aged 65-85 years. However, this restriction did not completely correct the age gap between the groups and also reduced our sample size (Table S1). We found an increased degree of LOY mosaicism (mLRR-Y) in 65-85-year-old cases versus controls (OR = 2.19, p = 0.09), but statistical significance was not reached in the models.
Researchers have also found that mLOY increases the rate of AD conversion in MCI men [19]. We selected individuals recruited at the ACE Alzheimer Center Barcelona with an MCI diagnosis at the moment of sampling and available clinical follow-ups and fitted Cox proportional-hazards models. Even though the effect directions for mLOY phenotypes towards AD progression were positive (i.e., risk-increasing), the models were not significant (Figure 2 and Table 2). However, given our small sample size (N = 400), we may have lacked sufficient statistical power in this analysis. Remarkably, we noticed that the quantitative mLRR-Y variable performed better than mLOY calls in the case-control and disease progression models, implying that if the effects are genuine, the mLOY-induced increase in AD risk and progression may be proportional to the mosaic fraction of LOY cells in blood.
Due to the age-dependent nature of both mLOY and AD, controlling age confounding was very challenging in our cohort. For this reason, we checked mLOY causality in AD by creating an instrumental variable and conducting a Mendelian randomisation study. To this end, we generated an age-independent and sex-independent PRS, using 114 independent autosomal genetic variants (Table S4) previously associated with mLOY [14]. Of note, mloy-PRS successfully predicted mLOY events in our data and was not associated with age or APOE genotype (Figure S4 and Table S5). A recently published work found similar effect sizes of this PRS for predicting mLOY calls [33]. Therefore, analysis of mloy-PRS instead of mLOY phenotypes allowed us to overcome the main limitations of the study (age differences and sample size) by providing an age-independent mLOY instrument. This approach allowed us to increase the effective sample size in two ways: (a) by removing the need to restrict analysis to samples with available age at DNA sampling information and (b) by allowing us to introduce all individuals with MCI and subsequent clinical records in disease progression models instead of only those with an MCI diagnosis at the closest clinical evaluation to DNA sampling. Importantly, we found a male-specific, significant (HR = 1.23; p = 0.01) association between mloy-PRS and MCI phenoconversion to AD. Case-control models also reported positive (i.e., risk-increasing) effects of mloy-PRS on AD in the male subset (OR = 1.07, p = 0.12), but the models did not reach statistical significance even though our sample size was considerably larger in the case-control dataset (N men = 2471) than in the longitudinal, prospective MCI dataset (N men = 682). These results suggest that mLOY could be more involved in the MCI and early clinical stages of AD aetiopathogenesis than in the preclinical stages of the disease, namely AD risk. However, because the mloy-PRS only explains a fraction of the variance that causes mLOY, a larger sample size may be needed to reach sufficient statistical power to obtain more robust associations in the case-control models. Importantly, mloy-PRS effects were neutral in the groups including women (Table 3), implying that the observed effect of mloy-PRS on AD is unlikely to be driven by the same mechanisms that confer mLOY risk (increased genomic instability and impairment of DNA repair mechanisms) [14]. Instead, the observed effects are male specific and, therefore, more likely produced via loss of the Y chromosome exclusively in men.
One of the most commonly proposed mechanisms to explain blood cell LOY pathogenesis is the impairment of immune functions [10,14,19]. Interestingly, deregulation of the immune system is one of the hallmark features of AD [34], and genome-wide association studies are revealing an increasing number of genes related to immune functions [35]. LOY has been reported to deregulate the expression of approximately 500 autosomal transcripts in leucocytes [24]. Furthermore, levels of CD99, a cell surface protein involved in several key immune functions, such as leucocyte migration through the vascular endothelium, cell adhesion, and apoptosis [36,37], have been found to be significantly lowered in immune cells with LOY [24,30]. Thus, mLOY-induced alterations in the homeostasis and migration of leucocytes through the blood-brain barrier could explain the observed associations. Additional studies are necessary to corroborate our findings and to identify the potential mechanisms within mLOY that modify AD aetiopathogenesis. Of note, functional restoration of the lost Y-chromosome loci promoting aberrant clonal expansion or transcriptomic deregulation of LOY leucocytes could be an attractive therapeutic strategy to combat AD progression.
One strength of our study is that we modelled LOY through an age-independent PRS instead of just analysing the age-dependent mLOY phenotype and adjusting our data by age. In our opinion, this allowed a clearer and more robust approach for inferring causality between mLOY and AD. We also obtained an independent validation of our findings through AD-related biomarkers, with mLOY phenotypes associated with higher levels of total tau and p-tau in the CSF and displaying the proteomic neurodegenerative biochemical signature observed with other AD-related factors (Figure 3). Higher tau levels are associated with faster rates of cognitive decline [38], supporting the hypothesis that mLOY modulates disease progression. However, our work also faced several limitations: (a) the lack of age at sampling information for most controls and (b) a significantly younger control population compared with the AD population. Both of these limitations ultimately decreased our statistical power to find more robust mLOY-AD associations in the case-control models. We are planning to expand our analysis to additional European population cohorts, which may help us to determine whether mloy-PRS acts as a male-specific AD risk factor and to confirm the observed effects in disease progression.
We believe Mendelian randomisation analysis is key to confirming causality between mLOY and age-related diseases such as AD, cancer, or cardiovascular disease, where age can also act as a heavy confounder. As we have shown here, sex-stratified Mendelian randomisation can help elucidate whether associations of an mLOY instrument with the outcome are caused by pleiotropy or mediated by the loss of the Y chromosome. Thus, a significant association between an mLOY instrument and a specific outcome in a female sample would likely indicate pleiotropy, as the Y chromosome is absent in women, and the instrument acts as a proxy of genomic instability. For example, in a previous publication, the authors found that their mLOY PRS instrument was associated with breast cancer in women from the UK Biobank [14], arguing that these results were reasonable, as genomic instability is a known risk factor for cancer. However, if the association between the mLOY instrument and the outcome is found exclusively in men, or its effect size is significantly greater in men than in women (as both mechanisms could independently increase disease risk), then the Y chromosome loss mediates, at least partly, the observed effect. Encouraging authors to report sex-stratified GWAS summary statistics would open the door to sex-stratified two-sample Mendelian randomisation, which would be a powerful tool to determine the causality of mLOY in the aetiopathogenesis of diseases.
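Whether an instrument's effect is "significantly greater in men than in women" can be checked with a simple z-test on the difference between two independent sex-specific estimates. The sketch below is one generic way to do this and was not part of the analyses reported here. The hazard ratios in the usage lines are the reported MCI-to-AD estimates, but the standard errors are placeholder values chosen only for illustration, since they are not stated in the text.

```python
from math import log, sqrt
from statistics import NormalDist


def effect_difference_test(beta_a, se_a, beta_b, se_b):
    """Z-test for the difference between two independent effect estimates.

    Effects must be on an additive scale (e.g. log hazard ratios).
    Returns (z statistic, two-sided p-value).
    """
    z = (beta_a - beta_b) / sqrt(se_a ** 2 + se_b ** 2)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Reported hazard ratios, converted to log scale.
log_hr_men = log(1.23)
log_hr_women = log(0.99)
# Placeholder standard errors (hypothetical, not from the study).
z_stat, p_diff = effect_difference_test(log_hr_men, 0.08, log_hr_women, 0.06)
```

The test assumes the male and female samples are independent, which holds for sex-stratified analyses of unrelated individuals.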
In summary, the associations we found between the blood mLOY phenotype and AD were not as strong as those reported previously [19]. Due to the demographic features of the GR@ACE-DEGESCO cohort, with older AD patients and younger population-based controls, adjusting our data by age was challenging. Consequently, we modelled the genetic variance associated with mLOY risk, generating a PRS that was associated with MCI conversion to AD in a male-specific manner. This approach allowed us to efficiently control the effect of ageing and to evaluate the potential causality of the mLOY phenotype. Furthermore, the lack of association between mloy-PRS and AD in women suggests that the observed effect is produced via the inherent loss of the Y chromosome and that mLOY could be a male-specific AD risk factor. Larger studies may benefit from modelling mLOY using Mendelian randomisation, as case and control populations do not always represent the same age groups in AD cohorts, and the date of DNA sampling of the subjects may not be available.

The GR@ACE-DEGESCO Cohort
The GR@ACE-DEGESCO cohort comprises AD patients and controls from the Spanish population. Patients with AD were collected from the ACE Alzheimer Center Barcelona and 12 other cohorts included in the Dementia Genetics Spanish Consortium (DEGESCO) ( Table S11). Control individuals were provided by the ACE Alzheimer Center (Barcelona, Spain), Valme University Hospital, the Spanish National DNA Bank Carlos III (Salamanca, Spain), and other DEGESCO members. DNA extracted from peripheral blood or saliva (Table S11) was genotyped in the Spanish National Center for Genotyping (CeGen, Santiago de Compostela, Spain) using the Axiom 815K Spanish Biobank Array (Thermo Fisher), as described previously [26,27].

The ACE MCI-EADB Cohort
The ACE MCI-EADB cohort is a prospective cohort comprising individuals with MCI recruited between 2006 and 2013 at ACE Alzheimer Center Barcelona. Briefly, individuals with a clinical dementia rating (CDR) of 0.5 and older than 60 years were selected and underwent at least one follow-up consisting of neurological, neuropsychological, and social work evaluations. The ascertainment of this cohort has been described in detail elsewhere [39,40]. DNA genotyping was performed as described elsewhere [35]. Briefly, DNA extracted from peripheral blood was genotyped with the Illumina Infinium Global Screening Array (GSA, GSAsharedCUSTOM_24+v1.0) at the LIFE & BRAIN CENTER (EADB node, Bonn, Germany), and SNP genotype calls were obtained from raw probe intensity data in the same centre.

Criteria for AD Diagnosis Case-Control Setup
AD diagnoses were established in all cases by a multidisciplinary working group composed of neurologists, neuropsychiatrists, and social workers following DSM-IV criteria for dementia and the National Institute on Aging and Alzheimer's Association's (NIA-AA) 2011 guidelines for AD definition. In the present study, individuals were labelled as AD when possible or probable AD was endorsed by neurologists at any point of their clinical history. Written informed consent was obtained from all participants. The Ethics and Scientific Committees have approved this research protocol (Acta 25/2016, Ethics Committee. H., Clinic I Provincial, Barcelona, Spain).

Assessment of MCI-To-Dementia/AD Conversion
MCI-to-dementia conversion was determined by integrating the CDR, the global deterioration scale (GDS), and diagnostic assessments at the ACE Alzheimer Center Barcelona, assigned at a consensus conference including neurologists, neuropsychologists, and social workers [29]. Conversion to dementia was defined as the first clinical evaluation reporting a diagnosis of AD [41,42], vascular dementia [43], mixed dementia (AD with cerebrovascular disease), frontotemporal dementia [44,45], or dementia with Lewy bodies [46], combined with a CDR score change from 0.5 to ≥1 and GDS ≥ 4. AD converters were defined as the fraction of converters to dementia that were diagnosed with AD. The baseline criteria varied depending on whether the exposure was mLOY phenotype or its associated PRS. In the first case, baseline was defined as the moment of blood sampling used subsequently for germline DNA extraction, genome-wide genotyping, and mLOY estimation. We selected only those individuals who met Petersen's criteria [47,48] for amnestic and non-amnestic MCI at the closest clinical evaluation to DNA sampling. Because genotypes used for PRS estimates are invariable, baseline was defined as the patient's first clinical record meeting Petersen's criteria for PRS analysis. The follow-up time was defined as the time window between baseline and (a) the date of conversion to dementia (converters) and (b) the date of last clinical evaluation (non-converters). To have a prospective cohort, disease progression models only included individuals who were either originally selected as controls/MCIs in the GR@ACE-DEGESCO case-control cohort or present in the MCI cohort (ACE MCI-EADB).
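The follow-up definition above can be illustrated with a minimal sketch. The `MciRecord` class, its field names, and the `follow_up` function are hypothetical and introduced only for illustration. The sketch also assumes that, in the AD-specific analysis, converters to a non-AD dementia are censored at their conversion date; the text does not state how these individuals were handled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class MciRecord:
    """Hypothetical per-individual record; field names are illustrative."""
    baseline: date                      # blood sampling (mLOY) or first MCI record (PRS)
    last_evaluation: date               # most recent clinical evaluation
    conversion: Optional[date] = None   # first evaluation reporting dementia, if any
    converted_to_ad: bool = False       # True when the dementia diagnosis was AD


def follow_up(record: MciRecord, outcome: str = "dementia") -> Tuple[float, bool]:
    """Return (follow-up time in years, event indicator) for a survival model.

    Converters are followed until the conversion date, non-converters
    until their last clinical evaluation.
    """
    if record.conversion is not None:
        end = record.conversion
        # Assumption: non-AD converters count as censored in the AD analysis.
        event = True if outcome == "dementia" else record.converted_to_ad
    else:
        end = record.last_evaluation
        event = False
    years = (end - record.baseline).days / 365.25
    return years, event
```

The resulting pairs of time and event indicator are the inputs that a Cox proportional-hazards model expects.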

LOY Determination
We used PennCNV [49–51] to process CEL files, following the recommended workflow for Affymetrix arrays [52], to obtain log R ratio (LRR) and B allele frequency (BAF) values for each array probe in our dataset. We determined mLOY by using the MADloy package for R [28]. Briefly, this method estimates mLOY by normalising the mean LRR of probes found at the male-specific region of chromosome Y (mLRR-Y) against the 5% trimmed mean LRR of autosomal chromosomes. Only probes located between PAR1 and PAR2 in chromosome Y, excluding the X transposed region (chrY:6611498-24510581; hg19/GRCh37), are used to compute mLRR-Y. To call mLOY status, we used the mLRR-Y thres method of the MADloy package. Briefly, a threshold is determined by extrapolating the 99% confidence interval of the positive side of the cohort mLRR-Y distribution [10]. Then, samples with mLRR-Y values below the empirically calculated threshold are assigned mLOY status (mLOY calls). To overcome computational power limitations, we obtained mLOY calls in two randomised batches. We used Bdev, defined as the mean deviation from the expected BAF (0.5) for heterozygous SNPs, in PAR1 (PAR1-Bdev) as a complementary indicator of mLOY (Figure S6).
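The quantities described above can be sketched as follows. This is not the MADloy implementation, and all function names are hypothetical. The sketch makes three assumptions: normalisation is a subtraction on the log scale, the 5% trimming is applied to each tail, and the threshold is obtained by mirroring the positive side of the distribution around the cohort median and taking the lower 99% bound (z = 2.576).

```python
from math import sqrt
from statistics import mean, median


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping `proportion` of the values from each tail (assumption)."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)


def mlrr_y(y_probe_lrr, autosomal_lrr):
    """Mean LRR of male-specific Y probes minus the trimmed autosomal mean."""
    return mean(y_probe_lrr) - trimmed_mean(autosomal_lrr)


def loy_threshold(cohort_mlrr_y, z=2.576):
    """Lower bound extrapolated from the positive side of the distribution.

    Values at or above the median are treated as the loss-free half; their
    spread around the median is mirrored to the negative side.
    """
    centre = median(cohort_mlrr_y)
    upper = [v - centre for v in cohort_mlrr_y if v >= centre]
    spread = sqrt(sum(d * d for d in upper) / len(upper))
    return centre - z * spread


def bdev(heterozygous_baf):
    """Mean absolute deviation from the expected BAF of 0.5."""
    return mean([abs(b - 0.5) for b in heterozygous_baf])


def call_mloy(sample_mlrr_y, threshold):
    """A sample is called mLOY when its mLRR-Y falls below the threshold."""
    return sample_mlrr_y < threshold
```

Because only the positive side informs the spread, samples with genuine Y loss in the negative tail do not inflate the threshold.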

Sample Processing and QC
We obtained LRR and BAF values for all biallelic markers from 20,068 CEL files (call rate > 0.97 per sample and >0.985 per plate). Then, we retrieved reported male samples and discarded samples with mean LRR-X and LRR-Y corresponding to female (XX) samples or sex chromosome aneuploidies. Additionally, we removed samples with a high heterozygosity rate, high chromosome X heterozygosity, and population outliers from our dataset. We removed samples with LRR SD > 0.46, a standard QC parameter for Affymetrix LRR data. We used the GENESIS R package [53] to examine relatedness within our dataset. We detected second- or lower-degree relatives by using a kinship threshold of 0.046875 and filtered them out of the dataset. Finally, we removed outliers in the mLRR-Y and Bdev distribution. Specific QC procedures and sample filtering steps for each analysis are summarised in Figure S3.
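The numeric sample-level filters above can be written as a single predicate, as in the minimal sketch below. The `Sample` class and its field names are hypothetical. Plate-level call rate, heterozygosity, ancestry, and distribution outlier filters are omitted. Relatedness is simplified to one pre-computed value per sample, whereas in practice one member of each related pair would be retained.

```python
from dataclasses import dataclass

CALL_RATE_MIN = 0.97    # per-sample call rate must exceed this value
LRR_SD_MAX = 0.46       # samples with a larger LRR SD are removed
KINSHIP_MAX = 0.046875  # kinship at or above this value flags a relative


@dataclass
class Sample:
    """Hypothetical QC summary for one genotyped sample."""
    sample_id: str
    call_rate: float
    reported_sex: str
    lrr_sd: float
    max_kinship: float  # highest kinship coefficient with any other sample


def passes_qc(sample: Sample) -> bool:
    """Apply the per-sample numeric thresholds quoted in the text."""
    return (
        sample.call_rate > CALL_RATE_MIN
        and sample.reported_sex == "male"
        and sample.lrr_sd <= LRR_SD_MAX
        and sample.max_kinship < KINSHIP_MAX
    )


def filter_samples(samples):
    """Return the samples that pass every threshold, preserving order."""
    return [s for s in samples if passes_qc(s)]
```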

mloy-PRS
We performed processing, QC, and imputation of the genome-wide SNP data as described elsewhere [27,35]. We calculated mloy-PRS based on independent genome-wide significant variants described previously [14]. Briefly, the authors determined the presence/absence of mLOY in 205,011 male samples in the UK Biobank and performed a genome-wide association study identifying 18,146 variants associated with the mLOY phenotype (p < 5 × 10−8). Then, they resolved these signals to 156 independent variants by (a) applying LD clumping at 1 Mb and removing correlated signals (r² > 0.05) and (b) performing conditional analysis, keeping only secondary signals that reached genome-wide significance before and after conditional analysis. These variants were replicated in 757,114 male samples of European and Japanese ancestry. Out of the 156 reported SNPs, we excluded those that were unavailable in our dataset, were rare (MAF < 0.01), had low imputation quality (R² < 0.3), or were located within the sex chromosomes, leaving a final set of 114 autosomal SNPs (Table S4). We calculated mloy-PRS for all individuals in the GR@ACE-DEGESCO and ACE MCI-EADB cohorts by adding the dosage of risk alleles weighted by their reported male-specific effect sizes (beta coefficients). To ease interpretation of results, we standardised mloy-PRS units (SD = 1).
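The score construction described above amounts to a filtered, weighted sum of risk-allele dosages followed by rescaling. The sketch below illustrates that arithmetic. The `Variant` class, the function names, and the variant identifiers used with them are hypothetical. Centring the score on the cohort mean is an assumption, as the text only states that the units were scaled to SD = 1.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Variant:
    """Hypothetical summary of one reported mLOY variant."""
    snp_id: str
    chrom: str      # chromosome label, e.g. "1" to "22", "X", "Y"
    maf: float      # minor allele frequency in the target dataset
    info_r2: float  # imputation quality
    beta: float     # reported male-specific effect size


def keep_variant(v: Variant) -> bool:
    """Exclude sex-chromosome, rare, and poorly imputed variants."""
    return v.chrom not in {"X", "Y"} and v.maf >= 0.01 and v.info_r2 >= 0.3


def raw_prs(dosages: dict, variants: list) -> float:
    """Sum of risk-allele dosages weighted by beta, over retained variants.

    Variants missing from `dosages` are skipped, mirroring the exclusion
    of SNPs that are unavailable in the dataset.
    """
    return sum(
        dosages[v.snp_id] * v.beta
        for v in variants
        if keep_variant(v) and v.snp_id in dosages
    )


def standardise(scores: list) -> list:
    """Rescale scores to mean 0 (assumption) and SD 1."""
    mu, sd = mean(scores), stdev(scores)
    return [(s - mu) / sd for s in scores]
```

After rescaling, a regression coefficient for the score is read as the effect per standard deviation of genetic mLOY risk.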
CSF and paired plasma samples collected on the same day, as described elsewhere [54,55], underwent targeted proteomics using the ProSeek® multiplex immunoassay by Olink Proteomics (Uppsala, Sweden). The protein concentration was measured for 184 proteins included in the commercially available ProSeek® Multiplex panels (inflammation and neurology) in both fluids. QC details and further description of these data are provided elsewhere [56].

Statistical Analysis
We used R software [57] for data processing and analysis. To harmonise effect directions, we multiplied the mLRR-Y variable by −1, because lower values of mLRR-Y represent a higher degree of mLOY. We fitted logistic regressions adjusted by age, APOE genotype, and population structure for case-control analysis. We used the survival R package [58] to fit Cox proportional-hazards models to assess MCI conversion to all-cause dementia or AD. Due to the age-dependent nature of mLOY [11,59], we only included individuals with available age at DNA sampling information for analyses involving mLOY phenotypes. To correct for population structure, we only adjusted by the PCs that were associated with the dependent variable in the models. Thus, we did not adjust for population structure in the Cox models, as PCs showed no effect on disease progression (Table S12, Figure S7), likely because all MCI samples came from the same centre. We modelled APOE genotypes as a continuous variable ranging from −2 to 2, where each APOE-ε2 allele contributed −1 and each APOE-ε4 allele contributed +1, as described previously [40]. To control for ascertainment and genotyping bias between MCI cohorts (GR@ACE-DEGESCO and ACE MCI-EADB), we introduced a dichotomous variable in the models testing the association between mloy-PRS and disease progression. For analysis of Olink proteomic data, we adjusted linear regressions by age, the time window between DNA sampling and lumbar puncture (LP), and APOE genotype. Due to the high correlation between the levels of many CSF proteins, total tau, and p-tau (Figure S8), we also included models adjusted by total tau levels. We performed fixed-effect inverse variance weighted meta-analysis with the rma.uni function included in the metafor R package [60].
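Two of the steps above reduce to simple arithmetic: the APOE coding and the fixed-effect inverse variance weighted estimate. The sketch below shows both. It uses the textbook estimator rather than reproducing the metafor call, and the function names and the genotype string format (for example "e3/e4") are hypothetical.

```python
from math import erfc, sqrt


def apoe_score(genotype: str) -> int:
    """Code APOE as -1 per e2 allele and +1 per e4 allele (range -2 to 2)."""
    alleles = genotype.split("/")
    return alleles.count("e4") - alleles.count("e2")


def ivw_fixed_effect(estimates, std_errors):
    """Fixed-effect inverse variance weighted pooling of study estimates.

    Each estimate (e.g. a log hazard ratio) is weighted by 1 / SE^2.
    Returns (pooled estimate, pooled SE, two-sided p-value from a z-test).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return pooled, pooled_se, p_value
```

Under this coding an ε2/ε4 carrier scores 0, the same as an ε3/ε3 carrier, so the protective and risk alleles are assumed to offset each other on the linear predictor.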