A Multi-Trait Association Analysis of Brain Disorders and Platelet Traits Identifies Novel Susceptibility Loci for Major Depression, Alzheimer’s and Parkinson’s Disease

Among candidate neurodegenerative/neuropsychiatric risk-predictive biomarkers, platelet count, mean platelet volume and platelet distribution width have been associated with the risk of major depressive disorder (MDD), Alzheimer’s disease (AD) and Parkinson’s disease (PD) through epidemiological and genomic studies, suggesting partial co-heritability. We exploited these relationships for a multi-trait association analysis, using publicly available summary statistics of genome-wide association studies (GWASs) of all the traits reported above. Gene-based enrichment tests were carried out, as well as a network analysis of significantly enriched genes. We analyzed 4,540,326 single nucleotide polymorphisms shared among the analyzed GWASs, observing 149 genome-wide significant multi-trait LD-independent associations (p < 5 × 10−8) for AD, 70 for PD and 139 for MDD. Among these, 27 novel associations were detected for AD, 34 for PD and 40 for MDD. Out of 18,781 genes with annotated variants within ±10 kb, 62 genes were enriched for associations with AD, 70 with PD and 125 with MDD (p < 2.7 × 10−6). Of these, seven genes were novel susceptibility loci for AD (EPPK1, TTLL1, PACSIN2, TPM4, PIF1, ZNF689, AZGP1P1), two for PD (SLC26A1, EFNA3) and two for MDD (HSPH1, TRMT61A). The resulting network showed a significant excess of interactions (enrichment p = 1.0 × 10−16). The novel genes identified are involved in the organization of cytoskeletal architecture (EPPK1, TTLL1, PACSIN2, TPM4), telomere shortening (PIF1), the regulation of cellular aging (ZNF689, AZGP1P1) and neurodevelopment (EFNA3), thus providing novel insights into the shared underlying biology of brain disorders and platelet parameters.


Introduction
Platelets have, for decades, provided an interesting setting in which to investigate the biological underpinnings of neuropsychiatric and neurodegenerative disorders, since they are considered "circulating mirrors of neurons" [1]. Indeed, despite their different embryonic origin, platelets and neurons share common characteristics in subcellular organization and protein composition. Some proteins are typically expressed in both neurons and circulating platelets, and they were found to regulate processes such as platelet activation, hemostasis and thrombosis [2]. For example, Reelin, a neuronal protein that regulates cell migration, synaptic plasticity and memory formation, is also expressed in blood and is actively released following platelet activation [3,4]. Amyloid Aβ peptides, which accumulate in senile plaques in dementia, and the amyloid precursor protein (APP) are expressed in megakaryocytes, stored in platelet α-granules and released upon platelet activation [1].
To gain further insight into the biology underlying the common variance in different neuropsychiatric/neurodegenerative disorders and platelet parameters, associations underwent gene, gene ontology and pathway enrichment analysis through MAGMA v1.08 [21], within the FUMA platform [20]. This was carried out for all the protein-coding genes to which at least one SNP was annotated within a ±10 kb interval, namely, 18,781 genes. A Bonferroni correction for multiple testing was applied accordingly, based on the number of genes tested (α = 2.7 × 10−6) (Table 2).
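The gene-level threshold quoted above can be reproduced with a short calculation. The sketch below assumes a conventional family-wise error rate of 0.05, which the text does not state explicitly; the function and variable names are illustrative only.

```python
# Bonferroni-corrected significance threshold for the gene-based test.
N_GENES = 18_781      # protein-coding genes tested (from the text)
FAMILY_ALPHA = 0.05   # ASSUMPTION: conventional family-wise error rate


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold alpha / n_tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


gene_alpha = bonferroni_threshold(FAMILY_ALPHA, N_GENES)
print(f"{gene_alpha:.1e}")  # 2.7e-06
```

Under this assumption, 0.05 / 18,781 rounds to 2.7 × 10−6, matching the threshold used for the gene-based tests.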
To estimate protein-protein interactions (PPIs) among the genes enriched for associations, we used the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database v11.5 [22]. We analyzed genes significantly enriched for associations with AD, PD and MDD, first separately and then merged into a single list, to compute "global" interactions among all the genes significantly enriched for any of the three disorders. The obtained network included both direct (physical) and indirect (functional) associations, specifically evidence of interaction from curated databases; experimentally determined evidence; gene neighborhood; gene fusions; gene co-occurrence; joint gene mentioning based on text mining in published articles; gene co-expression; and protein homology. We required an interaction score > 0.7 so that only high-confidence interactions between proteins were included in the analysis. The observed excess of interactions compared to the expected number of edges among nodes, the average node degree (i.e., the average number of edges per node in the graph) and the local clustering coefficient (i.e., a measure of the extent to which nodes in the graph tend to cluster together) were taken as measures of network density and clustering levels.
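As an illustration of the two network measures defined above, the following minimal sketch filters a toy list of scored edges at the 0.7 cutoff and computes the average node degree and the average local clustering coefficient. The node names and scores are invented for illustration, and the code uses the standard graph-theoretic definitions; it is an assumption that STRING computes its statistics in exactly this way.

```python
from itertools import combinations

MIN_SCORE = 0.7  # high-confidence cutoff taken from the text

# HYPOTHETICAL scored interactions: names and scores are made up.
scored_edges = [
    ("A", "B", 0.95), ("A", "C", 0.90), ("B", "C", 0.80),
    ("C", "D", 0.75), ("D", "E", 0.40),
]


def build_graph(edges, min_score):
    """Build an undirected adjacency map, keeping only edges above the cutoff.

    Every node is kept, even if all of its edges are filtered out.
    """
    adj = {}
    for u, v, score in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if score > min_score:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def average_node_degree(adj):
    """Average number of edges per node (equals 2 * edges / nodes)."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_local_clustering(adj):
    """Mean of the local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


graph = build_graph(scored_edges, MIN_SCORE)
avg_degree = average_node_degree(graph)
avg_clustering = average_local_clustering(graph)
print(avg_degree, round(avg_clustering, 3))
```

In this toy network the D-E edge falls below the cutoff, so E remains as an isolated node and lowers both averages.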
The molecular network resulting from the gene enrichment test, as produced by STRING v11.5 analysis, showed significantly more interactions than expected for AD (enrichment p = 1.0 × 10−16; 57 nodes, 57 edges vs. 2 expected, average node degree 2.00 and average local clustering coefficient 0.  Table S1). Similarly, we also observed an excess of interactions when the genes enriched for the three disorders were analyzed together (enrichment p = 1.0 × 10−16; 240 nodes, 281 edges vs. 61 expected, average node degree 2.34 and average local clustering coefficient 0.3) (Figure 3; Table S1).
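The reported average node degrees follow directly from the node and edge counts, since each edge contributes to the degree of two nodes. The sketch below checks this arithmetic; the observed-to-expected edge ratio is included only as a descriptive summary and is not the test statistic behind the enrichment p-values, whose computation is not described here.

```python
def average_node_degree(n_nodes: int, n_edges: int) -> float:
    """Each undirected edge adds one to the degree of two nodes."""
    return 2 * n_edges / n_nodes


def observed_to_expected(observed_edges: int, expected_edges: int) -> float:
    """Descriptive ratio only; NOT the statistic used for the enrichment p."""
    return observed_edges / expected_edges


# Counts taken from the text.
ad_degree = average_node_degree(n_nodes=57, n_edges=57)
joint_degree = average_node_degree(n_nodes=240, n_edges=281)

ad_ratio = observed_to_expected(57, 2)
joint_ratio = observed_to_expected(281, 61)

print(f"AD: degree {ad_degree:.2f}, observed/expected {ad_ratio:.1f}")
print(f"Joint: degree {joint_degree:.2f}, observed/expected {joint_ratio:.1f}")
```

Both results agree with the reported values (2.00 for AD and 2.34 for the joint network).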
Gene-set analysis also revealed significant enrichment for the three disorders. The most strongly associated gene ontology (GO) term was negative regulation of amyloid precursor protein catabolic process (13 genes, β(SE) = 2.23(0.26); enrichment p after Bonferroni correction = 2.7 × 10−13) for AD, IgG binding (9 genes, β(SE) = 0.03(0.31); Pbonf = 0.0056) for PD and GABAergic synapse (64 genes, β(SE) = 0.74(0.13); Pbonf = 9.5 × 10−5) for MDD (Table S2; see URLs to access the full list of pathways tested and the genes driving these enrichments).
Figure 3. Protein-protein interaction network of genes significantly enriched for associations with AD, PD and MDD. The reported network, including both direct (physical) and indirect (functional) associations, was based on the STRING v11.5 database [22]. Only high-confidence interactions between proteins are reported (interaction score > 0.7), while disconnected nodes in the network are hidden. Each node represents all the proteins produced by a single protein-coding gene locus, while edges represent protein-protein associations. Line color indicates the type of interaction evidence: light blue, curated databases; purple, experimentally determined; green, gene neighborhood; red, gene fusions; blue, gene co-occurrence; yellow, text mining; black, co-expression; and violet, protein homology.

Discussion
We report the first multi-trait association analysis of structural platelet parameters routinely assessed in blood tests and three of the most common neurodegenerative/neuropsychiatric disorders, identifying novel candidate susceptibility genes for AD, PD and MDD. The most significant associations were detected in some of the genes most frequently implicated in neurodegenerative/neuropsychiatric disorders, namely, APOE (apolipoprotein E, with AD) [39], SNCA (alpha synuclein, with PD) [40] and ZSCAN12 (zinc finger and SCAN domain-containing 12, with MDD) [41]. APOE is a protein associated with lipid particles that mainly functions in lipoprotein-mediated lipid transport between organs via plasma and interstitial fluids [42]. Alpha synuclein is involved in synaptic activities such as the regulation of synaptic vesicle trafficking and subsequent neurotransmitter release [43,44]; moreover, it modulates DNA repair processes, including the repair of double-strand breaks [45]. ZSCAN12 encodes a zinc finger and SCAN domain-containing protein involved in transcriptional regulation.
Still, MTAG analysis also revealed novel genes showing significant multi-trait associations, which, to our knowledge, have never been associated with these disorders before. Among genes associated with AD, EPPK1, TTLL1, PACSIN2 and TPM4 play a role in the organization of cytoskeletal architecture, which has been identified as an important component in the development of neurodegenerative disorders [46–49]. PIF1 prevents telomere elongation by inhibiting the action of telomerase, while ZNF689 and AZGP1P1 are transcription factors involved in cell viability and apoptosis; both molecular functions can affect cellular aging and the development of age-related disorders such as AD and PD [24]. Moreover, TTLL1 [19], PACSIN2 [26,40], TPM4 [27] and PIF1 [19] were previously associated with platelet parameters, suggesting a possible pleiotropic effect of these genes. EFNA3, a novel gene associated with PD, encodes a member of the ephrin family, previously implicated in mediating developmental events, especially in the central nervous system [50].
A gene-based enrichment analysis also revealed novel genes associated with AD, PD and MDD. Interestingly, several of these genes encode transcription factors that may be involved in the development, maintenance and survival of neurons and olfactory receptors [51]. Indeed, olfactory dysfunction, which is thought to be due to the loss of synaptic function, has been linked with most neurodegenerative, neuropsychiatric and communication disorders [52]. Moreover, these genes also include some histone complex proteins, in line with recent studies revealing associations between histone methylation/acetylation and AD [53] and implicating several histone deacetylases in the pathogenesis of PD [54]. These findings suggest a pleiotropic influence of several genes on the risk of neurodegenerative and neuropsychiatric disorders, which was not previously detected through classical univariate GWAS analyses.
Of note, we found several overlaps in gene- and SNP-level multi-trait associations between the brain disorders and the platelet parameters analyzed. Among genes enriched for association, we identified clusters of genes encoding products involved in mitochondrial function (e.g., NDUFS2, NDUFAF2 and TOMM40L), cytoskeleton remodeling (CD2AP and KLC3, as discussed above) and histone proteins (HIST1H2BK, HIST1H4K, HIST1H2AK and HIST1H3B, as explained below). Indeed, NDUFS2 and NDUFAF2 encode, respectively, a subunit and a chaperone involved in the assembly of complex I, located on the inner mitochondrial membrane, while TOMM40L is involved in mitochondrial transmembrane translocation. Several studies suggest that platelet mitochondrial dysfunction may be involved in neurodegenerative diseases such as AD and PD [55–57]. Still, further studies are needed to clarify the variant association overlap between platelet parameters and MDD, which we were not able to identify here, possibly due to the genetic and phenotypic heterogeneity of depression.
Protein-protein interaction analysis revealed a significant excess of interactions among enriched genes for the brain disorders tested both separately and jointly, suggesting that their gene products are highly likely to be linked in a global molecular network.
In particular, this highlighted some local networks of interest, such as the one among the histone complex proteins (HIST1H2BI, HIST1H2BF and HIST1H2BJ), which may play a role in the onset of neurodegenerative diseases due to the alteration of methylation patterns [58]. Similarly, the apolipoproteins APOE, APOA2, APOC1 and APOC4 have been repeatedly implicated in triglyceride and cholesterol transport and metabolism [59,60], as well as in neurodegenerative [61] and cardiovascular risk [62], while the local network among EPHA1, EPHB2, EFNA1, EFNA3 and EFNA4 highlights the importance of the interactions between ephrins and ephrin receptors in the etiology of several neurodegenerative and neuroinflammatory disorders, suggesting potential links with (cellular) immunity [63].
Gene-set analysis revealed significant enrichments of GO terms involved in the regulation, formation and catabolic processes of amyloid beta and in the negative regulation of metalloendopeptidase activity for AD, supporting the hypothesis that metallopeptidases are implicated in the pathogenesis of several central nervous system diseases such as multiple sclerosis and AD [64]. For PD, the significant enrichment of IgG binding is interesting in light of a higher fraction of IgG, a different IgG glycosylation profile [65] and increased IgG (but not IgM) binding in dopaminergic neurons of PD cases vs. controls [66]. Moreover, serum IgG levels in PD patients are negatively associated with mood/cognition scores [67], in line with a potential pleiotropic role of humoral immunity at the interface among the mood, cognitive and motor control domains. Similarly, the significant enrichment of the GABAergic synapse GO term in the MDD analysis corroborates the hypothesis that alterations in GABAergic receptors may play a role in long-term depression [68].
Overall, we provide insights into the shared underlying biology of these disorders and related platelet parameters, proposing novel molecular targets for the risk prediction and treatment of these disorders.

Strengths and Limitations
The strengths of this study include the novelty of the analysis performed; indeed, to our knowledge, this represents the first attempt to identify the shared genomic underpinnings of platelet parameters and three of the most common neurodegenerative/neuropsychiatric disorders through a comprehensive approach, including not only multi-trait association analysis but also gene-/gene-set enrichment and molecular network analyses. Moreover, our analyses are based on large GWASs, with a substantial amount of genetic data, which confers robustness to our observations. Last, our focus on novel associations allowed us to identify proteins and biological pathways, which should be functionally validated in the future.
This study presents some limitations. First, only partial evidence of genetic correlation among the disorders and platelet parameters tested currently exists, which may have hampered the power of the analyses. Second, in multi-trait association approaches, association significance is often driven by the largest source GWAS involved in the MTAG analysis, which may have biased the analyses towards the largest studies. Still, this represents a useful approach to identifying the pleiotropic variants and genes influencing multiple traits and/or disorders, which has already proven successful with multiple correlated phenotypes [15]. Third, functional studies are warranted to explain the role of the novel susceptibility genes identified here, both in neurodegenerative/neuropsychiatric risk and platelet variability.
Despite these limitations, this study may reveal potential molecular targets for future treatments of three of the most common neurodegenerative/neuropsychiatric disorders.
Table S1: Network statistics of protein-protein interaction networks of genes significantly enriched for associations with AD, PD and MDD; Table S2

Conflicts of Interest:
The authors declare no conflict of interest.