Alzheimer’s Disease-Associated SNP rs708727 in SLC41A1 May Increase Risk for Parkinson’s Disease: Report from Enlarged Slovak Study

SLC41A1 (A1) SNPs rs11240569 and rs823156 are associated with altered risk for Parkinson’s disease (PD), predominantly in Asian populations, and rs708727 has been linked to Alzheimer’s disease (AD). In this study, we have examined a potential association of the three aforementioned SNPs and of rs9438393, rs56152218, and rs61822602 (all three lying in the A1 promoter region) with PD in the Slovak population. Out of the six tested SNPs, we have identified only rs708727 as being associated with an increased risk for PD onset in Slovaks. The minor allele (A) in rs708727 is associated with PD in dominant and completely over-dominant genetic models (OR_D = 1.36 (1.05–1.77), p = 0.02, and OR_COD = 1.34 (1.04–1.72), p = 0.02). Furthermore, the genotypic triplet GG(rs708727) + AG(rs823156) + CC(rs61822602) might be clinically relevant despite showing a medium (h ≥ 0.5) size difference (h = 0.522) between the PD and the control populations. RandomForest modeling has shown the power of the tested SNPs for discriminating between PD patients and controls to be essentially zero. The identified association of rs708727 with PD in the Slovak population leads us to hypothesize that this A1 polymorphism, which is involved in the epigenetic regulation of the expression of the AD-linked gene PM20D1, is also involved in the pathoetiology of PD (or universally in neurodegeneration) through the same or a similar mechanism as in AD.

SY5Y cells, and that MgSO4 can reverse its decline [23]. The same group has also provided data revealing that, in a rat PD model, 6-OHDA alters the expression of A1/A1 (at both the RNA and protein levels), and that the extent of this alteration is responsive to [MgSO4] [23].
The PARK16 locus comprises five genes, namely SLC45A3, NUCKS1, RAB7L1, A1, and PM20D1 [24]. Its role in the susceptibility to PD has been pointed out by numerous genome-wide association studies (GWAS) and case-control studies. Three A1 single nucleotide polymorphisms (SNPs) have been extensively studied with respect to their association with PD.
The major G allele of the A1 polymorphism rs11240569 (for characteristics see Table 1) has been shown to reduce the risk of idiopathic PD in a Han cohort in China, with people who have the GG and AG genotypes exhibiting a reduced risk compared with those who have the AA genotype [25]. A similar outcome has been obtained in a study performed with an Iranian cohort [26]. Another A1 SNP, rs708727 (Table 1), has been studied in a UK cohort, but no association between this SNP and PD has been found [19]. However, this SNP has been linked to Alzheimer's disease (AD) [27].
In relation to PD, rs823156 (Table 1) is probably the most intensely studied, but also the most controversial, of the A1 SNPs. This SNP has been associated with PD in cohorts from mainland China [28], Japan [29], and Korea [30], but not in cohorts from Eastern China [31], the north of Spain [32], and Malaysia [33]. Bai and colleagues have predicted, following in silico analyses, that rs823156 as a noncoding variant of A1 "might affect PD risk by altering the transcription factor-binding capability of the genes" [34].
Previously published work has established that cells regulate the extent of Mg2+ efflux via A1 at the level of proteins and at the level of transcription [17,20,35,36]. However, information about the organization of the A1 promoter and its transcription-factor-binding capacity is rather scarce [34].
In 2019, we published a study showing that the three aforementioned A1 SNPs are not associated with any susceptibility toward PD in the Slovak population, as demonstrated by means of frequentist statistics and by machine learning [37]. A major limitation of that study might have been the relatively low number of participants in both the PD (150) and the control (120) cohorts. Therefore, the aim of this study has been twofold, as follows: (1) to elucidate any possible association of rs11240569, rs708727, and rs823156 with PD in a larger group of PD patients (150 + 358) and control probands (120 + 352), and (2) to sequence the promoter region of A1 in a sub-cohort of PD samples in order to identify any possible SNPs within the promoter region and to examine their possible association with PD.

Sequencing of SLC41A1 Promoter Region
Sanger sequencing and sequence analysis were performed in a sub-cohort of 96 PD patients (all from the PD Center in Martin). A fragment of the A1 promoter region was studied, spanning from position 205,814,626 to 205,812,988 on chromosome 1. The sequence of the fragment was chosen according to the Genecopoeia database [www.genecopoeia.com/product/search/view_seq_promoter.php?cid=&type=promoter&prod_id=HPRM53412 (accessed on 2 May 2018)]. The gene organization of A1 and of its promoter/regulatory sequences is depicted in Figure 1. The sequencing allowed the identification of the following four SNPs in the A1 promoter region: rs9438393, rs56152218, rs61822602, and rs144056491 (Figure 1). Next, we utilized the RFLP strategy to examine rs9438393 (restriction with Hpy166II), rs56152218 (restriction with NlaIII), and rs61822602 (restriction with BmrI) in a sub-group of 100 control samples. The SNP rs144056491 was not examined in the control group because of the lack of a suitable restriction enzyme.
ConSite [38] (Table 2), a web-based tool for finding cis-regulatory elements in genomic sequences, was employed to examine whether the variant (minor) allele for each of the four aforementioned SNPs altered the TF-binding profile of the A1 promoter by creating a new TF-binding site or by erasing an existing one. As input, we used 33 bp long sequences for each SNP, one with the reference (major) allele and the other with the variant (minor) allele. At rs144056491, which is located within the binding site of transcription factor p50, both the major allele (C) and the minor allele (-, CC, CCC) presumably allow the binding of this transcription factor (Figure 1). The presence of the major allele (A) at rs9438393 might permit the binding of the transcription factor FREAC-4 (Table 2, Figure 1). However, if the minor allele (G) is present, then the FREAC-4 binding site is no longer recognized by the TF-binding predictive software. At the same SNP, the minor allele putatively allows the binding of SP1, which is not the case in the presence of the major allele (Table 2, Figure 1). The major allele (T) at rs56152218 putatively allows the binding of Gata2, but according to the prediction, this will not be the case in the presence of the minor allele (C). On the other hand, YY1 might bind the minor C-allelic variant, but not the major T-allelic variant (Table 2, Figure 1). According to the in silico prediction, SNP rs61822602 is not located within any TF-binding sequences (Table 2, Figure 1).
Table 2. Alterations of transcription-factor-binding domains resulting from presence of respective variants.

Figure 1.
Gene organization of A1 including adjacent upstream 5′UTR. According to Ensembl Transcript: SLC41A1-201 ENST00000367137.4, this gene is located on chromosome 1 and consists of 11 exons. Exon 1 represents 5′UTR (untranslated region), and exon 2 contains a part of this 5′UTR. 3′UTR is included in exon 11. In our previous study, we studied three SNPs (single nucleotide variants), namely rs11240569, rs708727, and rs823156 in A1 [37]. In this work, we analyzed a sequence (1638 bp in length) located upstream of this gene. This sequence covers the 5′upstream sequence and, partially, exon 1. According to the UCSC genome browser [39], the sequence is a regulatory region represented by CpG islands (green rectangle). A promoter-like signature (EH38E1415811) and a proximal enhancer-like signature (EH38E14112) (red and orange rectangle, respectively) have been described in this region. We have identified four SNPs (rs144056491, rs61822602, rs56152218, and rs9438393) in this sequence. At rs144056491, a search within the reference sequence and then in the sequence with the variant resulted in the identification of a binding site for transcription factor p50. At rs9438393, the search resulted in the identification of a binding site for transcription factor FREAC-4 (the A allele). However, no binding site was detected in the variant sequence (G allele). At the same SNP, the G allele allows the binding of SP1. At rs56152218, the dominant T allele enables the binding of Gata2, and the minor allele that of YY1.

Genetic Analyses
The genetic analyses were performed on A1 SNPs rs11240569, rs708727, and rs823156 in the cohort of 508 PD patients (vs. 150 patients in the pilot study) and the cohort of 472 controls (vs. 120 controls in the pilot study) [37]. Thus, the numbers of the PD patients and of the control probands were increased in this study by 3.4-fold and 3.9-fold, respectively, in comparison with the pilot study. In the sub-cohort of 96 PD patients and 100 controls, we also examined A1 SNPs rs9438393, rs56152218, and rs61822602 (first identified in the PD sub-cohort by Sanger sequencing and afterwards examined by RFLP analysis in the sub-cohort of controls). They were not analyzed in the pilot study [37]. The allele and genotype counts and frequencies (fq) for each particular A1 SNP in the PD and the control cohorts are summarized in Table 3. The minor allele fq for rs11240569 (G > A) in our total cohort (PD cases + control probands) was roughly comparable with the minor allele total population fq range (MATPFR) reported by the gnomAD and ExAC databases, as follows: fq_o (observed) vs. fq_r (reported) = 33% vs. 29-30%. The minor allele fq for rs708727 (G > A) in our total cohort was clearly higher than MATPFR in the gnomAD and ExAC databases (fq_o vs. fq_r = 40% vs. 29-30%) but was comparable with the rs708727 minor allele fq reported for the European population in the ALFA database (41%). The rs823156 (A > G) minor allele fq in our total cohort was 17% and was thus lower than MATPFR in the gnomAD and ExAC databases (23-30%) but was comparable with the rs823156 minor allele fq reported for the European population in the ALFA database (18%). The frequency of the minor allele of rs9438393 (A > G) in the total cohort was 40% and was, therefore, notably higher than the MATPFR of 26-29% reported in the gnomAD and TOPMED databases, but was comparable with the rs9438393 minor allele frequency reported in the ALFA database for the European population (41%).
The minor allele of rs56152218 (T > C) in the total cohort was present with an fq of 38%, which is within the MATPFR of 32-46% reported by the ALSPAC, TOPMED, and ALFA (European population) databases. Interestingly, in the Vietnamese, Korean, and Qatari populations, the minor allele is T and not C, as observed in the European population [https://www.ncbi.nlm.nih.gov/snp/?term=rs56152218 (accessed on 2 August 2021)]. The fq of the minor allele in rs61822602 (G > T) was found to be 12%. This is comparable with the T allele frequencies reported by the ALSPAC and TWINSUK databases (12% and 13%, respectively) but is far higher than the fq of the T allele reported in the European population in the ALFA database (6%). All tested A1 SNPs in our PD and control cohorts were in Hardy-Weinberg equilibrium (HWE; Table 4).
Table 4. Genotype distribution of all tested A1 SNPs in PD and control cohorts conforms to Hardy-Weinberg equilibrium.
Next, we calculated the odds ratios (OR) of the minor allele and of genotypes containing the minor allele for each tested SNP. These results are summarized in Table 5. The GA genotype in rs708727 was associated with PD (OR = 1.42 (1.08-1.87), p = 0.01) in our population. Furthermore, we tested the association of particular genotypic combinations for each tested SNP with PD in recessive, dominant, and completely over-dominant genetic models (Table 6). Consistent with the genotype-level result, we identified an association of the rs708727 minor allele (A) with PD in dominant (GG vs. GA + AA) and completely over-dominant (GG + AA vs. GA) genetic models (OR_D = 1.36 (1.05-1.77), p = 0.02 and OR_COD = 1.34 (1.04-1.72), p = 0.02, respectively). The remaining SNPs showed no association with PD in the tested genetic models (Table 6).
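The two calculations described above (an HWE check and odds ratios under collapsed genetic models) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the text does not state which HWE test or confidence-interval method was used, so a chi-square goodness-of-fit test with 1 degree of freedom and a Wald (Woolf) interval are assumed here. The function names and all genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions.

    Assumes both alleles are present. Returns (chi2, p_value).
    """
    n = n_major_hom + n_het + n_minor_hom
    q = (n_het + 2 * n_minor_hom) / (2 * n)  # minor allele frequency
    p = 1.0 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def collapse_genotypes(counts, model):
    """Split (major_hom, het, minor_hom) counts into (tested group, reference group)."""
    major_hom, het, minor_hom = counts
    if model == "dominant":       # GG vs. GA + AA
        return het + minor_hom, major_hom
    if model == "recessive":      # GG + GA vs. AA
        return minor_hom, major_hom + het
    if model == "overdominant":   # GG + AA vs. GA
        return het, major_hom + minor_hom
    raise ValueError(f"unknown genetic model: {model}")


def odds_ratio_ci(case_counts, control_counts, model, z=1.96):
    """Odds ratio with a Wald confidence interval (z = 1.96 gives about 95%)."""
    a, b = collapse_genotypes(case_counts, model)
    c, d = collapse_genotypes(control_counts, model)
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (GG, GA, AA); NOT the study's data.
cases = (150, 250, 100)
controls = (200, 200, 100)

chi2, p_hwe = hwe_chi_square(*controls)
print(f"HWE in controls: chi2 = {chi2:.2f}, p = {p_hwe:.3f}")
for model in ("dominant", "recessive", "overdominant"):
    or_, lo, hi = odds_ratio_ci(cases, controls, model)
    print(f"{model:>12}: OR = {or_:.2f} ({lo:.2f}-{hi:.2f})")
```

An interval that excludes 1 corresponds to an association at roughly the 5% level under these assumptions; exact or permutation-based tests may be preferable for small genotype classes.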
We also tested the equality of population proportions of any genotypic combination composed of duplets, triplets, quadruplets, quintuplets, or sextuplets of the tested SNPs in the PD cohort (N = 96) and the cohort of controls (N = 100) with the aim of examining the size of the effect of the interactions among the tested A1 SNPs toward the susceptibility for developing PD in our population. In all, a total of 12 genotypic combinations (two duplets, seven triplets, and three quadruplets) with significantly (p < 0.05, 10 genotypes; p < 0.06, two genotypes) different counts in the PD and control cohorts were identified (Table 7). Following Cohen's criteria [39], which describe the differences in proportions, only triplet GG (rs708727) + AG (rs823156) + CC (rs61822602) out of the 12 genotypes showed the "medium" size difference defined by Cohen's h ($h = 2\arcsin\sqrt{p_1} - 2\arcsin\sqrt{p_2} \geq 0.5$, where $p_1$ and $p_2$ are the proportions of the genotype in the two cohorts), with h = 0.522.
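Cohen's h, used above to grade differences between genotype proportions, can be computed in a few lines. The sketch below is illustrative only: the function names are hypothetical, the example proportions are invented, and the "large" threshold of 0.8 comes from Cohen's general convention rather than from this text, which names only the 0.2 and 0.5 cut-offs.

```python
import math


def cohens_h(p1, p2):
    """Absolute Cohen's h for two proportions (arcsine-transformed difference)."""
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return abs(phi1 - phi2)


def effect_size_label(h):
    """Grade h: below 0.2 negligible, 0.2-0.5 small, 0.5-0.8 medium, else large."""
    if h < 0.2:
        return "negligible"
    if h < 0.5:
        return "small"
    if h < 0.8:
        return "medium"
    return "large"


# Hypothetical genotype proportions in two cohorts; NOT the study's data.
h = cohens_h(0.25, 0.05)
print(f"h = {h:.3f} ({effect_size_label(h)})")
```

Because h works on the arcsine scale, the same absolute difference in proportions yields a larger h when the proportions lie near 0 or 1 than when they lie near 0.5.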
The h value for the remaining 11 genotypes ranged between 0.32 and 0.46 and was thus within the h interval from 0.2 to 0.5, which defines small differences in proportions (Table 7) [39]. Therefore, triplet GG (rs708727) + AG (rs823156) + CC (rs61822602) might be clinically meaningful, and future examinations of this genotype with regard to PD susceptibility should be conducted. For rs11240569, rs708727, and rs823156, we performed the same type of analysis with source data from the cohort of 508 PD patients and the cohort of 472 controls. Four genotypes, two duplets (GG (rs11240569) + AG (rs708727), GG (rs708727) + AG (rs823156)) and two triplets (GG (rs11240569) + GG (rs708727) + AA (rs823156), GG (rs11240569) + GG (rs708727) + AG (rs823156)), with significantly (p < 0.05) different counts in the PD and control cohorts, were identified. Cohen's h calculated for each of the four genotypes was below the threshold of 0.2 [39], and thus, the difference in population proportions between the tested groups was, in all four cases, negligible.
Ageing, followed by male gender, is considered to be the most prominent risk factor for the onset of idiopathic PD [37]. In our pilot study, the age of onset of idiopathic PD in the cohort of 150 PD patients was not correlated with the presence of any genotypic combination for SNPs rs11240569, rs708727, and rs823156 [37]. Here, we have correlated the age of onset of PD with (1) the presence of each genotypic combination for SNPs rs11240569, rs708727, and rs823156 in the group of 508 PD patients (Figure 2), and (2) the presence of each genotypic combination for SNPs rs11240569, rs708727, rs823156, rs9438393, rs56152218, and rs61822602 in the sub-cohort of PD patients randomly selected from the PD cohort for A1 promoter sequencing (Figure 3).
With regard to the age of onset, a one-way ANOVA revealed no significant difference (at p < 0.05) between the genotypic sub-populations for any of the tested SNPs, either in the large PD cohort or in the sub-cohort of PD patients. Thus, no particular genotype of the tested SNPs appears to influence the age of onset of the idiopathic form of PD. This is in full agreement with the conclusion drawn in our pilot study [37].
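The one-way ANOVA comparison of age of onset across genotype groups can be sketched as below. This is a minimal illustration with invented ages and a hypothetical function name; it returns only the F statistic and its degrees of freedom, since a p-value needs the F distribution (available, for example, through scipy.stats.f_oneway, which is not used here to keep the sketch dependency-free).

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ages of onset (years) per genotype group; NOT the study's data.
ages_by_genotype = {
    "GG": [58, 62, 66],
    "GA": [59, 63, 67],
    "AA": [60, 64, 68],
}
f_stat, df_b, df_w = one_way_anova_f(list(ages_by_genotype.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.4f}")
```

A small F, as in this toy data, means the spread between group means is minor relative to the spread within groups, which is the pattern a non-significant result reflects.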
We have also performed the same type of analysis for each of the tested SNPs in sub-populations of women and men that were derived from the large cohort (N = 508) and the sub-cohort (N = 96) of PD patients. As demonstrated in supplemental Figures S1-S3, no significant (p < 0.05) association between the age of onset and the presence of any particular genotype of the tested SNPs rs11240569, rs708727, rs823156, rs9438393, rs56152218, and rs61822602 was found in the gender-split sub-groups derived either from the large PD cohort (N_M = 306, N_F = 202) or from the PD sub-cohort selected for A1 promoter sequencing (N_M = 54, N_F = 42).

RandomForest Machine Learning (RF-ML)
All of the A1 SNPs were tested for their ability to discriminate between PD patients and controls with RF-ML. The RF-ML algorithm was trained using our data, and the discriminative importance of individual SNPs was evaluated by a technical construct known as graph depth [37,40]. As in our pilot study [37], the predictive ability of the tested SNPs was visualized by ROC (receiver operating characteristic) curves and quantified by the AUC (area under the ROC curve). The discriminative ability of predictors is given within the interval of AUC from 100% (maximal discriminative ability) down to 50% (minimal discriminative ability); AUC < 50% corresponds to no discriminative ability.
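To make the AUC scale concrete, the sketch below computes the area under the ROC curve directly from predicted scores, using its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. It is not the RF-ML implementation used in the study; the function name and scores are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as P(case score > control score) + 0.5 * P(tie)."""
    favourable = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                favourable += 1.0
            elif case == control:
                favourable += 0.5
    return favourable / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores (for example, predicted probability of PD).
print(roc_auc([0.9, 0.8], [0.1, 0.2]))  # cases always rank above controls
print(roc_auc([0.5, 0.5], [0.5, 0.5]))  # scores carry no information
print(roc_auc([0.8, 0.4], [0.6, 0.2]))  # partial separation
```

A predictor whose scores are identical for cases and controls lands at 0.5, the chance level; this is the region in which the tested SNP predictors are reported to fall.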
The RF algorithm was trained in the following modes: (1) with three or six particular A1 SNPs as predictors (each SNP with three genotypes, A_M A_M / A_M A_m / A_m A_m, where A_M stands for the major allele and A_m for the minor allele; three or six RF models, one for each SNP; Table 8), (2) with genotypic duplets of the paired SNPs as predictors (three RF models (three SNPs) or 15 RF models (six SNPs), one for each pair of SNPs; Table 8), (3) with genotypic triplets of the three SNPs as predictors (one RF model (three SNPs) or 20 RF models (six SNPs), one for each triplet of SNPs; Table 8), (4) with genotypic quadruplets of the six SNPs as predictors (15 RF models, one for each quadruplet; Table 8), (5) with genotypic quintuplets of the six SNPs as predictors (six RF models, one for each quintuplet; Table 8), and (6) with a genotypic sextuplet (one RF model; Table 8).
Table 8. A1 SNPs I (rs11240569), II (rs708727), III (rs823156), IV (rs9438393), V (rs56152218), and VI (rs61822602) used as isolated genotypic singletons (predictors, no color background) or as paired predictors in genotypic duplets (gray), in genotypic triplets (darker gray), in genotypic quadruplets (blue), in genotypic quintuplets (cyclamen), or in a genotypic sextuplet (turquoise). The left panel shows the AUC (area under the receiver operating characteristic curve) calculated for isolated singletons, duplets, and the triplet from the source data collected in the large cohort (508 PD cases and 472 controls); the right panel shows the AUC calculated for duplets, triplets, quadruplets, quintuplets, and the sextuplet from the source data collected from the sub-cohort of 96 PD cases and 100 controls. Abbreviations: (SNP) single nucleotide polymorphism.

Table 8 column layout: SNPs I-III (N = 980): Predictor, AUC (%); SNPs I-VI (N = 196): Predictor, AUC (%), Predictor, AUC (%).
Thus, when singletons, duplets, triplets, quadruplets, quintuplets, and the sextuplet of A1 SNPs were used as predictors, they carried no ability to discriminate between the PD patients and the controls (Table 8). Hence, according to RF-ML analysis, and in agreement with the pilot study [37], the A1 SNPs have no potential to serve as discriminators between controls and PD patients and, with regard to PD, carry no predictive or diagnostic value in the Slovak population.
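The numbers of RF models quoted for each training mode follow from simple combinatorics, which can be checked with the sketch below. It only enumerates the predictor sets (one model per set); it does not train any model, and the variable and function names are hypothetical.

```python
from itertools import combinations

# The six tested A1 SNPs (I-VI in Table 8).
SNPS = [
    "rs11240569", "rs708727", "rs823156",
    "rs9438393", "rs56152218", "rs61822602",
]


def predictor_sets(snps):
    """Map each subset size to all SNP combinations of that size."""
    return {
        size: list(combinations(snps, size))
        for size in range(1, len(snps) + 1)
    }


six_snp_sets = predictor_sets(SNPS)       # sub-cohort, SNPs I-VI
three_snp_sets = predictor_sets(SNPS[:3])  # large cohort, SNPs I-III

for size, sets in six_snp_sets.items():
    print(f"six SNPs, subsets of size {size}: {len(sets)} models")
for size, sets in three_snp_sets.items():
    print(f"three SNPs, subsets of size {size}: {len(sets)} models")
```

For six SNPs this gives 6, 15, 20, 15, 6, and 1 predictor sets for sizes one to six, and for three SNPs 3, 3, and 1, matching the model counts listed for the training modes.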

Discussion
The PARK16 locus has gained attention in the scientific community because of its association with PD and its discussed role in defining susceptibility to this complex ailment. In 2019, we reported a pilot study in which we analyzed the association of the three A1 SNPs, namely rs11240569, rs708727, and rs823156, with the idiopathic form of PD in the Slovak (Western Slavs) population. The reported association of rs11240569 and rs823156 with susceptibility to PD, mainly in Asian/Oriental populations, was not found in our study [37]. No association could be confirmed by means of frequentist statistics (conservative genetic analyses) or by RF-ML analysis [37].
The major limitation of our pilot study was, however, the relatively low number of patients/probands in the PD and control cohorts (150 and 120, respectively). We emphasized that, from a statistical point of view, the data had to be interpreted cautiously because of the small sample size in both of the cohorts and because of the low statistical power of the performed analyses [37]. Nevertheless, by utilizing the ML approach, which requires considerably smaller sample sizes than conventional frequentist statistics or approximate Bayesian computation, we were able to examine with confidence the ability of particular A1 SNPs to discriminate between PD patients and controls. In all instances, the ML approach revealed essentially zero diagnostic and predictive relevance of SNPs rs11240569, rs708727, and rs823156 in the Slovak population [37].
In this study, we have examined not only the three aforementioned SNPs, but also the three SNPs (rs9438393, rs56152218, and rs61822602) localized within the promoter region of A1. These SNPs have been identified by the sequencing of 96 PD samples followed by the RFLP analysis of the control samples and the RFLP verification of the sequenced PD samples. Our genetic analyses have revealed essentially no association of any of the three newly studied SNPs with PD. The same is true for rs11240569 and rs823156, which have been analyzed in the PD and control sub-cohorts (96/100) and also in the cohorts of 508 PD patients and 472 controls. These results are in complete agreement with the outcome of the pilot study [37]. However, in the large cohorts, rs708727 has been shown to be associated with an increased risk of PD in the Slovak population.
To the best of our knowledge, no study has as yet directly associated A1 SNP rs708727 with an altered risk for developing PD [19,37]. In our enlarged study (when compared with the pilot study [37]), the minor allele A in rs708727 is associated with an increased risk of PD onset when tested in dominant [40] (OR_D = 1.36 (1.05-1.77), p = 0.02) or completely over-dominant [41] (OR_COD = 1.34 (1.04-1.72), p = 0.02) models (Table 5). Thus, the presence of the minor allele (A) in rs708727 might be associated with an increased risk of developing PD. Whereas Sanchez-Mut et al. [27] and Wang et al. [42] have clearly shown that the presence of the minor allele A in rs708727 alters the methylation of the PM20D1 promoter (and, thus, its expression) in a dose-dependent (quantitative) manner, the best-fitting genetic models in our study indicate that the presence of one rs708727 A allele is sufficient to alter the susceptibility to the onset of PD. How PD susceptibility (phenotype) depends on whether one or two copies of the minor allele are present in rs708727 (genotype) remains to be elucidated in detail.
Sanchez-Mut et al. have identified PM20D1 (localized within the PARK16 locus and encoding peptidase M20 domain-containing protein 1, an enzyme with both hydrolase and peptidase activities; N-fatty acyl amino acid synthase/hydrolase) as being a methylation and expression quantitative trait locus (mQTL) coupled to an AD-risk-associated haplotype, which displays enhancer-like characteristics and contacts the PM20D1 promoter via a haplotype-dependent chromatin loop mediated by the transcription factor CTCF (CCCTC-binding factor) [27]. By comparing samples from healthy controls and patients with advanced-stage AD, they have found that PM20D1 consistently displays promoter hypermethylation in patients with AD [27]. A1 SNP rs708727 correlates with the levels of PM20D1 DNA methylation in the human frontal cortex and hippocampus [27], as does SNP rs960603 [27]. Wang and colleagues, in their work on peripheral blood, have acquired data confirming that PM20D1 is an mQTL mediated primarily by the AD-risk-associated A1 SNP rs708727 [42]. Furthermore, their longitudinal data demonstrate that hypomethylation occurs before the symptomatic onset of AD, conceivably to facilitate the increasing expression of PM20D1 in order to activate its protective function [42]. AD progression is hallmarked by an increasing level of methylation in the CpG islands in the DMR (differentially methylated region) of the PM20D1 promoter in AD patients, ultimately leading to the inhibition of gene transcription and expression [27,42,43]. PM20D1 has also been associated with diabetes [44], obesity [45], and multiple sclerosis [46], and since it is localized within the PARK16 locus, its possible involvement/association with PD is assumed [47].
Despite a lack of molecular mechanistic analyses, we speculate, in light of our current data, that AD and PD (and other less frequent neurodegenerative diseases) share not only "well-known" pathophysiological mechanisms (e.g., disturbed mitophagy, retromer, and proteasome functions), but also epigenetic mechanisms, such as A1 rs708727-dependent regulation of PM20D1 expression [27,42]. Our work indirectly adds to the need for the detailed elucidation of the role of endogenous N-acyl amino acids (NAAs) that are metabolized by PM20D1 in the pathoetiology of PD and other neurodegenerative disorders. NAAs and N-acyl conjugates of neurotransmitters (NAANs) are now known to play an important role in neuromodulation [48,49]. Evidence linking PM20D1 expression with N-acyl dopamine (NADA) has been provided by the recent work of Song et al. These authors have shown that the deletion of kir6.2 (a pore-forming subunit of the ATP-sensitive K+ channels) leads to a reduced count of mitochondria and lowered ATP production via an increase in the levels of PM20D1 and of agents uncoupling mitochondrial respiration, including NADA, in the murine midbrain [49].
NADA is a potent inhibitor of 5-lipoxygenase (5-LOX) and has a distribution limited to the brain with levels being the highest in the striatum and very low elsewhere [48,50,51]. 5-LOX catalyzes the synthesis of leukotriene or 5-HpETE (5-hydroperoxyeicosatetraenoic acid) from arachidonic acid and has been associated with neurodegeneration (AD and PD) via its involvement in neuroinflammation [51,52]. We therefore suggest that the decreased PM20D1 expression, the subsequently lower abundance of NADA, and the increased activity of 5-LOX significantly contribute to the pathology of PD (and other neurodegenerative diseases).
Sanchez-Mut and coworkers have demonstrated that the overexpression of PM20D1 in the murine AD hippocampus results in learning improvement, whereas its knock-down increases the amyloid plaque load [27]. Both the Lewy-type and Alzheimer-type pathologies are important in PD-related dementia [53]. A significant pool of PD patients suffers from worsening dementia during the course of the disease [54]. Taking these observations into consideration, we ask whether the rs708727-linked activity that silences PM20D1 contributes to the onset of PD dementia, and whether the monitoring of PM20D1 activity could serve as a prognostic parameter for the onset of PD dementia.
Previously published work has suggested the involvement of Mg2+ transporters in the pathoetiology of PD [18-22,55]. A1, being the key player in the Mg homeostasis of somatic cells, has been linked to PD directly [20-22]. Point mutations in A1 leading to substitutions p.A350V, p.R244H, and p.R285Q have been putatively associated with PD, and both the lack of function and the loss of function mutations in A1 are assumed to have detrimental consequences in neurons, thereby contributing to the PD phenotype [20-22]. This work indirectly points toward the possibility that not only perturbations of the A1 core function (Na+/Mg2+ exchange), but also the A1-linked (rs708727) epigenetic regulation of PM20D1 expression (and activity), contribute to the pathoetiology of PD.
In this study, we have identified the genotypic triplet GG (rs708727) + AG (rs823156) + CC (rs61822602) as being potentially clinically meaningful (h ≥ 0.5). Interestingly, rs708727 with genotype GG is part of the triplet. However, in light of previous research, the genotype containing the minor allele A in rs708727 would be expected to be linked to a potential PD risk associated with this triplet. Currently, we are not able to comment on any molecular/genetic/epigenetic interactions involving the SNPs in the triplet GG (rs708727) + AG (rs823156) + CC (rs61822602), and thus, on any putative contribution of this triplet to the overall risk of PD onset.
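The effect-size threshold used above can be illustrated with a minimal sketch, assuming that h denotes Cohen's h for the difference between two proportions (here, the frequency of a genotypic triplet in patients versus controls). The function names and the example frequencies are hypothetical and are not taken from our dataset.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Absolute Cohen's h for two proportions (arcsine-transformed difference)."""
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return abs(phi1 - phi2)


def effect_size_label(h: float) -> str:
    """Conventional Cohen thresholds: 0.2 small, 0.5 medium, 0.8 large."""
    if h >= 0.8:
        return "large"
    if h >= 0.5:
        return "medium"
    if h >= 0.2:
        return "small"
    return "negligible"


# Hypothetical triplet frequencies in cases and controls (illustration only).
h_example = cohens_h(0.10, 0.01)
print(f"h = {h_example:.3f} ({effect_size_label(h_example)})")
```

Under this convention, the reported value h = 0.522 falls just above the lower bound of the "medium" category.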
In our pilot study, we utilized RF-ML to evaluate and interpret our data [37]. The major advantage of RF-ML data analysis is twofold, as follows: (1) it permits the discriminative ability of SNPs between PD patients and controls to be quantified [37], and (2) it requires lower sample sizes for the evaluation of the discriminative importance of individual SNPs or their combinations [37]. Furthermore, RF-ML bypasses the p-value problem often associated with larger samples, even when they are available [56]. As in our previous report, none of the tested A1 SNPs have been shown to have the power to discriminate between PD patients and non-PD probands in our cohort (Table 8). Thus, we can assume that none of the tested A1 SNPs are suitable for serving as a PD/non-PD discriminator in the Slovak population.
Regarding the association of A1 rs708727 with altered risk for PD, the outcome of our RF-ML analysis seems to be, at first sight, contradictory (Tables 5 and 6 vs. Table 8). Jakobsdottir et al., in their logistic regression and ROC curve analyses, showed that even strong genetic associations do not automatically guarantee effective discrimination between cases and controls [57]. In spite of being poor classifiers, SNPs with a significant OR might be very valuable for establishing etiological hypotheses [57]. In our case, A1 SNP rs708727 carries no classification power regarding PD and is thus of no clinical importance. However, its weak but significant association with the altered risk for PD allowed us to speculate about the involvement of rs708727 in the pathoetiology of PD in a similar or the same way as it is involved in AD [27].
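The distinction between association and discrimination can be made concrete with a short sketch. It collapses genotype counts according to the dominant or over-dominant model, computes the odds ratio with a Wald (Woolf) 95% confidence interval, and derives the area under the ROC curve for the same binary marker. The genotype counts are hypothetical, and the sketch is not the statistical pipeline used in this study.

```python
import math


def collapse(counts, model):
    """Collapse (GG, AG, AA) counts into (exposed, unexposed) for a genetic model."""
    gg, ag, aa = counts
    if model == "dominant":        # carriers of A (AG + AA) vs. GG
        return ag + aa, gg
    if model == "overdominant":    # heterozygotes (AG) vs. both homozygotes
        return ag, gg + aa
    raise ValueError(f"unknown model: {model}")


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Wald CI; a/b = exposed/unexposed cases, c/d = exposed/unexposed controls."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def binary_marker_auc(a, b, c, d):
    """AUC of a single binary marker: 0.5 * (1 + sensitivity - false-positive rate)."""
    tpr = a / (a + b)
    fpr = c / (c + d)
    return 0.5 * (1 + tpr - fpr)


# Hypothetical (GG, AG, AA) counts, for illustration only.
cases = (40, 45, 15)
controls = (50, 38, 12)

a, b = collapse(cases, "dominant")
c, d = collapse(controls, "dominant")
or_dom, lo_dom, hi_dom = odds_ratio_ci(a, b, c, d)
auc_dom = binary_marker_auc(a, b, c, d)
print(f"OR = {or_dom:.2f} ({lo_dom:.2f}-{hi_dom:.2f}), AUC = {auc_dom:.2f}")
```

With these illustrative counts, carriers have 1.5-fold higher odds of being cases, yet the AUC stays close to 0.5, which is the pattern described by Jakobsdottir et al. [57].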

Study Participants (Basic Characteristics)
In total, 980 probands were included in the study (508 PD patients and 472 controls, fulfilling the inclusion criteria). The idiopathic form of PD was diagnosed by neurologists in five PD diagnostic centers (in Martin, Bratislava, Trencin, Zvolen, and Kosice) according to the PD diagnostic criteria of the MDS (Movement Disorder Society). All patients were treated with standard anti-PD therapy. The average age of the PD patients was 68.4 ± 9.6 years. The average age of the disease onset was 61.7 ± 10.7 years. The youngest case was diagnosed at the age of 34 and the oldest case at 89 years. The PD group consisted of 202 female (F) and 306 male (M) patients, and thus, the F:M ratio was 1:1.5.
The control cohort of probands consisted of outpatients and inpatients from the Clinic of Occupational Medicine and Toxicology (University Hospital Martin (UHM)) and the Neurology Clinic (UHM). Only those patients who had not been previously diagnosed with any neurodegenerative or neuropsychiatric disease, diabetes mellitus, or osteoporosis (all maladies putatively associated with altered A1 expression and deregulation of A1 function) were enrolled into the control study group. The average age of the control probands was 68.3 ± 11.6 years. The control group consisted of 208 female and 264 male individuals, and thus, the F:M ratio was 1:1.3 in the control group.
The sub-cohort of PD patients, in which the A1 promoter region was sequenced, consisted of 96 randomly selected subjects. The F:M ratio was 1:1.3 (42 female and 54 male). The average age of patients in the PD sub-cohort was 67.0 ± 9.5 years. The control sub-cohort consisted of 41 females and 59 males; thus, the F:M ratio was 1:1.4. The average age of probands in the control sub-cohort was 59.8 ± 5.0 years. This study was approved by the Ethical Committee at the Jessenius Faculty of Medicine, Comenius University (JFM CU). Approval was recorded under ID: EK 66/2019. All study participants signed informed consent forms.

Sample Processing
Blood samples were collected into EDTA-treated BD Vacutainer® tubes (Becton, Dickinson and Company, Franklin Lakes, NJ, USA). Genomic DNA was isolated from fresh (UHM) or frozen blood samples (other PD centers) by using the Wizard® Genomic DNA Purification Kit (Promega Corporation, Madison, WI, USA) according to the manufacturer's protocol. Isolated DNA samples were stored at −80 °C.

Genotyping
Genotyping was performed on 358 PD samples and 352 control samples. Results from these experiments were analyzed together with the results previously reported in Cibulka et al. [37]. SNPs rs11240569, rs708727, and rs823156 were analyzed by using TaqMan® genotyping probes C_34251_20/rs11240569, C_375742_10/rs823156, and C_9238453_10/rs708727 (all Thermo Fisher Scientific, Waltham, MA, USA) in the same manner as reported previously [37].

Sanger Sequencing
The promoter region was divided into four overlapping fragments/amplicons, as it was too long for a single sequencing run. Before being sequenced, the target regions were amplified. Primers were designed by using the online tool Primer3Plus [https://primer3plus.com/cgi-bin/dev/primer3plus.cgi (accessed on 3 May 2018)]. Each pair of primers was checked for the presence of multiple amplification products by using the UCSC online PCR tool [http://www.genome.ucsc.edu/cgi-bin/hgPcr (accessed on 3 May 2018)]. The primers and PCR programs used for amplification of the four fragments are summarized in supplemental Tables S1 and S2. Compositions of master mixes for each fragment are summarized in SA3. Fragments 3 and 4 have a high content of G and C nucleotides (67.4% and 71.9%, respectively). The reaction yield was increased by addition of the 10× G-C Rich Enhancer (Solis Biodyne, Tartu, Estonia). The PCR product was purified by using the NucleoSpin™ Gel and PCR Clean-up kit (Macherey-Nagel GmbH & Co. KG, Düren, Germany). In the next step, the purified PCR product was diluted to an appropriate concentration for pre-sequencing PCR (SA4, SA5). Only forward (fw) primers were used in the pre-sequencing PCR. The reaction mix also included BigDye Terminator v3.1 (Applied Biosystems, Waltham, MA, USA) and dideoxynucleotides. As a result, we obtained a mixture of products of various sizes terminated by fluorescence-marked dideoxynucleotides. The products were subsequently purified by using the SigmaSpin Sequencing Reaction Clean-Up kit (Sigma-Aldrich, St. Louis, MO, USA) according to the manufacturer's protocol. A volume of 3 µL of purified product was transferred into a 96-well plate together with 12 µL of high-grade de-ionized formamide (Applied Biosystems, Waltham, MA, USA). The mixture was denatured for 5 min at 95 °C in a thermocycler. Fragments were separated on the 8-microcapillary device ABI 3500 (Applied Biosystems, Waltham, MA, USA).
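The G and C content quoted for fragments 3 and 4 is a simple base-composition percentage. A minimal sketch of such a calculation follows; the function name and the example sequence are hypothetical and do not correspond to the actual amplicons.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


# Hypothetical GC-rich stretch (illustration only, not a study amplicon).
example = "GCGGCCGCTAGCGGGCCATG"
print(f"GC content: {gc_content(example):.1f}%")
```

Ambiguous characters such as N are ignored here, which is one possible convention; other tools may count them in the denominator.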
Sequences were exported and visualized by Chromas software (Technelysium Pty Ltd., South Brisbane, Australia). FASTA files were uploaded to BLAST (Basic Local Alignment Search Tool) and aligned to the reference human genome (version GRCh38.p12).
Prediction of transcription factor binding sites and their alterations was performed by the online tool ConSite [available at: http://consite.genereg.net/ (accessed on 3 September 2020)] [38]. Sequences with the major allele and the minor allele were uploaded, and spectra of the transcription factors (TF) were generated. The analysis was run without pre-setting minimum specificity. Changes in TF-binding profiles based on the presence of variants are summarized in Table 2.

RFLP (Restriction Fragment Length Polymorphism) Analysis
The reaction mix components and PCR conditions of the amplification PCR are summarized in SA6, SA7. The online in silico tool NEBCutter 2.0 [http://nc2.neb.com/NEBcutter2/ (accessed on 14 February 2020)] was used to design the restriction of the amplicons. The following restriction enzymes were chosen for RFLP analysis: Hpy166II (detection of rs9438393), NlaIII (detection of rs56152218), and BmrI (detection of rs61822602). All enzymes were purchased from New England Biolabs. For variant rs144056491, we were unable to design an RFLP experiment, as no suitable enzyme was available. The fragments expected after restriction are summarized in SA8. After restriction, the products were separated by agarose gel electrophoresis (NlaIII and Hpy166II 1% gel; BmrI 2% gel) and then visualized on a PharosFX instrument (Bio-Rad Laboratories). Genotypes for each SNV were determined. One-way ANOVA was used to test the null hypothesis of equality of the population mean age of onset among the three genotype sub-populations for each SNP. Findings with a p-value below 0.05 were considered statistically significant.
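The one-way ANOVA mentioned above compares the between-genotype variance of the age of onset with the within-genotype variance. The following sketch computes the F statistic and its degrees of freedom from first principles; the ages are hypothetical, and the p-value step is omitted because it requires the F distribution (a statistics package such as SciPy, via `scipy.stats.f_oneway`, could supply it; this is an assumption about tooling, not a description of the software used in this study).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ages of onset per genotype (illustration only).
onset_by_genotype = {
    "GG": [58, 62, 65, 70],
    "AG": [55, 60, 63, 66, 71],
    "AA": [57, 61, 68],
}
f_stat, df1, df2 = one_way_anova_f(list(onset_by_genotype.values()))
print(f"F({df1}, {df2}) = {f_stat:.3f}")
```

The null hypothesis of equal mean age of onset would be rejected at the 0.05 level when the p-value corresponding to this F statistic falls below 0.05.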

Conclusions
In summary, our data suggest a weak but significant association of A1 SNP rs708727 with PD in dominant and over-dominant genetic models in a Slovak population. None of the other tested A1 SNPs (rs11240569, rs823156, rs9438393, rs56152218, and rs61822602) was associated with the disease in any of the tested genetic models. RF-ML analyses identified all of the tested A1 SNPs as being poor classifiers/predictors of PD; thus, their usefulness in clinical practice as diagnostic or prognostic markers is negligible. However, the association of rs708727 with PD allowed us to speculate that the PD-risk-associated minor allele (G > A) in rs708727 contributes to the disease onset and progression via the derangement of the epigenetic regulation of PM20D1 expression, a mechanism known to play a role in the pathoetiology of AD. This hypothesis should be further examined in order to make a conclusive statement. Furthermore, a possible association between PD-associated dementia and rs708727 should be elucidated.

Data Availability Statement: The complete dataset is contained within the article. The nature and extent of the included data allow for further meta-analysis. Any information regarding the study is available on request from the corresponding author.