Genetic Variation and Autism: A Field Synopsis and Systematic Meta-Analysis

This study aimed to verify noteworthy findings between genetic risk factors and autism spectrum disorder (ASD) by employing the false positive report probability (FPRP) and the Bayesian false-discovery probability (BFDP). PubMed and the Genome-Wide Association Studies (GWAS) catalog were searched from inception to 1 August 2019. We included meta-analyses on genetic factors of ASD of any study design. Overall, twenty-seven meta-analysis articles from the literature search and four manually added articles from the GWAS catalog were re-analyzed. Five of 31 comparisons from meta-analyses of observational studies, 40 of 203 comparisons from GWAS meta-analyses, and 18 of 20 comparisons from the GWAS catalog had noteworthy estimations under both Bayesian approaches. We thus found noteworthy genetic comparisons highly related to an increased risk of ASD. Multiple genetic comparisons were shown to be associated with ASD risk; however, genuine associations should be carefully verified and understood.

the search terms (autism AND meta OR meta-analysis) and obtained relevant articles, first, by scanning the titles and abstracts and, second, by reviewing the full text (Figure 1). During the selection process, all genetic, gen*, and related terms were included in the relevant articles. Any disagreements were resolved by discussion and consensus. For GWAS, the GWAS catalog was searched in addition to PubMed for a more precise search.

Data Extraction
From each article, we extracted the first author, year of publication, the number of individual studies included, the number of cases and controls, the number of families if a meta-analysis included family-based studies, the type of statistical model (fixed or random), and the study design. We also recorded the gene name, gene variants, genotypic comparison, OR with 95% CI, and the corresponding p-value. We retrieved all the main data (preferably adjusted) and, for comprehensiveness, additionally extracted subgroup analysis data if the main data were not statistically significant. When data were incomplete, we contacted the corresponding authors for additional information.
Brain Sci. 2020, 10, 692
A reported association was considered statistically significant if the p-value was < 0.05 for meta-analyses of observational studies and < 5 × 10⁻⁸ for GWAS or meta-analyses of GWAS. Genetic associations with 5 × 10⁻⁸ < p-value < 0.05 were defined as being of borderline significance in GWAS or meta-analyses of GWAS. In addition, we recorded genetic comparisons with p-value < 5 × 10⁻⁸ for our gene network, even when they were not re-analyzable due to insufficient raw data.
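The significance rules above can be written as a small classifier. This is only an illustrative sketch: the function name, the study-type labels, and the handling of a p-value exactly equal to 5 × 10⁻⁸ (treated here as borderline) are assumptions, not part of the original analysis.

```python
GENOME_WIDE = 5e-8   # genome-wide threshold for GWAS and meta-analyses of GWAS
NOMINAL = 0.05       # conventional threshold for observational meta-analyses


def classify_significance(p_value: float, study_type: str) -> str:
    """Label a reported association by the thresholds stated in the text.

    study_type is a hypothetical label: "observational" or "gwas".
    """
    if study_type == "observational":
        return "significant" if p_value < NOMINAL else "not significant"
    if study_type == "gwas":
        if p_value < GENOME_WIDE:
            return "significant"
        if p_value < NOMINAL:
            # Assumption: p exactly equal to 5e-8 falls here as well.
            return "borderline"
        return "not significant"
    raise ValueError(f"unknown study type: {study_type!r}")


print(classify_significance(3e-9, "gwas"))           # significant
print(classify_significance(1e-4, "gwas"))           # borderline
print(classify_significance(0.03, "observational"))  # significant
```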

Statistical Analysis
Evaluations of the statistical significance of studies on genetic polymorphisms too often yield false positives when they are based solely on the p-value [15]. Therefore, to identify "noteworthy" associations between re-analyzable genetic variants and ASD, we employed two Bayesian approaches: FPRP and BFDP [15]. We used the Excel spreadsheets created by Wacholder et al. [15] and Wakefield [14] to calculate FPRP and BFDP, respectively. We computed FPRP at two prior probability levels, 10⁻³ and 10⁻⁶, and used the statistical power to detect two OR levels, 1.2 and 1.5, so that readers can make their own judgment about the evidence for each genetic variant. The two prior probability levels were chosen to represent a low and a very low prior probability, respectively. BFDP is similar to FPRP but uses more information [14]; we computed it at the same two prior probability levels, 10⁻³ and 10⁻⁶. We set the noteworthiness thresholds of FPRP and BFDP to < 0.2 and < 0.8, respectively, as recommended by the original papers [14,15], and highlighted the corresponding results in bold type. Gene variants were determined to have a noteworthy association with ASD if they satisfied both thresholds.
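For readers who wish to reproduce this kind of calculation without the spreadsheets, the following is a minimal sketch of both quantities computed from an OR and its 95% CI. It is not the authors' spreadsheet code. It assumes that the standard error of ln(OR) is recovered from the CI, that power is evaluated at a significance level equal to the observed two-sided p-value, and that the BFDP prior variance is set so that 95% of plausible ORs fall below 1.5; this last choice (`or_prior_upper`) and all function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)  # about 1.96


def phi(x: float) -> float:
    """Standard normal CDF, written with erfc to stay accurate in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def se_log_or(ci_low: float, ci_high: float) -> float:
    """Standard error of ln(OR), recovered from a 95% confidence interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z975)


def fprp(or_est, ci_low, ci_high, prior, or_alt):
    """False positive report probability for a given prior and alternative OR."""
    se = se_log_or(ci_low, ci_high)
    z = abs(math.log(or_est)) / se
    p_value = math.erfc(z / math.sqrt(2))     # two-sided p-value
    shift = math.log(or_alt) / se
    power = phi(shift - z) + phi(-shift - z)  # power at alpha = p_value
    return p_value * (1 - prior) / (p_value * (1 - prior) + power * prior)


def bfdp(or_est, ci_low, ci_high, prior, or_prior_upper=1.5):
    """Bayesian false-discovery probability via an approximate Bayes factor."""
    v = se_log_or(ci_low, ci_high) ** 2            # variance of ln(OR) estimate
    w = (math.log(or_prior_upper) / Z975) ** 2     # assumed prior variance
    z2 = math.log(or_est) ** 2 / v
    abf = math.sqrt((v + w) / v) * math.exp(-z2 * w / (2 * (v + w)))
    prior_odds_null = (1 - prior) / prior
    return abf * prior_odds_null / (1 + abf * prior_odds_null)


def noteworthy(or_est, ci_low, ci_high, prior, or_alt):
    """Apply both thresholds from the text: FPRP < 0.2 and BFDP < 0.8."""
    return (fprp(or_est, ci_low, ci_high, prior, or_alt) < 0.2
            and bfdp(or_est, ci_low, ci_high, prior) < 0.8)


# Hypothetical input values, not taken from any table in this article.
for prior in (1e-3, 1e-6):
    print(prior, fprp(1.30, 1.10, 1.54, prior, or_alt=1.5),
          bfdp(1.30, 1.10, 1.54, prior))
```

Evaluating each comparison on the same grid of priors (10⁻³, 10⁻⁶) and alternative ORs (1.2, 1.5) shows how sensitive the noteworthiness call is to those choices.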

Construction of Protein-Protein Interaction (PPI) Network
We collected genetic comparisons either with noteworthy results under both FPRP and BFDP or with p-value < 5 × 10⁻⁸ to establish a protein-protein interaction (PPI) network of ASD-related genes using STRING 9.1 [16]. Specifically, genetic comparisons of genome-wide significance (p-value < 5 × 10⁻⁸) or borderline significance (p-value < 0.05) that showed a noteworthy association under both Bayesian approaches were included. Results with a p-value < 5 × 10⁻⁸ that were not re-analyzable were also added to the network analysis. PPI networks support a critical assessment of protein function in ASD, covering direct (physical) as well as indirect (functional) associations.
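The inclusion rule for the network can be sketched as a simple filter. The record layout, field names, and example genes below are hypothetical placeholders, and the rule is read here as "noteworthy under both approaches, or p-value < 5 × 10⁻⁸"; submitting the resulting gene list to STRING is not shown.

```python
from dataclasses import dataclass
from typing import Optional

GENOME_WIDE = 5e-8


@dataclass
class Comparison:
    gene: str
    p_value: float
    # True/False after re-analysis; None when the comparison was not re-analyzable.
    noteworthy_both: Optional[bool]


def include_in_network(c: Comparison) -> bool:
    """Keep comparisons noteworthy under both approaches or genome-wide significant."""
    return c.noteworthy_both is True or c.p_value < GENOME_WIDE


# Hypothetical records for illustration only.
records = [
    Comparison("GENE_A", 1e-4, True),    # borderline and noteworthy: kept
    Comparison("GENE_B", 2e-9, None),    # not re-analyzable, genome-wide: kept
    Comparison("GENE_C", 0.01, False),   # borderline, not noteworthy: dropped
]
network_genes = sorted({c.gene for c in records if include_in_network(c)})
print(network_genes)  # ['GENE_A', 'GENE_B']
```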

Study Characteristics
The initial PubMed literature search yielded 747 articles. Of these, 656 articles were excluded after screening the title and abstract, and 64 articles were omitted after reviewing the full text. Twenty-seven studies were finally included for the re-analysis of observational studies, GWAS, and meta-analyses of GWAS (Figure 1).
Additionally, 25 articles were retrieved from the GWAS catalog, of which 14 articles that did not meet the criteria were excluded. Among the remaining 11 articles, five articles were not re-analyzable due to insufficient raw data. Moreover, five articles were already included in our dataset from the PubMed search. However, we retained three of the non-re-analyzable articles [17-19] since they satisfied the cut-off value of statistical significance for our PPI network (p-value < 5 × 10⁻⁸). Of the remaining six articles, two were already in our dataset from the PubMed literature search. Finally, four articles from the GWAS catalog were manually added to the 27 articles previously screened from PubMed, leading to a total of 31 eligible articles being included in the systematic review (Figure 1).
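The selection counts reported above can be checked with a few lines of arithmetic. The variable names are illustrative, and only the figures stated in the text are used.

```python
# PubMed branch
pubmed_hits = 747
excluded_title_abstract = 656
excluded_full_text = 64
pubmed_included = pubmed_hits - excluded_title_abstract - excluded_full_text

# GWAS catalog branch
catalog_hits = 25
catalog_excluded = 14
catalog_remaining = catalog_hits - catalog_excluded          # 11
not_reanalyzable = 5
catalog_reanalyzable = catalog_remaining - not_reanalyzable  # 6
already_in_pubmed_set = 2
catalog_added = catalog_reanalyzable - already_in_pubmed_set # 4

total_eligible = pubmed_included + catalog_added
print(pubmed_included, catalog_added, total_eligible)  # 27 4 31
```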

Re-Analysis of Meta-Analyses
This paper is divided into two parts: (1) the observational studies part and (2) the GWAS part. In the observational studies part, all statistics were collected with overlap between studies taken into account, and results for gene variants with and without statistical significance are reported (Table 1, Supplementary Table S2). Even when a genetic variant was examined in several studies, we excluded the studies whose data were not noteworthy under FPRP or BFDP. In the GWAS part, data from previously published meta-analyses and newly added data from the GWAS catalog were re-analyzed.

Re-Analysis of Meta-Analyses of Observational Studies
Among the 31 eligible studies, 19 were meta-analyses of observational studies, which corresponded to 125 genetic comparisons. Thirty-one of the 125 genotype comparisons were reported as statistically significant under the criterion of p-value < 0.05, as listed in Table 1.
We examined the 203 genetic comparisons with genome-wide or borderline significance using both FPRP and BFDP estimation. With FPRP estimation, forty-one (20.2%) and four (2.0%) comparisons were assessed to be noteworthy at prior probabilities of 10⁻³ and 10⁻⁶, respectively, with statistical power to detect an OR of 1.2. Moreover, fifty-four (26.6%) and eight (3.9%) were identified as noteworthy at prior probabilities of 10⁻³ and 10⁻⁶, respectively, with statistical power to detect an OR of 1.5. Overall, forty genetic comparisons (19.7%) were found noteworthy under both Bayesian approaches, which included a single genetic comparison satisfying the conventional significance threshold of p-value < 0.05 (Table 2).
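As a quick consistency check, the proportions quoted for the 203 comparisons can be recomputed from the counts. The helper name and the dictionary labels are illustrative only.

```python
def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * count / total, 1)


TOTAL = 203
counts = {
    "FPRP, prior 1e-3, OR 1.2": 41,
    "FPRP, prior 1e-6, OR 1.2": 4,
    "FPRP, prior 1e-3, OR 1.5": 54,
    "FPRP, prior 1e-6, OR 1.5": 8,
    "noteworthy under both approaches": 40,
}
for label, n in counts.items():
    print(f"{label}: {n}/{TOTAL} = {pct(n, TOTAL)}%")
```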

Re-Analysis of Results from the GWAS Catalog and GWAS Datasets Included in the GWAS Meta-Analyses
Genetic comparisons additionally extracted from the GWAS catalog were also re-analyzed (Table 3). Among the 20 included comparisons, two (10.0%) genotype comparisons, MACROD2/rs4141463 and LOCI105370358-LOCI107984602/rs4773054, were reported to be significant with a p-value < 5 × 10⁻⁸. The remaining 18 comparisons were of borderline statistical significance (p-value between 0.05 and 5 × 10⁻⁸).
In the assessment of noteworthiness, five (25.0%) and three (15.0%) comparisons were verified as noteworthy using FPRP estimation at prior probabilities of 10⁻³ and 10⁻⁶, respectively, with the statistical power to detect an OR of 1.2. In addition, eighteen (90.0%) and four (25.0%) showed noteworthiness at prior probabilities of 10⁻³ and 10⁻⁶, respectively, with the statistical power to detect an OR of 1.5. In the BFDP estimation, nineteen (95.0%) and two (10.0%) were assessed as noteworthy at prior probabilities of 10⁻³ and 10⁻⁶, respectively. Finally, 18 genetic associations (90.0%), covering both significant and borderline significant results, were verified as noteworthy under both the FPRP and BFDP approaches. These 18 associations comprised two comparisons with genome-wide significance (p-value < 5 × 10⁻⁸) and sixteen comparisons with borderline significance (p-value between 0.05 and 5 × 10⁻⁸).
To develop the analysis further, we extracted the GWAS data that were both statistically significant and noteworthy under both Bayesian approaches from the GWAS meta-analyses and the GWAS catalog. They were extracted from five articles [30-34], with 70 of the GWAS data being noteworthy under both FPRP and BFDP. Results with noteworthy associations are summarized in Table 4.
Abbreviations: ASD, autism spectrum disorder; A, adenine; C, cytosine; G, guanine; T, thymine; D, deletion; I, insertion; FPRP, false positive report probability; BFDP, Bayesian false-discovery probability; OR, odds ratio; CI, confidence interval; GWAS, genome-wide association studies; NA, not available.

Protein-Protein Interaction (PPI) Network
We established PPI networks related to the risk of ASD by filtering genes noteworthy under both FPRP and BFDP or genes with a p-value < 5 × 10⁻⁸. We included the results of both re-analyzed and non-re-analyzable genetic comparisons from meta-analyses of observational studies and GWAS, GWAS included in meta-analyses of GWAS, and the GWAS catalog. The statistically significant results of non-re-analyzable studies are presented in Supplementary Table S3.

The major genes with strong genetic connections were the myc-associated factor X (MAX) network transcriptional repressor (MNT), oxytocin receptor (OXTR), nucleolar and coiled-body phosphoprotein (NOLC1), peroxisome proliferator-activated receptor gamma related coactivator-related 1 (PPRC1), pyruvate carboxylase (PC), methylenetetrahydrofolate reductase (MTHFR), multiple epidermal growth factor like domains 10 (MEGF10), nuclear factor kappa B subunit 2 (NFKB2), and histone deacetylase 4 (HDAC4), among others (Figure 2 and Table 5).

Tyr protein kinase family and the epidermal growth factor receptor subfamily; binds to and is activated by neuregulins, and induces mitogenesis and differentiation
OR2M4: Member of a large family of GPCR; olfactory receptors initiating a neuronal response that triggers the perception of a smell
BCAS1: Oncogene; highly expressed in three amplified breast cancer cell lines and in one breast tumor without amplification at 20q13.2
CYP24A1: Cytochrome P450 superfamily of enzymes; drug metabolism and synthesis of cholesterol, steroids, and other lipids
TMEM132B: Function remains poorly understood, although mutations are associated with non-syndromic hearing loss, panic disorder, and cancer
KRR1: Nucleolar protein; 18S rRNA synthesis and 40S ribosomal assembly
HAT1: Type B histone acetyltransferase; rapid acetylation of newly synthesized cytoplasmic histones; replication-dependent chromatin assembly
SGSM2: GTPase activator; regulator of membrane trafficking
EXT1: Endoplasmic reticulum-resident type II transmembrane glycosyltransferase; involved in the chain elongation step of heparan sulfate biosynthesis
OR2T33: Member of a large family of GPCR; shares a 7-transmembrane domain structure with many neurotransmitter and hormone receptors
TAF1C: Binds to the core promoter of ribosomal RNA genes to position the polymerase properly; acts as a channel for regulatory signals
HDAC4: Class II of the histone deacetylase/acuc/apha family; represses transcription when tethered to a promoter
MEGF10: Member of the multiple epidermal growth factor-like domains protein family; cell adhesion, motility, and proliferation; critical mediator of apoptotic cell phagocytosis; amyloid-beta peptide uptake in brain
NFKB2: Subunit of the transcription factor complex nuclear factor-kappa-B; central activator of genes involved in inflammation and immune function
BNC2: Conserved zinc finger protein; skin color saturation
NMB: Member of the bombesin-like family of neuropeptides; negatively regulates eating behavior; regulates colonic smooth muscle contraction
HPS6: Organelle biogenesis associated with melanosomes, platelet dense granules, and lysosomes
ELOVL3: GNS1/SUR4 family; elongation of long chain fatty acids to provide precursors for synthesis of sphingolipids and ceramides
PITX3: Member of the RIEG/PITX homeobox family; transcription factor; lens formation during eye development
NAALADL2: Not well known, but diseases associated with NAALADL2 include Chromosome 6Pter-P24 Deletion Syndrome and Cornelia De Lange Syndrome
MACROD2: Deacetylase removing ADP-ribose from mono-ADP-ribosylated proteins; translocates from the nucleus to the cytoplasm upon DNA damage

Discussion
To our knowledge, this is the first study of ASD genetic risk factors to assess the level of evidence of published meta-analyses on the association between susceptibility loci and ASD. Overall, genetic comparisons with noteworthy results were confirmed as risk factors for ASD. The genetic comparisons highly related to an increased risk of ASD might reflect involvement in neurodevelopment and in specific synaptogenesis in ASD.
According to the PPI network, composed of results noteworthy under both Bayesian approaches, multiple genes were identified as risk factors for ASD. Among the listed genes, promising candidates encode proteins associated with neural development and specification, and also with neurotransmitters and their receptors. These genes were RELN and DRD3 from observational studies, and PC, OPCML, ERBB4, OR2M4, MEGF10, OR2T33, NMB, and NOLC1 from GWAS. In line with our findings, previous reports have indicated that the migration and proliferation of neuronal cells are essential to understanding neurodevelopmental disorders such as ASD or schizophrenia [49,50]. In addition, apart from anatomical approaches, genes correlated with neuropeptides and receptors, such as those in the brain or hippocampus, also explain the pathophysiology of the disease at a molecular level [51]. The list of genes included is presented in Table 5.
The present comprehensive re-analysis shows that, although a large number of studies have suggested numerous possible genetic risk factors for ASD, truly significant results make up only a small part of the whole. For instance, we detected false positive results in 26 of 31 (83.9%) comparisons from meta-analyses of observational studies and in 163 of 203 (80.3%) comparisons from meta-analyses of GWAS. Thus, only a small portion of genetic comparisons with a p-value < 0.05 exhibited noteworthy associations with ASD under both Bayesian approaches (Tables 1-4).
Moreover, we also found that genetic comparisons with borderline statistical significance (5 × 10⁻⁸ < p-value < 0.05) accounted for 53 of 126 (42%) noteworthy comparisons from GWAS or meta-analyses of GWAS. These genetic comparisons might have been neglected if the p-value alone had been used to determine noteworthiness. Using the two Bayesian approaches as we did, or relaxing the current GWAS threshold as Panagiotou et al. suggest, might enable better interpretation of GWAS results [48].
Based on the observational studies, five (16.1%) of the 31 statistically significant genotype comparisons were found noteworthy under both FPRP and BFDP: T vs. C, MTHFR C677T; T (minor), MTHFR C677T; G vs. A, DRD3/rs167771; C vs. G, RELN/rs362691; A (minor), OXTR/rs7632287. From the meta-analyses of GWAS, we could confirm that 34 distinct genes are noteworthy under both Bayesian approaches, with about 30 genetic connections. However, the fact that all three comparisons with a p-value < 5 × 10⁻⁸, namely rs1879532 (Table S3), rs4773054 (Table 2), and rs4141463 (Table 2), displayed noteworthiness may indicate that the stringent threshold of p < 5 × 10⁻⁸ is a good tool for verifying the true noteworthiness of genetic risk factors.
There are several limitations to our review. First, we did not include studies that have not been meta-analyzed or meta-analyses that had insufficient data. Second, for each genetic variant we included only the single meta-analysis finding with the lowest p-value. Therefore, we could not consider potentially meaningful subgroup analyses by ethnicity, location, gender, or type of genotype comparison (i.e., random or fixed) when selecting a certain outcome. We focused on whether the individual genotype variant was truly associated with ASD or not, regardless of the specific type of genotype comparison or ethnicity.
Our study has several strengths and implications. For example, to our knowledge, this is the first study that simultaneously analyzed a sizeable amount of data about genetic factors including not only GWAS but also the GWAS catalog. Despite the known high heritability of ASD and abundant research in ASD that has focused on the underlying genetic causes, the literature on genetic risk factors for ASD has not fully reached a consensus. This comprehensive review of genetic associations linked to ASD may improve understanding of the strengths and limitations of each form of research, and advance better and novel approaches for examining ASD in the field of genetic research. The findings of this study could provide mechanisms that may be explored for the development of novel neurotherapeutic agents both for the prevention and treatment of ASD.

Conclusions
In conclusion, we synthesized published meta-analyses on risk factors of ASD to identify noteworthy findings and false positive results by adopting two Bayesian approaches for genetic factors. We attempted to synthesize all meta-analyses on genetic polymorphisms linked to ASD and found noteworthy genetic factors highly related to an increased risk of ASD. We also investigated their validity by detecting false positive results under Bayesian methods. For verifying results obtained from genetic analyses, both approaches may have advantages, especially for the interpretation of results obtained from observational studies. We found noteworthy results from GWAS among genetic variants within the borderline significance range (p-value between 0.05 and 5 × 10⁻⁸), which accounted for almost half of the noteworthy genetic variants. This finding suggests that genetic variants with borderline significance need to be analyzed further to determine which associations are genuine.