A Systematic Review of Candidate Genes for Major Depression

Background and Objectives: The aim of this systematic review was to analyse which candidate genes were examined in genetic association studies and their association with major depressive disorder (MDD). Materials and Methods: We searched PubMed for relevant studies published between 1 July 2012 and 31 March 2019, using combinations of keywords: “major depressive disorder” OR “major depression” AND “gene candidate”, “major depressive disorder” OR “major depression” AND “polymorphism”. Synthesis focused on assessing the likelihood of bias and investigating factors that may explain differences between the results of studies. For the gene list selected after the literature overview, functional enrichment analysis and gene ontology term enrichment analysis were conducted. Results: 141 studies were included in the qualitative review of gene association studies focusing on MDD. 86 studies declared significant results (p < 0.05) for 172 SNPs in 85 genes. Thirteen SNP associations were confirmed by at least two studies. Eighteen genetic polymorphism associations were confirmed in both the previous and this systematic analysis by at least one study. The majority of the studies (68.79%) did not use or describe power analysis, which may have had an impact on the significance of their results. More than a third of the studies (N = 54) were conducted in the Han Chinese population. Conclusion: Unfortunately, there is still insufficient data on the links between genes and depression. Despite the reported genetic associations, most studies were lacking in statistical power analysis, research samples were small, and most gene polymorphisms have been confirmed in only one study. Further genetic research with larger research samples is needed to discern whether the relationship is random or causal. Summations: This systematic review has summarized all reported genetic associations and has highlighted the genetic associations that have been replicated.
Limitations: Unfortunately, most gene polymorphisms have been confirmed only once, so further studies are warranted to replicate these genetic associations. In addition, most studies included a small number of MDD cases, which could be indicative of false-positive findings. Considering that polymorphism loci and their associations with MDD are also highly dependent on interindividual variation, extensive studies of gene interaction pathways could provide more answers to the complexity of MDD.


Introduction
Major depressive disorder (MDD) is a common psychiatric illness accompanied by high levels of morbidity and mortality. MDD causes major psychological, physical, and social impairments [1]. It can cause the affected person to suffer greatly and function poorly at work, at school and in the family. According to the World Health Organization (WHO), at a global level, more than 264 million people are estimated to suffer from depression [2]. Depression is ranked by WHO as the single largest contributor to global disability (7.5% of all years lived with disability (YLDs) in 2015). Along with population growth and aging, the large number of depression cases has overloaded healthcare systems, thereby generating the need for resource optimization [3]. There is a clear need to identify prognostic indicators that could be used to select individuals at higher risk of developing MDD in order to aid the management of patients in clinical practice. Identifying risk variants using genetic analysis, and thereby increasing our understanding of how MDD arises, could lead to improved prevention and the development of new and more effective therapies [4].
Even though information concerning the epidemiology and symptoms of depression is well documented, the current understanding of the aetiology and pathophysiology of MDD is still rudimentary [5]. It is known that depression results from a complex interaction of social, psychological, and biological factors [6]. Genetic factors substantially contribute to MDD, as indicated by family, twin, and adoption studies [7]. For instance, a meta-analysis of twin research data shows that the heritability rate for depression is 37%, and data from family studies show a two- to threefold increase in the risk of depression in first-degree offspring of patients with depression [8]. MDD is a genetically complex disease. Only a small number of genes have been proven to be associated with MDD development risk [9].
The available literature regarding the genetics of MDD is vast and complex. Researchers have taken on the task of determining the genetic architecture of MDD using different molecular approaches, including linkage and genome-wide association (GWA) studies. A recent GWA meta-analysis of 135,458 MDD cases and 344,901 controls identified 44 independent and significant risk loci for MDD [10]. A recent study [11], which identified MDD candidate genes and performed association studies at the polymorphism and gene level in multiple large samples, found no support for candidate gene or candidate gene-by-interaction hypotheses for major depression. A comprehensive systematic review of linkage studies was done in 2012, in which the authors stated that the results of the explored studies lacked significant findings in any candidate gene meta-analysis [12].
Our aim was to analyse which candidate genes have been studied in the genetic association studies and identify their association with major depressive disorder (MDD). In order to update our knowledge of recent findings of linkage studies, this systematic review will explore available data regarding candidate genes for MDD from July 2012 until March 2019.

Literature Selection
The protocol for this systematic review has been registered in the international prospective register of systematic reviews (PROSPERO protocol ID: CRD42019129194, available at https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=129194 (accessed on 9 December 2022)). We conducted a systematic literature search to update genetic case-control linkage studies on MDD that used a gene-candidate approach and were published between 1 July 2012 and 31 March 2019; the end search date was 23 March 2019 [13] (Epub publication), using MEDLINE via PubMed. The search strategy was first developed in MEDLINE (OVID) and then adapted for PubMed. Two investigators (AN and RG) independently conducted a literature search in PubMed for relevant articles between September 2019 and February 2020. The search strategy included articles published from July 2012 until March 2019 using the following combinations of relevant keywords: "major depressive disorder" OR "major depression" AND "gene candidate", "major depressive disorder" OR "major depression" AND "polymorphism". Titles, abstracts, and full texts were screened sequentially against the eligibility criteria, and any discrepancies were resolved by consensus or by a third reviewer (GB). Data were cross-checked to ensure consistency. Inclusion criteria were as follows: (i) the patients had a primary diagnosis of major depressive disorder; (ii) the study examined the association between a candidate gene (a SNP) and MDD; (iii) the study was a case-control study; and (iv) the study was published in the English language.
Exclusion criteria: (i) non-human studies; (ii) non-genetic studies; (iii) study designs other than case-control; (iv) use of a genetic approach other than candidate gene association analysis; (v) enrolled patients did not have a primary diagnosis of major depressive disorder or had psychiatric comorbidities; (vi) studies focusing on treatment of MDD; (vii) articles reporting insufficient data for analysis (only abstract available). We excluded the abstracts that did not mention investigating the genetic association(s) between MDD and one or more genetic polymorphisms.

Data Synthesis
A descriptive and tabular synthesis was carried out using the extracted data and major findings of each included study. A team of three members extracted the data. The data included: (1) author information, (2) year of publication, (3) information on the setting for each study (the genotyping method employed, statistical model, and detected statistical power), (4) characteristics of study participants (phenotypic definitions and geographic characteristics), (5) characteristics of the candidate gene (type, locus, and evidence of functional role), (6) outcome measures (raw p-values and odds ratios (ORs) for genotypes and/or allele frequencies, the corrected results if the study had applied corrections for multiple testing, and the Hardy-Weinberg equilibrium (HWE) test). A "Tool to Assess Risk of Bias in Case Control Studies" [14], which is recommended by the CLARITY group, was used to assess the quality of all eligible studies. Based on the tool, each study was assessed in five dimensions (assessment of exposure, outcome development, case and control subject selection, group comparability) and assigned to one of three categories: low, higher, or high risk of bias. Examples of low risk of bias: comprehensive matching or adjustment for all plausible prognostic variables; data from a database with documentation of accuracy of abstraction of prognostic data. Examples of higher risk of bias: matching or adjustment for most plausible prognostic variables; database with uncertain quality of abstraction of prognostic information. Examples of high risk of bias: prognostic information from a database with no available documentation of quality of abstraction of prognostic variables; matching or adjustment for a minority of plausible prognostic variables, or no matching or adjustment at all.
Statements of no differences between groups or that differences were not statistically significant are not sufficient for establishing comparability.
If sufficient information could not be extracted and bias could not be assessed, those studies were considered as having an unclear risk of bias. Synthesis focused on describing the direction and consistency of effect, assessing the likelihood of bias, and investigating factors that may explain differences between the results of studies.
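The odds ratios extracted as outcome measures are conventionally derived from a 2 × 2 table of allele (or genotype) counts in cases and controls. The following is a minimal sketch of that calculation with a Wald 95% confidence interval; the function name and the example counts are illustrative assumptions and are not taken from any of the included studies.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: cases carrying the risk allele      b: cases not carrying it
    c: controls carrying the risk allele   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only
or_value, ci_low, ci_high = odds_ratio_ci(120, 80, 100, 100)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound sits close to 1, as in this hypothetical example, illustrates why nominally significant single-study associations are fragile.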

Gene Functional Enrichment Analysis
For the gene list selected after the literature overview, a functional enrichment analysis was performed using the web-based Gene Functional Classification tool (DAVID Bioinformatics Resources 6.8). Gene Ontology (GO) term enrichment analysis was conducted using medium classification stringency, and the significance value was adjusted by false discovery rate (FDR) analysis using the Benjamini-Hochberg (BH) procedure. GO terms were assigned to one of three categories, Biological process (BP), Molecular function (MF), and Cellular component (CC) terms, and were only included in further analysis if statistically significant (p < 0.05).
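For readers unfamiliar with the Benjamini-Hochberg adjustment, the sketch below shows the standard step-up computation of adjusted p-values. It illustrates the general procedure only; it is not DAVID's internal implementation, and the input p-values are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four GO terms
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

In this invented example only the first term remains below 0.05 after adjustment, although three of the four raw p-values were nominally significant.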

Literature Search
The first search strategy on PubMed, using the keywords "major depression" OR "major depressive disorder" AND "polymorphism", identified a total of 1232 studies, and the second search strategy, using the keywords "major depression" OR "major depressive disorder" AND "gene candidate", identified 305 studies, of which 141 were duplicates. After the removal of duplicates and 1255 hits not fulfilling the inclusion criteria, 141 studies were finally included in the qualitative review of gene association studies focusing on MDD. The detailed flowchart of the literature review process is reported in Figure 1.
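The selection counts above can be checked with simple arithmetic. The short sketch below assumes that duplicates and ineligible hits were both subtracted from the pooled results of the two searches; the variable names are ours.

```python
# Counts as reported in the text
hits_polymorphism = 1232   # first search strategy
hits_gene_candidate = 305  # second search strategy
duplicates = 141
excluded = 1255            # hits not fulfilling the inclusion criteria

total_hits = hits_polymorphism + hits_gene_candidate
unique_records = total_hits - duplicates
included = unique_records - excluded

print(total_hits, unique_records, included)
```

Under that assumption the reported numbers are internally consistent and yield the 141 included studies.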

Summary of Eligible Studies
We present a narrative summary of the candidate genes for MDD investigated from July 2012 until March 2019. We have summarized the main characteristics of the identified studies in Table S1 (Table S1: Main characteristics and findings of studies included in the systematic review). Only significant p values (abbr. p) and odds ratios (abbr. OR) are presented in Table S1. The selected articles explored associations (see Table S1) of 595 polymorphisms in 175 genes. Eighty-seven articles reported nominally significant associations (p < 0.05). These 87 articles reported significant results for 172 polymorphisms in 85 candidate genes. In 25 of these genes, associations of multiple SNPs were found by one or more studies. Thirteen SNP associations were confirmed by at least two studies. The other SNP associations were confirmed by only one study; 20 of those delivered conflicting results. Two studies [15,16] delivered results of meta-analyses, in which two SNPs showed significance. Four studies found significant associations only after population stratification by gender: in the women subgroup [17,18] and in the men subgroup [19,20]; one study enrolled only female participants [21]; and one study presented significant results in a particular age group of participants aged 37 years or older [22]. The genotyping methods and main tests for statistical analyses used in the studies are summarized in Table S2 (Table S2: Summary of genotyping and statistical methods used in the eligible studies).
We observed that 31% of all studies reported using quantitative polymerase chain reaction (qPCR); other methods were used in 15% or less of the eligible studies. Eighty of the 141 studies (57%) applied a method to correct for testing multiple gene variants. The association remained significant after correction for multiple testing in 35 studies. Most of the articles (122 out of 141) reported examining HWE for the studied polymorphisms. According to the authors, the results showed that in most studied populations both case and control groups were in HWE; in some of these studies, SNPs not in accordance with HWE were excluded from further analysis [23–35]. Fifty-four of the eligible studies were performed in China, 13 in Poland, 11 in Korea, 5 in Japan, 4 each in the USA, Spain, Germany, Italy, Denmark, and Turkey, 3 in the Netherlands, 2 each in Australia, Malaysia, Taiwan, Iran, Slovakia, Hungary, and the United Kingdom, and 1 each in 17 other countries of Europe, Asia, or South America.
Ten (7%) studies were evaluated to be in the high risk of bias group; 35 (25%) had a higher risk of bias; 61 (43%) had a low risk of bias; and 35 (25%) studies could not be sufficiently evaluated and were stated as having unclear risk of bias (for detailed assessment information see Supplementary information, Section A).
It is worth mentioning that 97 out of 141 studies (69%) did not use or describe power analysis, which may have had an impact on the significance of their results, as some of the authors discuss in their articles.

Comparison of Results with the Previous Systematic Analysis Regarding Associations between Gene Candidates and MDD
The previous comprehensive systematic review of linkage studies regarding MDD was published in 2012 [12]. After comparing the findings of the previously analysed studies with the selected articles of our review, only 18 genetic polymorphism associations were confirmed in both reviews by at least one study. Most of the recent studies analysed other SNPs. Both reviews included studies with various sample sizes; however, it is important to note that the majority were lacking in power analysis. The authors of the previous review also conducted a replication study with a considerably large sample size. They found that 13 SNPs in 12 genes showed significant associations with MDD in the full sample. Two of those SNPs (rs1360780 and rs2522833) were further tested, and significant associations were confirmed in two studies from our review.

Results of Gene Functional Enrichment Analysis
To explore the biological knowledge of genes associated with MDD (n = 85) in terms of BP, MF, and CC and molecular pathways, we used GO and KEGG enrichment. The functional annotation of GO was successful, assigning 73 GO Biological process, 82 GO Molecular function, and 83 GO Cellular component terms (Figure 2). Additionally, KEGG pathway enrichment was performed, and we determined that 16 of these MDD-associated genes are primarily involved in neuroactive ligand-receptor interaction (KO04080), which may be an indication of highly important differentiated brain activity during MDD (for a detailed outline of the investigated genes, see Supplementary information, Section B).
Further analysis was conducted only with genes with assigned significant (p < 0.05) molecular function and biological process (n = 30 and n = 50).
Findings from the DAVID-GO term and KEGG pathway analysis were further refined. We overlapped the significant genes from both the MF and BP groups and identified 23 genes that could be related to MDD based on our sample of studies (Figure 3). The analysis of significant MF from GO showed that MDD-associated genes mainly function in enzyme binding (GO:0019899) (n = 11, e.g., AKT, CREB1, CAT, SORT1, TGFB1, YWHAE), act as receptors, e.g., the glutamate receptor (GRM) family or receptor subunits (GRIA, GRIN, GRIK) (GO:0008066, GO:0004970, GO:0001642), or are involved in transport (GO:0022857), such as the SLC6A family. The majority of the significant BP terms were associated with chemical synaptic transmission (n = 17, GO:0007268). Furthermore, the CC terms significantly related to these genes contained the keyword "membrane" (GO:0005887, GO:0042734, GO:0045211, GO:0016021) (more information in Supplementary information, Section B).
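The overlap step amounts to a set intersection between the genes with significant MF terms and those with significant BP terms. The sketch below illustrates the operation only; the gene memberships are invented placeholders and do not reproduce the actual DAVID output or the 23-gene result.

```python
# Placeholder memberships for illustration; the real lists come from the
# DAVID output (n = 30 genes with significant MF, n = 50 with significant BP).
mf_genes = {"AKT1", "CREB1", "GRIA2", "GRIN2A", "SLC6A4", "CAT"}
bp_genes = {"AKT1", "GRIA2", "GRIN2A", "SLC6A4", "NPY", "GDNF"}

# Genes carrying both a significant molecular function and biological process
overlap = sorted(mf_genes & bp_genes)

# Genes that appear in only one of the two groups
mf_only = sorted(mf_genes - bp_genes)
bp_only = sorted(bp_genes - mf_genes)

print(overlap)
print(mf_only, bp_only)
```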

Characteristics of Most Studied Genes
In the present study, we analysed 141 publications that performed candidate gene association studies to determine MDD-associated SNPs. We also performed an intensive literature analysis of the most prominent genes to further unravel possible candidate gene associations with MDD. We examined 172 significant polymorphisms in 85 candidate genes. We found the top 23 genes with successfully assigned molecular function and biological process that, in GO terms, could be involved in brain disorders such as major depressive disorder. The in-depth analysis revealed that the most crucial genes for MDD could be the GRIA, GRIN, and GRIK family genes as well as the better-known SLC6A family members. However, it is important to stress that associations with all these candidate genes were found mostly in single studies, so replication studies are crucial in order to determine the significance of these results.

Genes Involved in the Glutamatergic Pathway
Gene functional analysis of the MDD-associated genes revealed that the majority of genes from this sample of studies are involved in signal transmission, especially glutamate neurotransmission. Glutamate receptor genes such as GRIA2 (glutamate ionotropic receptor AMPA type subunit 2), GRIN2A (glutamate ionotropic receptor NMDA type subunit 2A), and GRIK1 and GRIK4 (glutamate ionotropic receptor kainate type subunit 1 and 4) encode ligand-activated ion channels, which allow ions to flow into neurons upon activation. The metabotropic glutamate receptor (mGluR) genes, namely GRM3, GRM4, and GRM7 (glutamate metabotropic receptor 3, 4, and 7), encode G-protein-coupled receptors, which enable cell activation by extracellular signalling molecules [36]. Several authors from our review investigated and found associations for candidate genes involved in the glutamatergic pathway; however, the lack of power analysis in some studies with potentially too small sample sizes does not allow us to view the results as more promising [15,37–39].

Genes Involved in Neurotransmission through Regulation of Calcium Channel Activity
Other genes researched in the studies of our systematic review encode proteins involved in neurotransmission through regulation of calcium channel activity, such as NPY (neuropeptide Y) and NPY2R (neuropeptide Y receptor Y2), while others, namely SLC6A2, SLC6A3, and SLC6A4 (solute carrier family 6 member 2, 3, and 4), are responsible for monoamine transmembrane transporter activity. According to recent literature, NPY was associated with resistance to treatment in MDD [40]. One study in our review investigated and found a positive association for a few polymorphisms of NPY [33]; it also had quite a large sample of participants; however, power analysis was not performed or described. The NPY receptor gene NPY2R had previously been reported to be involved in neurodegenerative disorders such as Huntington's disease [41]. Only one study in our review investigated this gene [42]. Despite finding a statistically significant positive association, it was likely lacking in statistical power due to a small sample size.
Among calcium channel regulators, SLC6A4 is one of the most widely studied genes. It is responsible for the transport of serotonin. The repeat polymorphism of this gene, with long (L) and short (S) alleles, could be associated with depression and response to treatment [43,44]. In a recent report, members 2 and 3 of this gene family were also identified as MDD candidate genes [45]. Studies included in our review further investigated associations between MDD and polymorphisms of SLC6A2, SLC6A3, and SLC6A4. However, the results of those studies were conflicting. For example, some studies with various samples showed a positive association for polymorphisms of SLC6A4 [22,46–53], while others did not confirm such findings [54–58]. The same tendency was seen for the polymorphisms of SLC6A2 [59–61], which may indicate a possibility of false-positive results. As for SLC6A3, one study found a significant association of one polymorphism in a moderate sample of participants [59].

Genes Involved in Apoptosis
Some genes with an assigned molecular function of enzyme binding or growth factor activity were also attributed to the biological process of apoptosis, specifically AKT1 (AKT serine/threonine kinase 1), SORT1 (sortilin 1), YWHAE (tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein epsilon), GDNF (glial cell derived neurotrophic factor), NRG1 (neuregulin 1), and VEGFA (vascular endothelial growth factor A). Apoptosis could be considered one of the main metabolic pathways related to MDD pathophysiology [62]. Apoptosis suppressors, including AKT1, GDNF, VEGFA, and NRG1, have been previously associated with MDD. AKT1 gene polymorphisms were associated with MDD severity [63]. Another apoptosis blocker is GDNF, which promotes the differentiation, maintenance, and viability of various cell populations and thus neuronal survival in the nervous system [64]. It is especially important to mention the interaction of GDNF with brain neurotransmitter systems, specifically the previously mentioned AMPA receptors; this cross-talk could be important in the pathogenesis of depression [65]. The role of VEGFA in the nervous system is very similar to that of GDNF. Under certain conditions, this growth factor is involved in neuronal survival by increasing the proliferation and decreasing the apoptosis of neural progenitors, and polymorphic variation in VEGFA might explain the variability in response to treatment [66]. Meanwhile, NRG1 belongs to a family of extracellular growth factors that are responsible for the maturation and migration of neurons and simultaneously block cell death through interactions with their tyrosine kinase receptors on the neural cell surface [67]. In a previous study, polymorphic variants of NRG1 were associated with predisposition to bipolar disorder [68]. On the other hand, the apoptosis inducer sortilin, encoded by the SORT1 gene located on chromosome 1p13.3, was also associated with other neurotrophic factors, such as VEGFA [69].
Finally, in other neural disorders, for instance suicidal behaviour and schizophrenia, a haplotype (rs1532976) of the apoptosis stimulator YWHAE, which encodes the 14-3-3ε protein, indicated more pronounced suicidal behaviour or risk of disease occurrence [70]. Polymorphisms of the discussed genes were also investigated in the reviewed studies. Studies of NRG1 delivered conflicting results, with the larger sample size in the study that found a positive association [27,71]. YWHAE was also investigated in a large study [19] with significant statistical power, which found a positive association. Other studies, which investigated VEGFA [72], GDNF [28], AKT1 [34], and SORT1 [69], were lacking in statistical power, so it is likely that replication studies with larger samples will be needed in order to find out whether these genes have any substantial impact on the development of MDD.
There are many reasons why pinpointing the SNPs involved in MDD is very difficult. MDD is presumably a polygenic disease impacted by many genetic variations, like other well-studied neural diseases such as schizophrenia [10]. Considering this, many of the genes associated with MDD may not have a causal effect on the disease, as previously mentioned by Border et al. [11], but may be involved in interconnecting gene effect pathways and in treatment resistance. Interindividual variation also plays an important role in the manifold genetic associations identified in MDD. Considering this, extensive studies of gene interaction pathways could provide more answers to the complexity of MDD.
It is interesting that only a few polymorphisms were further studied after the 2012 systematic review, and even fewer gained any evidence in their favour. This may show trends of investigators mostly choosing new fields. It is also possible that the investigation of gene candidates as a method of choice may not be that promising in finding out the genetic biomarkers of MDD, as a rather recent large replication study of candidate genes suggests [11].
The most recent research (2019–2021) focused on genes such as MTHFR [73] and BDNF [74]. The literature review by Fratelli et al. (2020) [75] showed an association between the 5HTTLPR genetic variants and several aspects of MDD. These findings show a trend in the selection of genetic biomarkers for further investigation that is consistent with the one seen in our review.

Important Characteristics of Included Studies
We consider the major design weaknesses of the analysed studies to be as follows: lack of matching or adjustment for gender and age between case and control groups, small sample sizes and lack of statistical power analysis, as well as lack of description of the statistical analysis used.
Because many studies were small, and most of the eligible studies did not report sample size estimation or power analysis, they were prone to type-II error. Because the reliability of the results is affected by the sample size, the lack of statistical power analysis in most studies leaves some uncertainty about the significance of the results.
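To illustrate how sample size drives statistical power in a case-control allele comparison, the sketch below uses a normal approximation for a two-sided test of two proportions. The allele frequencies and sample sizes are hypothetical, the function name is ours, and a dedicated power-analysis tool would be preferable for actual study planning.

```python
import math
from statistics import NormalDist


def power_two_proportions(p_cases, p_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    p_cases, p_controls: assumed risk-allele frequencies in each group
    n_cases, n_controls: number of alleles (2 per participant) in each group
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference under the alternative (unpooled)
    se = math.sqrt(p_cases * (1 - p_cases) / n_cases
                   + p_controls * (1 - p_controls) / n_controls)
    effect = abs(p_cases - p_controls) / se
    # Probability that the test statistic falls in either rejection region
    return normal.cdf(effect - z_crit) + normal.cdf(-effect - z_crit)


# Hypothetical allele frequencies of 30% in cases versus 25% in controls
small = power_two_proportions(0.30, 0.25, 100, 100)
large = power_two_proportions(0.30, 0.25, 2000, 2000)
print(f"power with 100 alleles per group:  {small:.2f}")
print(f"power with 2000 alleles per group: {large:.2f}")
```

Under these assumed frequencies, a small sample would detect a true difference only rarely, whereas a sample twenty times larger would detect it in most replications.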
We observed substantial heterogeneity among the included studies. Studies differed widely in their population characteristics, case definitions, selection of controls, and statistical and genotyping analyses. In most cases, samples were collected from only one hospital, which increases the selection bias. Some studies did not provide a sufficient description of the methods used in their statistical analysis. Considering this and striving for good transparency, we carefully considered the risk of bias; however, to lessen the effect of those limitations and increase the comprehensiveness of the review, we did not exclude high risk-of-bias studies. In assessing the risk of bias in these studies, we specified risks specific to this content area. As described previously, each study was assessed in five dimensions, and the fifth dimension addressed the question of whether statistical adjustment was carried out for two variables: gender and age. Some of the studies included in our review that had significant differences between case and control groups did not adjust for gender and age in their statistical analysis. Because gender and age are important in the onset of depression, the lack of correction may have influenced the significance of the results.
In addition, we should mention that in many studies results were not consistently corrected for multiple testing, so their findings were prone to type-I error.
We should consider that the results might have been influenced by the fact that deviation from Hardy-Weinberg equilibrium (HWE) was detected in some studies and was not even calculated in others. Deviation from HWE in MDD subjects could be a result of a sampling bias or genotyping error. Alternatively, this could indicate that those SNPs are causative variants for MDD. This latter possibility should be evaluated in further studies using additional independent populations.
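A deviation from HWE is usually assessed with a chi-square goodness-of-fit test that compares observed genotype counts with those expected from the allele frequencies. The following is a minimal sketch for a biallelic SNP using only the standard library; the genotype counts are hypothetical, and the included studies may have used exact tests or other software instead.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the three genotypes of a biallelic SNP.
    Returns the chi-square statistic and its p-value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p in (0.0, 1.0):
        raise ValueError("SNP is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts showing a deficit of heterozygotes
chi2, p_value = hwe_chi_square(40, 20, 40)
print(f"chi-square = {chi2:.1f}, p = {p_value:.2g}")
```

A marked heterozygote deficit such as the one in this invented example is the kind of pattern that would prompt a check for genotyping error or sampling bias before any causal interpretation.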
We found that more than two thirds of candidate genes were investigated in the Han Chinese population, and fewer were studied in individuals of Western European ancestry or in Mexican-American populations. In comparison, the previous systematic review, which analysed studies published between September 2007 and June 2012 [12], was dominated by Western European populations. This could influence the generalizability of the results. Research has shown that, due to natural selection, the genetic heterogeneity of susceptibility to complex diseases, such as affective disorders, intensifies. In addition, there are inconsistencies in detecting the genetic markers of these diseases among different ethnic populations.

Strengths and Limitations of This Review
The strengths of this study include its broad inclusion criteria and its use of explicit methods, which limit bias and assist in revealing new factors. However, we should acknowledge the limitations of our systematic review: the use of only one database, the lack of grey literature (opinions, case studies, and doctoral theses), and the exclusion of studies published in other languages.
Unfortunately, there is still insufficient data on the links between genes and depression. Despite the reported genetic associations, most studies were lacking in statistical power analysis, research samples were small, and most gene polymorphisms have been confirmed in only one study.
Further genetic research with larger research samples is needed to ascertain whether the relationship is random or causal.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/medicina58020285/s1, Table S1: Main characteristics and findings of studies included in the systematic review; Table S2