Replicated Risk Nicotinic Cholinergic Receptor Genes for Nicotine Dependence

It has been hypothesized that the nicotinic acetylcholine receptors (nAChRs) play important roles in nicotine dependence (ND) and influence the number of cigarettes smoked per day (CPD) in smokers. We compiled the associations between nicotinic cholinergic receptor genes (CHRNs) and ND/CPD that were replicated across different studies, reviewed the expression of these risk genes in human/mouse brains, and verified their expression using independent samples of both human and mouse brains. The potential functions of the replicated risk variants were examined using cis-eQTL analysis or predicted using a series of bioinformatics analyses. We found replicated and significant associations for ND/CPD at 19 SNPs in six genes in three genomic regions (CHRNB3-A6, CHRNA5-A3-B4 and CHRNA4). These six risk genes are expressed in at least 18 distinct areas of the human/mouse brain, with verification in our independent human and mouse brain samples. The risk variants might influence the transcription, expression and splicing of the risk genes, alter RNA secondary or protein structure. We conclude that the replicated associations between CHRNB3-A6, CHRNA5-A3-B4, CHRNA4 and ND/CPD are very robust. More research is needed to examine how these genetic variants contribute to the risk for ND/CPD.


Introduction
Nicotine dependence (ND) is commonly assessed for cigarette smokers with DSM-IV criteria or a severity scale such as the Fagerstrom Test for Nicotine Dependence (FTND) [1]. FTND assesses the frequency of smoking, the number of cigarettes smoked and the urgency to smoke, and is widely used to index the severity of ND. Of the six questions assessed in FTND, the number of cigarettes smoked per day (CPD) has been shown to carry the highest genetic loading [2]. It has been hypothesized that the nicotinic acetylcholine receptors (nAChRs) play important roles in the development of ND and shows a strong association to CPD. The nAChR is named because its endogenous agonist is acetylcholine and the plant alkaloid nicotine also binds to these receptors. Neuronal nAChR include α2-α10 and β2-β4 subunits that are encoded by CHRNAs 2-10 and CHRNBs 2-4, respectively, whereas muscle-type nAChRs include α1, β1, γ, δ and ε subunits that are encoded by CHRNA1, CHRNB1, CHRNG, CHRND and CHRNE, respectively (reviewed by Zuo et al. [3]).
In this article, we reviewed the relationship between CHRNs and ND or CPD that were replicated across studies. We show that most significant risk variants (84%) for ND/CPD at the CHRNs are typically located in non-coding regions, and 95% of them have no direct effects on protein structure (see below). These non-coding genetic variants may have effects on the function of genes by altering the transcription, splicing or stability of the coding mRNAs. The association signals detected from the non-coding regions might be related to the roles of non-coding RNAs (ncRNAs) existing within, or proximate to, these regions, and thus these ncRNAs were explored in this study.
ncRNAs include long non-coding RNAs (LncRNAs) and small non-coding RNAs such as miRNAs, piRNAs, siRNAs, snoRNAs and rasiRNAs. Recent evidence suggests that LncRNAs are involved in a wide variety of cellular functions, including epigenetic silencing, transcriptional regulation, RNA processing and modification [4][5][6]; LncRNAs are also implicated in neural plasticity [7], neuropathological process [8], neurotransmission [9], and stress response [7]. Dysregulation of many LncRNAs has been found to contribute to substance use disorders including alcohol, nicotine, heroin and cocaine dependence. For example, NEAT2, an LncRNA regulating synapse formation [10], was up-regulated in alcoholics' brains [11]; NEAT2, NEAT1, MIAT and MEG3 were up-regulated in the nucleus accumbens (NAc) of heroin abusers [12]; and NEAT2, MIAT, MEG3 and EMX2OS were elevated in the NAc of cocaine abusers [12]. Smokers had dramatically elevated H19 expression in airway epithelium [13]; demethylation of H19 was correlated to chronic alcohol use in men [14]; and many LncRNAs mediated cocaine-induced neural plasticity in the NAc and conferred risk for cocaine dependence [8]. Together, evidence accumulates to support the hypothesis that LncRNAs contribute to the severity of ND, including the number of cigarettes smoked per day (CPD).
In addition to LncRNAs, piRNAs are also increasingly being studied for their roles in cellular functions. Numerous research indicates that piRNAs have important roles in modulating mRNA stability, regulating target mRNAs and translation [15], preserving genomic integrity [16], suppressing transposons [17], remodelling euchromatin, developmental regulation and epigenetic programming [18,19]. Recent evidence suggests that piRNAs are abundant in the brain [17,[20][21][22][23][24][25][26][27]. These piRNAs have unique biogenesis patterns and are associated with a neuronal Piwi protein. Thus, it has been hypothesized that piRNAs may potentially play roles in ND/CPD too. The LncRNAs and piRNAs that might regulate the effects of the replicated risk CHRNs on disease were analyzed in this study. This analysis is a necessary step towards identification of the missing regulatory pathways after a long history of attention to the coding mRNAs and other ncRNAs such as miRNAs.
In this article, we also reviewed the distribution of the nAChRs encoded by the replicated risk CHRNs in the human/mouse brain and then verified their expression in an independent sample of mouse brain. Furthermore, we explored the possible mechanisms underlying these replicated associations using a series of bioinformatics analyses.

Materials and Methods
2.1. The Replicated Associations between Nicotinic Cholinergic Receptor Genes (CHRNs) and Nicotine Dependence/Cigarettes per Day (ND/CPD) and the Expression of Risk Genes in Brain In PubMed (http://www.ncbi.nlm.nih.gov/pubmed), we searched for the literature using the keywords "(nicotinic acetylcholine receptor OR nAChR OR nicotinic cholinergic receptor OR CHRN) AND (nicotine dependence OR nicotine addiction OR smoking OR cigarette)" and obtained 2463 reports (as of 19 September 2016). From these articles, we extracted the established associations between CHRNs and ND/CPD. We noticed that although most of the distinct CHRNs have been associated with ND/CPD, the replicable associations at single-point level by different studies are rare. We list such rare associations for six genes in three genomic regions from a total of 20 studies in Table 1.
Additionally, the distribution of the nAChRs encoded by the replicated risk CHRNs reported in the literature is illustrated in Figure 1 (http://anatomy-bodychart.us/) .

The Replicated Associations between Nicotinic Cholinergic Receptor Genes (CHRNs) and Nicotine Dependence/Cigarettes per Day (ND/CPD) and the Expression of Risk Genes in Brain
In PubMed (http://www.ncbi.nlm.nih.gov/pubmed), we searched for the literature using the keywords "(nicotinic acetylcholine receptor OR nAChR OR nicotinic cholinergic receptor OR CHRN) AND (nicotine dependence OR nicotine addiction OR smoking OR cigarette)" and obtained 2463 reports (as of 19 September 2016). From these articles, we extracted the established associations between CHRNs and ND/CPD. We noticed that although most of the distinct CHRNs have been associated with ND/CPD, the replicable associations at single-point level by different studies are rare. We list such rare associations for six genes in three genomic regions from a total of 20 studies in Table  1.
Additionally, the distribution of the nAChRs encoded by the replicated risk CHRNs reported in the literature is illustrated in Figure 1 (http://anatomy-bodychart.us/) .

Expression Correlation Analysis in Human Brain
Based on our review (Figure 1), all six replicated risk CHRNs are expressed in the midbrain that is enriched with dopaminergic neurons, and four CHRNs (i.e., CHRNA4, CHRNA5, CHRNA6 and CHRNB3) are expressed in the striatum that is enriched with GABAergic terminals. These are two main neurotransmission systems that have been related to CHRNs in the literature (see Section 4: Discussion). We evaluated the mRNA expression levels of these genes and the dopaminergic and GABAergic receptors/enzymes in two independent brain tissue samples using Affymetrix Human ST 1.0 exon arrays (validated by qPCR). The first sample included ten human brain tissues extracted from 134 Europeans (UK Brain Expression Consortium (UKBEC) [74]). These 134 individuals were free of neurodegenerative disorders, and the ten brain tissues included cerebellar cortex, frontal cortex, temporal cortex, occipital cortex, putamen, thalamus, hippocampus, substantia nigra, intralobular white matter and medulla. The second sample included 93 autopsy-collected human frontal cortical tissues [75]. These 93 individuals included 55 male and 38 female Europeans, from 34 to 104 years old with an average of 74 ± 16 years. The postmortem intervals, i.e., the time from death to brain tissue collection, were 1.2-46 h with an average of 14.3 ± 9.5 h. These 93 individuals had no defined neuropsychiatric condition either. Correlations between expression of the risk CHRNs and expression of 25 dopaminergic and GABAergic receptor/enzyme genes were tested using Pearson correlation analysis for the first sample and generalized linear model (GLM) analysis for the second sample ( Table 2). The 25 dopaminergic and GABAergic genes were DRD1-5, TH, GABRA1-6, GABRB1-3, GABRD, GABRE, GABRG1-3, GABRR1-3, GABRP and GABRQ. In the GLM, the expression levels of CHRNs served as dependent variable, and those of dopaminergic and GABAergic receptor genes as independent variable, by correcting for age, sex and postmortem interval. The directions of the correlations will be shown by the signs of correlation coefficients (r) or regression coefficients (β) (Supplementary Tables S1 and S2). α was set at 3.5 × 10 −5 for the first sample because 10 brain regions, 25 dopaminergic and GABAergic genes and six CHRNs were evaluated, 6.9 × 10 −7 for the second sample because 12,114 transcripts in the array and six CHRNs were evaluated.

Detection of Chrn mRNA Expression in Mouse Brains
To verify the expression of the six replicated risk genes (Figure 1), we examined their mRNA expression in mouse brains in our own samples. The levels of mRNA expression for the whole brain and in eight brain areas were examined, including the cortex, dorsal striatum, NAc, hippocampus, amygdala, midbrain, ventral tegmental area (VTA) and cerebellum ( Table 3). The details for mouse strains, gene expression analysis, and calculation for standardized expression values (SEVs) and fold changes (FCs) were published previously [3].

Bioinformatics Analysis
The linkage disequilibrium (LD) between the replicated risk SNPs was assessed using online HapMap data. To verify the potential functions of these replicated risk SNPs, we predicted their functions using a series of bioinformatics analyses. We used UCSC Genome Browser data or other bioinformatics analysis software packages (e.g., FuncPred [76] or VE!P [77]) to see whether the risk SNPs are located in LncRNAs, in transcription factor binding sites (TFBS), in open chromatin regions, within methylated CpG islands, within copy number variations (CNVs) or in exonic splicing silencers (ESS) or enhancers (ESE). Additionally, Polyphen [78] and SIFT [79] were applied to predict the pathogenicity in order to see whether these risk SNPs affect protein function or structure, and MFOLD [80] was applied to predict whether these risk SNPs alter secondary RNA structure. The conservation of these risk SNPs across 17 species was also predicted [81]. The tertiary structure of the mutant and wild-type protein obtained by translation of each mutant gene was simulated using SWISS-MODEL software [82] so as to find the difference between them.

Long Non-Coding RNAs (LncRNA) and piRNA Analysis
There are tens of thousands of LncRNAs (>200 nt) across the transcriptome [5,83,84], and more than half of them are expressed in the brain [85]. According to the positional relationship between LncRNAs and their associated protein-coding genes, LncRNAs can be classified as intergenic, intronic, antisense, sense overlapping, and bidirectional lncRNAs [86]. In this study, we extracted the LncRNAs close to, or within, the risk CHRN genes from the National Center for Biotechnology Information (NCBI) Gene database (http://www.ncbi.nlm.nih.gov/gene).
The RNAs interacting with the Piwi subfamily of proteins in Piwi/piRNA complex are named piRNAs. piRNAs are a class of small ncRNAs originally isolated from the mammalian germline, but recently they have also been detected in the brain [21,22,24]. Each species usually has hundreds of thousands of unique piRNA sequences. Mature piRNAs are short, single-stranded RNA molecules approximately 24-32 nucleotides in length. They are unevenly distributed in the genome, and usually cluster in some specific genomic loci. In this article, we searched for piRNAs within the risk CHRN genes from the piRNABank database [87]. (Table 1) Replicated associations for ND/CPD were found at 19 SNPs in three genomic regions (CHRNB3-A6, CHRNA5-A3-B4 and CHRNA4) in Europeans, Africans and Asians. They were replicated across at least three independent samples in at least two independent studies including genome-wide association studies (GWASs) [54,62,66,68,69,71,88] and candidate gene studies [55][56][57][58][59][60][61]63,64,67], and some of them were verified by functional studies.

Distributions of Nicotinic Acetylcholine Receptors (nAChRs) Encoded by the Replicated Risk CHRNs in
Brain (Figure 1) The three replicated genomic regions including six genes are expressed in at least 18 brain areas. They are most commonly expressed in medial habenula, midbrain (including the VTA, substantia nigra, interpeduncular nucleus (IPN), lateral and medial geniculate bodies, and superior colliculus) and the mesolimbic system (VTA→NAc). They are also expressed in cortex, entorhinal cortex, striatum, thalamus, hippocampus, amygdala, locus coeruleus, brainstem nuclei and cerebellum. Specifically, CHRNA5, CHRNA3 and CHRNB4 are highly expressed in medial habenula. All six genes are expressed in the midbrain, although different genes have distinct densities in different midbrain areas. CHRNA4 is expressed in the thalamus at the highest level. CHRNA3 and CHRNA5 are also expressed in the thalamus, with α5 in low density. CHRNA5 and CHRNA3 are expressed in or around the hippocampus. Both have expression in amygdala and entorhinal cortex. CHRNA3 also has a low level of expression in hippocampus. CHRNA5 and CHRNA4 are expressed in cortex. CHRNA3 is also expressed in cingulate cortex and insular cortex at low density. CHRNA5, CHRNA4, CHRNA6 and CHRNB3 are expressed in striatum. CHRNA6 is expressed in locus coeruleus, a noradrenergic nucleus with wide projections to cortical and subcortical structures [107]. CHRNA3 is also expressed in brainstem nuclei. Finally, CHRNA5 and CHRNA3 are expressed in cerebellum. (Table 2, Tables S1 and S2) These six CHRN genes were detected in ten human brain areas. In many areas, their expression was significantly correlated with the dopaminergic or GABAergic expression (p < α) ( Table 2). The correlation coefficients (0.358 ≤ |r| ≤ 0.920), regression coefficients (0.008 ≤ |β| ≤ 0.749) and p values (3.9 × 10 −42 ≤ p ≤ 3.3 × 10 −5 ) for these correlations are shown in the Supplementary Tables S1 and S2.

All Six Chrn Genes Were Expressed in Mouse Brain in Distinct Areas and at Different Levels, a Majority of Which Verified Previous Reports (Table 3)
We found that all six Chrn genes were expressed in mouse brain at different levels. All of these genes were expressed in the hippocampus, in which the gene with the most highly abundant expression (SEV > 9) was Chrna4 (SEV = 9.95). α4 mRNA was also abundant in other brain areas examined (SEV = 8.29-10.73), with 2.5-13.3-FCs in mRNA level compared to the expression base. Compared with other genes, α4 mRNA was also the most abundant in the whole brain and five other brain areas including cortex, striatum, NAc, amygdala and cerebellum (Table 3).

Bioinformatics Analysis
The 15 SNPs at CHRNB3-CHRNA6 are all in high LD (D' > 0.95) (https://hapmap.ncbi.nlm.nih.gov/). Among the 19 risk SNPs, 10 SNPs are located in LncRNAs that might regulate the gene expression. There are eight SNPs located in the TFBS. Most of them are located in the 5 to CHRNB3. They may affect the local DNA conformation, and thereby influence the binding of transcription factors [108]. Two SNPs, i.e., rs4954 and rs2236196, are located in the open chromatin regions, which are often associated with regulatory factor binding. One SNP, rs1051730 at the exon 7 of CHRNA3, is located within a 234 bp CpG island, whose methylation status may affect the expression of CHRNA3 [109]. rs16969968 (Asp398Asn) at the exon 5 of CHRNA5 is located in an exonic splicing silencer or enhancer. Furthermore, seven SNPs are predicted to significantly or highly significantly alter the RNA secondary structures, including rs10958725, rs13273442, rs4736835, rs13277524, rs6474412 and rs4952 at CHRNB3, and rs16969968 at CHRNA5. Two SNPs are predicted to mildly alter the RNA secondary structures, including rs1955186 and rs7004381 at CHRNB3. They may affect the downstream activities of the RNA molecules [110]. rs16969968 is also predicted to be conservative across species. Finally, amino acid sequence alignment and three-dimensional computer space model verify that rs16969968 highly significantly alters protein structure and function (Supplementary Figure S1). 2nd RNA alteration, the alteration of secondary RNA structure predicted using MFOLD; LncRNA, these SNPs are located in LncRNAs; TFBS, these SNPs are located in the transcription factor binding sites; chromatin, this SNP is located in an open chromatin region; splicing, this SNP is located in an exonic splicing silencer or enhancer; tolerated/benign, these SNPs are predicted by SIFT/Polyphen not to significantly affect protein function or structure; conservative, this SNP is predicted to be conservative; CpG, this SNP is located within a 234 bp methylated CpG island.

The LncRNAs and piRNAs Related to the Replicated Risk CHRNs
The LncRNAs proximate to each gene are listed in Table 6. One sense LncRNA 37 kb to CHRNB4 is a large intergenic non-coding RNA (LincRNA), with a length of 35 kb; two others overlapping with CHRNB3 and CHRNA4 are antisense LncRNAs, with lengths of 11 kb to 22 kb. The annotated piRNAs mapping within the two replicated CHRN gene regions are listed in Table 7. These piRNAs show a size distribution between 26 and 31 nt. Intergenic, located between two protein-coding genes and at least 1 kb away from these genes; Sense, LncRNAs are transcribed from the same genomic strand as the protein-coding mRNAs; Antisense, LncRNAs are transcribed from the antisense strand.

Discussion
Replicated associations for ND/CPD were found at 19 SNPs in three genomic regions (CHRNB3-A6, CHRNA5-A3-B4 and CHRNA4). Many of these associations are highly replicable across studies, highly significant, verified by functional studies, and supported by bioinformatics analysis, and thus are very robust. Interestingly, these three replicated loci were just the top three peak risk loci for ND identified by a GWAS meta-analysis using a large sample size of 17,074 [69]. We believe that CHRNB3-A6, CHRNA5-A3-B4 and CHRNA4 play important roles in the susceptibility to ND/CPD. Mechanisms underlying these roles may be related to the brain areas where the risk genes are expressed, the specific functions of the risk variants, or the regulatory pathways for the expression of these risk genes.
All replicated risk genes were expressed in human/mouse brain regions, which was verified at the mRNA level in our independent samples of both human and mouse brains. Many of these brain areas are important for the development of drug dependence [111]. Functional data have shown changes in nicotine intake following manipulations of α5*, α3* and β4* nAChRs in the medial habenula, supporting that medial habenula could contribute to the reinforcing effect of nicotine [28]. Many areas in midbrain are enriched in dopaminergic neurons, including VTA (where all six replicated risk genes were expressed) and substantia nigra (β3* and β4*). We demonstrated that the mRNA expression of six CHRNs was correlated with the expression of dopaminergic receptor/enzyme genes in ten brain areas. Thus, the CHRN receptors in these areas may modulate dopamine release, and contribute to the reinforcing effect of nicotine. Several pathways in the midbrain, e.g., habenula-IPN pathway (α5*, α3* and β4*) and VTA-NAc pathway (i.e., mesolimbic system; α5*, α3* and α4*), are also critical to drug-induced reward responses. The thalamus plays a major role in relaying and transforming information to the cortex and in turn modulates cortical outputs. Imaging studies in humans implicated the thalamus in cognitive control [112], a process frequently compromised in individuals with addiction [113]. Nicotine binding to α4* nAChRs in the human thalamus is very high in most thalamic nuclei, especially in the lateral dorsal, the medial geniculate, lateral geniculate and anterior nuclei. Striatum (α5*, α4*, α6* and β3*) receives dopaminergic input to the GABAergic medium spiny neurons. We demonstrated that the mRNA expression of six CHRNs was correlated with the expression of GABA receptor genes, supporting that nicotine stimulation of dopamine and GABA terminals in striatum may facilitate the release of these neurotransmitters. The locus coeruleus (α6*) is the principal site for brain synthesis of norepinephrine (noradrenaline). This nucleus may be involved in physiological responses to stress and panic, and some symptoms of ND. Finally, the hippocampus, amygdala, cortex including entorhinal cortex, and cerebellum are involved in reward, learning, motor co-ordination, memory and/or emotion. Nicotine may direct information flow through the neural circuits via the activation of α5*, α4* and α3* in these areas.
Our cis-eQTL and bioinformatics analyses provided additional evidence to support the previous findings that these replicated risk SNPs were functional [114][115][116]. They might influence the transcription, expression and splicing of the risk genes; they might alter the RNA secondary structure and thus affect the downstream activities of the RNA molecules; or they might even alter the structure and function of the proteins encoded by these risk genes. This analysis supports the roles of CHRNs in ND/CPD.
The sense LincRNA usually collaborates with chromatin modifying proteins (PRC2, CoREST and SCMX) to regulate expression of proximate genes [117]. Accordingly, we can postulate that the LincRNA XR_932509.1 might potentially regulate the expression of the CHRNB4 and might be functional components of the pathways through which the CHRNB4 variants influence risk for ND/CPD. The other two antisense-overlapping LncRNAs, i.e., XR_949716.1 and NR_110634.1, might use diverse transcriptional and post-transcriptional mechanisms [118,119] to regulate CHRNB3 and CHRNA4 to play roles in ND/CPD.
The piRNAs in brain usually show unique biogenesis patterns and predominantly nuclear localization [20]. The influence of piRNAs on disease might depend on the neurotransmitters/genes they interact with or the brain areas they are expressed in. For example, the piRNAs may have robust sensitivity to serotonin, a neurotransmitter with important roles in learning and memory and widely implicated in the etiology of many mental disorders [18,20]. The Piwi/piRNA complex may facilitate serotonin-dependent methylation of a conserved CpG island in the promoter of CREB2, the major inhibitory constraint of memory, leading to enhanced long-term learning-related synaptic facilitation [20]. Some piRNAs expressed in hippocampal neurons may influence dendritic spine morphogenesis [21]. For instance, piRNAs may target Astrotactin, which has been implicated in neuronal migration [120] or regulate genes to control nervous system function [21]. One is tempted to speculate that these piRNAs might potentially regulate the expression of the risk genes and serve as functional components of the pathways through which the risk SNPs influence risk for ND/CPD. These hypotheses regarding LncRNAs and piRNAs should be tested in the future.
Supplementary Materials: The following are available online at www.mdpi.com/2073-4425/7/11/95/s1, Table S1, Significant expression correlation between CHRNs and dopaminergic and GABAergic receptor genes in ten human brain areas; Table S2, Significant expression correlation between CHRNs and dopaminergic and GABAergic receptor genes in human frontal cortex; Figure S1, The tertiary structures of α5 nAChR altered by rs16969968 (D: Asp; N: Asn).

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: