Epigenetic Signatures of Smoking in Five Brain Regions

(1) Background: Epigenome-wide association studies (EWAS) in peripheral blood have repeatedly found associations between tobacco smoking and aberrant DNA methylation (DNAm), but little is known about DNAm signatures of smoking in the human brain, which may contribute to the pathophysiology of addictive behavior observed in chronic smokers. (2) Methods: We investigated the similarity of DNAm signatures in matched blood and postmortem brain samples (n = 10). In addition, we performed EWASs in five brain regions belonging to the neurocircuitry of addiction: anterior cingulate cortex (ACC), Brodmann Area 9, caudate nucleus, putamen, and ventral striatum (n = 38–72). (3) Results: cg15925993 within the LOC339975 gene was epigenome-wide significant in the ACC. Of 16 identified differentially methylated regions, two (PRSS50 and LINC00612/A2M-AS1) overlapped between multiple brain regions. Functional enrichment was detected for biological processes related to neuronal development, inflammatory signaling and immune cell migration. Additionally, our results indicate the association of the well-known AHRR CpG site cg05575921 with smoking in the brain. (4) Conclusion: The present study provides further evidence of the strong relationship between aberrant DNAm and smoking.


Introduction
Tobacco smoking has a strong impact on human health and has been identified as a risk factor for a variety of diseases [1,2]. While the mechanisms are still unclear, it has been assumed that epigenetic factors, especially DNA methylation (DNAm) changes, play a role in the pathophysiology of smoking-associated diseases [3,4]. To date, most epigenome-wide association studies (EWASs) in the context of smoking have been conducted in peripheral blood, largely because sampling is convenient and less invasive than for other tissues. The largest EWAS meta-analysis of smoking in blood (n = 15,907) found 18,760 cytosine-phosphate-guanine (CpG) sites associated with smoking at epigenome-wide significance [5]. Annotation of these CpG sites to genes pointed towards differential methylation for almost one-third of all genes in the human genome, indicating an extensive effect of smoking on DNAm levels.
J. Pers. Med. 2022, 12, 566
Findings from blood have been partially replicated in other tissues such as the lungs [6,7] and adipose tissue [8]. As the brain is involved in the development and maintenance of tobacco use disorder (TUD), DNAm changes in the brain are of interest, but knowledge of smoking-associated DNAm signatures remains sparse. One EWAS has examined DNAm signatures of smoking in the nucleus accumbens (NAc, n = 221) using postmortem tissue, finding seven CpG sites to be associated with smoking at epigenome-wide significance [9]. None of these were found to be significant in an EWAS of smoking in blood, which suggests heterogeneity of associations between tissues [9]. Deciphering these tissue-specific and tissue-shared methylation patterns could provide insights into a potential proxy function of blood for predicting differential methylation in the brain.
In substance use disorders (SUDs), a whole neurocircuitry of addiction consisting of multiple cortical, striatal, and limbic brain regions exhibits functional changes [10]. Cortical regions such as the prefrontal and the anterior cingulate cortex (ACC) are involved in executive control and are especially important in the preoccupation/anticipation stage of the addiction cycle [10]. The striatum is subdivided into a ventral (VS) and a dorsal part with the latter comprising the caudate nucleus (CN) and the putamen (PUT). The VS is thought to be related to reward processing [11], whereas CN and PUT are involved in sensorimotor processing and the habitual behavior observed in later stages of addiction [12].
To investigate differential methylation in the context of smoking, methylation changes within brain regions which are part of the neurocircuitry of addiction need to be assessed. In the present study, we investigated the tissue-specificity of smoking-associated methylation signatures by performing EWASs of smoking and evaluating the similarity of smoking-associated methylation patterns between blood and multiple brain regions (ACC, Brodmann Area 9 (BA9), CN, PUT and the VS). Based on the results, we performed downstream analyses including the assessment of differentially methylated regions (DMRs), gene ontology (GO) enrichment analysis, and GWAS enrichment analysis.

Samples
Postmortem human brain tissue was obtained from the New South Wales Tissue Resource Center (NSWTRC, University of Sydney, Sydney, Australia), as part of a previous study on alcohol use disorder (AUD) [13]. Information on smoking was available for a total of 304 postmortem brain samples originating from 80 European American tissue donors. For each donor, tissue from at least one of the five brain regions (ACC, BA9, CN, PUT, and VS) was available. BA9 brain samples from an additional 12 subjects were obtained from the University of Texas Health Science Center at Houston (UTHealth, Prof. Consuelo Walss-Bass). Here, sample donors were of Asian (n = 1), Black (n = 1), Hispanic (n = 2), and White (n = 8) ethnicities. For 10 of the 12 subjects, matched blood samples were available. Total sample sizes of different brain regions ranged from 38 to 72 (Supplementary Table S1). Inclusion criteria were age > 18 and no history of severe psychiatric, neurodevelopmental, or substance use disorder (except AUD and TUD). These criteria as well as smoking status were assessed in next-of-kin interviews. Subjects included in the present study had a smoking status of either current smoking or never smoking prior to death. A descriptive summary of phenotypic information on tissue donors is shown in Table 1.

DNA Extraction, DNAm Analysis, and Quality Control
Extraction of DNA, sample randomization, epigenome-wide DNAm analysis, and quality control of methylation data were performed as described in Zillich et al. [13]. In brief, epigenome-wide DNAm was determined using the Infinium Human Methylation EPIC BeadChip (Illumina, San Diego, CA, USA) and quality control was performed with a customized and updated version of the CPACOR pipeline [14]. Samples from the NSWTRC were processed separately for each brain region in either one or two batches (ACC and PUT: one batch; BA9, CN, and VS: two batches). The UTHealth samples were processed in a single batch after randomization according to sex, AUD status, and tissue. Methylation data were quantile-normalized.

Statistical Analyses
All statistical analyses were performed using R version 3.6.1 [15]. A schematic workflow depicting the different analysis levels within the present study is shown in Figure 1.

Between-Tissue Correlation
M-values of methylation were generated by logit-transformation of β-values as described by Du et al. [16]. For each of the 632,086 CpG sites remaining after QC, average M-values were calculated separately in blood and brain samples. The correlation of mean M-values between blood and brain (n = 10) was determined using the Pearson correlation method.
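The transformation and correlation step can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R); the function names, the clipping constant `eps`, and the toy β-values are assumptions made purely for illustration. The M-value is computed as the base-2 logit of the β-value, M = log2(β / (1 − β)).

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Logit-transform a beta-value (0..1) into an M-value on the log2 scale."""
    # Clip to avoid division by zero or log(0); eps is an illustrative assumption.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def mean(values):
    return sum(values) / len(values)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def tissue_correlation(blood_betas, brain_betas):
    """Correlate per-CpG mean M-values between two tissues.

    Each argument is a list with one entry per CpG site; each entry is the
    list of beta-values observed across the samples of that tissue.
    """
    blood_means = [mean([beta_to_m(b) for b in site]) for site in blood_betas]
    brain_means = [mean([beta_to_m(b) for b in site]) for site in brain_betas]
    return pearson(blood_means, brain_means)

# Toy data: three CpG sites, two samples per tissue (hypothetical values).
blood = [[0.10, 0.12], [0.50, 0.55], [0.90, 0.88]]
brain = [[0.15, 0.14], [0.45, 0.50], [0.85, 0.80]]
r = tissue_correlation(blood, brain)
```

Averaging within each tissue before correlating, as in the text, measures how similar the overall methylation landscape is across sites; it does not capture whether inter-individual variation at a given site is shared between tissues.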

EWAS in Five Brain Regions
Separate EWASs were performed to identify smoking-associated CpG sites in each of the five brain regions (ACC, BA9, CN, PUT, VS). The EWASs were restricted to samples originating from donors with European American ancestry to reduce confounding by ancestry. A linear regression model with M-values specified as the dependent variable and smoking status as a predictor was used. As covariates, sex, age, postmortem interval (PMI), AUD status, batch, and the first 10 principal components of control probes (PCcp) were included. Neuronal cell fractions were estimated using the Houseman approach [17] with the dorsolateral prefrontal cortex reference dataset [18], and included as a covariate. For BA9, tissue from both resources (NSWTRC and UTHealth) was combined in the EWAS, whereas for all other brain regions, tissue samples from NSWTRC were used exclusively. In the EWAS of smoking in BA9, the first four genotype principal components (PCgeno) were used as additional covariates to correct for genotype differences between sample donors. Genotype data of NSWTRC samples were generated using the Illumina Infinium OMNI5 array (Illumina, San Diego, CA, USA). The Illumina Infinium Global Screening Array (Illumina, San Diego, CA, USA) was used for genotyping of UTHealth samples. Genotype QC was performed as described by Turner et al. [19]. Sample sizes used in the EWAS were n = 38 (ACC), n = 65 (VS), n = 68 (CN and PUT), and n = 72 (BA9). Prior to EWAS, a variance inflation factor (VIF) analysis was performed to investigate multicollinearity in the linear model. If a VIF larger than 10 was detected, one of the correlated covariates was removed such that the final models for the brain regions remained as comparable to each other as possible (Supplementary Information 1). p-values obtained in the association analyses were FDR-adjusted for multiple testing using the Benjamini-Hochberg procedure [20].
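As a brief illustration of the multiple-testing step, the Benjamini-Hochberg adjustment can be sketched in a few lines of Python. This is a generic sketch of the procedure, not the study's code; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted values
    # are monotone: each is the minimum of p * m / rank over all ranks >= its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for five CpG sites.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = benjamini_hochberg(pvals)
```

With roughly 630,000 CpG sites tested per brain region, a site needs a very small raw p-value for its adjusted value to fall below a conventional 0.05 threshold.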

Differentially Methylated Regions
Differentially methylated regions were identified using the comb-p software (v. 0.50.2) [21]. The following DMR definition was used: seed p-values < 0.01 for a minimum number of two CpG sites within a 500 bp genomic window. Correction for multiple testing was performed using the Šidák method as implemented in comb-p.
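The DMR definition above can be made concrete with a deliberately simplified Python sketch. It is not comb-p's implementation: comb-p additionally accounts for autocorrelation between neighbouring p-values, which is omitted here. The function names, the chaining rule (consecutive seed CpGs no more than 500 bp apart), and the toy positions are assumptions for illustration only.

```python
def seed_regions(positions, pvalues, seed=0.01, window=500, min_sites=2):
    """Group seed CpGs (p < seed) whose neighbours lie within `window` bp.

    Returns a list of (start, end, n_sites) tuples for candidate regions
    containing at least `min_sites` seed CpGs.
    """
    seeds = sorted(pos for pos, p in zip(positions, pvalues) if p < seed)
    regions, current = [], []
    for pos in seeds:
        # Close the current candidate region when the gap exceeds the window.
        if current and pos - current[-1] > window:
            if len(current) >= min_sites:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        regions.append((current[0], current[-1], len(current)))
    return regions

def sidak(p, n_tests):
    """Sidak-corrected p-value for n_tests independent tests."""
    return 1.0 - (1.0 - p) ** n_tests

# Toy example: genomic positions (bp) on one chromosome with EWAS p-values.
positions = [100, 350, 2000, 5000, 5200, 5400]
pvalues = [0.001, 0.005, 0.004, 0.009, 0.5, 0.002]
regions = seed_regions(positions, pvalues)
corrected = sidak(0.01, 5)
```

In this toy example, the isolated seed at position 2000 does not form a region because no second seed CpG lies within 500 bp of it.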

Results
While sample sizes ranged between 38 and 72 per brain region, tissue of 92 donors was included in the present study. Sample characteristics are displayed in Table 1 and a detailed breakdown of batches and brain regions is available in Supplementary Table S1, along with the causes of death in Supplementary Table S2.

Correlation of Methylation Levels
Hierarchical clustering based on M-values of all CpG sites (n = 632,086) revealed a clear distinction based on tissue of origin (Figure 2A). At the same time, the between-tissue correlation of averaged methylation levels in blood and brain was high, with a Pearson correlation of r = 0.91, p < 2.2 × 10⁻¹⁶ (Figure 2B).

Differentially Methylated Regions
We identified a total of 16 DMRs in the five brain regions that were associated with tobacco smoking. Two DMRs were observed in multiple brain regions. One was a region consisting of five probes in PRSS50 in ACC and VS, and another contained the LINC00612/A2M-AS1 genes in both dorsal striatal regions (CN and PUT). DMRs for each brain region are summarized in Table 2 and are highlighted using green gene names in the Manhattan plots.

Gene-Ontology Analysis
In the ACC, smoking-associated CpG sites were enriched for GO terms related to neurodevelopment, cell growth, and morphogenesis. Enriched GO terms in BA9 were related to dendritic spine development and chromatin modification. In the VS, neuron-specific pathways were enriched, as well as processes associated with the regulation of vessel development. In both dorsal striatal regions, genes harboring differentially methylated CpG sites were enriched for GO terms related to immune pathways. After correction for multiple testing, no GO terms remained statistically significant. Result tables for the top 10 associated GO terms are shown in Supplementary Table S5a-e.

GWAS Enrichment Analysis
Gene sets consisting of smoking-associated CpG sites were overrepresented in GWAS summary statistics of smoking traits, such as smoking initiation, age of initiation, and cigarettes per day. Furthermore, significant enrichment for other SUDs, such as cannabis-, alcohol-, and opioid use disorder was observed. Statistical significance of enrichment for all tested gene-sets is displayed in Figure 4.

Consistency of Smoking-Associated CpG Sites in EWAS Results of Blood and Brain
We examined whether CpG sites previously suggested to predict smoking status in peripheral blood are differentially methylated in the brain. When we investigated the nine available CpG sites from the prediction model by Maas et al. [28], we found no consistent differential methylation across the five brain regions. For two of them, nominally significant associations were detected. The AHRR CpG site cg05575921, which has even been used as a single-marker smoking status predictor, was nominally associated with smoking in the ACC (p = 0.038) and PUT (p = 0.033). cg21566642, an intergenic CpG site, was nominally associated with smoking in BA9 (p = 0.018). Full results of the consistency analysis are listed in Table 3.
Table 3. Associations of the predictor CpG sites in blood and brain. For the nine CpG sites derived from Maas et al. [28], effect sizes and p-values of the association analyses with smoking in blood (* EWAS from Joehanes et al. [5]) and the five brain regions are summarized. Nominally significant results are highlighted in bold type. B, EWAS effect size; pval, EWAS p-value; ACC, anterior cingulate cortex; BA9, Brodmann Area 9; CN, caudate nucleus; PUT, putamen; VS, ventral striatum.

Discussion
In the present study, we investigated the consistency of smoking-associated methylation signatures in blood and brain, examined differential methylation in the brain associated with smoking status, and performed several EWAS downstream analyses in five brain regions related to the neurocircuitry of addiction. The strong overall correlation of methylation levels in matched blood and brain samples is in line with previous findings of high cross-tissue correlation coefficients [29,30].
For the EWAS of smoking in the ACC, one epigenome-wide significant CpG site was observed within LOC339975. We identified a total of 16 differentially methylated regions associated with smoking status. A DMR in PRSS50 was shared between the ACC and the VS. A functional role of PRSS50 in the brain has not been systematically evaluated so far. However, in cancer cells, knockdown of PRSS50 resulted in impaired cell proliferation and increased levels of apoptosis [31]. In addition, promoter hypermethylation of PRSS50 was detected in an EWAS of age-related macular degeneration (AMD) in blood and retinal tissue [32]. In the present study, hypermethylation was also observed for the PRSS50 DMR in the ACC and the VS. The second smoking-associated DMR shared between two regions of the brain (CN and PUT) was annotated to LINC00612/A2M-AS1. Both lncRNAs, LINC00612 and A2M-AS1, are involved in inflammatory processes. An anti-inflammatory function disrupted by smoking has been discovered for LINC00612 in the lungs [33], while A2M-AS1 has been linked to interleukin 1 receptor signaling in cardiomyocytes [34]. Further studies need to investigate whether these lncRNAs are also involved in inflammatory signaling in the brain. Functional enrichment of inflammatory and immune-related processes was also supported by the results of the GO enrichment analysis: in PUT, genes harboring smoking-associated CpG sites were enriched in immune cell migration pathways. In the ACC, BA9, and the VS, GO enrichment analysis revealed pathway enrichment related to neuronal development and morphogenesis. Given the direct influence of nicotine on immune cell function [35] and neuronal development [36], results from the present study may point towards an additional epigenetic effect of smoking on these cellular processes.
GWAS enrichment analysis revealed significant overrepresentation of EWAS-derived gene-sets within GWAS signals of several smoking phenotypes and SUDs. This points towards a fraction of genes detected by GWAS and EWAS contributing to the development and maintenance of tobacco smoking and SUDs. Further research needs to uncover how genetic and epigenetic mechanisms collectively contribute to the disease course in SUDs.
The well-known AHRR CpG site cg05575921 was associated with smoking in the ACC and PUT and was consistently hypomethylated in all investigated brain regions. As hypomethylation and significant association with smoking has previously also been detected for cg05575921 in blood [5], the lungs [6] and in adipose tissue [8], it may be a common locus representing smoking status across tissues.
In contrast to the strong general correlation of methylation levels between tissues, associations with smoking were heterogeneous between tissues. Blood samples might thus be limited in the extent to which they can function as a proxy for smoking-associated differential methylation in the brain. At the same time, a potential tissue specificity of associations implies that specific methylation signatures of smoking in the brain could contribute to the pathophysiology of TUD. This is also supported by the overlap of polygenic and poly-epigenetic signals identified for smoking phenotypes. Nevertheless, smoking could also have a non-specific effect on the entire organism, and further studies are needed to investigate the functional relevance of smoking-associated differential methylation within the neurocircuitry of addiction and its contribution to the development and maintenance of TUD.
Several limitations need to be addressed. First, a large fraction of tissue donors was diagnosed with AUD prior to death. Despite adjustment for AUD in the linear model, we cannot rule out residual confounding by alcohol consumption, and future studies should investigate smoking independent of other SUDs. However, a recent Mendelian randomization study has shown that a smoking-specific risk for other disorders remains after correcting for the genetic risk for alcohol consumption [37]. Second, smoking status was assessed based on next-of-kin reports, which are known to be less precise than direct measurement of the nicotine metabolite cotinine [38]. Smoking status and TUD status often overlap, but TUD was not specifically assessed in the present study. A recent EWAS of smoking status and TUD found both specific and shared methylation signals between smoking status and lifetime TUD [39], which underlines the importance of investigating both traits simultaneously. Third, given the multiple-testing burden, the EWAS of smoking in the brain is still insufficiently powered, even with our largest sample size of n = 72. At the same time, postmortem brain tissue is scarce, which makes it challenging to obtain enough samples.
The present study identified smoking-associated DNAm changes in the neurocircuitry of addiction related to immunological and neurodevelopmental processes. However, certain DNAm signatures might represent a predisposition to tobacco smoking rather than a consequence of it. Follow-up studies should thus investigate tissues of donors deceased at different stages of TUD, which may enable the identification of changes in DNAm levels during the disease course and differentiate between predispositions and consequences of smoking. The functional consequences of smoking-associated changes in DNAm levels should also be addressed in a more comprehensive design, for example using a multi-omics approach integrating methylation and transcriptomic data. Ultimately, deeper insight into methylation changes within the neurocircuitry of addiction could lay the foundation for better understanding of the pathomechanisms of tobacco use disorder.