Abstract
Background: Important gaps remain in describing the associations between variants found by GWAS and various phenotypes. Prior reports suggest that SNPs in regulatory regions should be investigated further to uncover these associations. This study therefore applies a novel approach within Pharmacoepigenomics, prompting the newly coined term “CpG-PGx SNP”. Methods: Our analysis strategy rests on SNPs that play a dual role, both disrupting or forming a CpG site and carrying PGx associations. We employed GeneCards (relevance score), PharmGKB (significant p-value), and GWAS catalog data for each gene (p < 5 × 10⁻⁸). After obtaining the 25 best-scored genes for each of four major epigenetic processes (methylation, demethylation, acetylation, and deacetylation), we generated two lists of candidates: potential CpG-PGx SNPs and possible CpG-PGx SNPs. Results: Among 2900 significant PGx annotations, we found 99 potential CpG-PGx SNPs related to 16 genes. CYP2B6, CYP2C19, CYP2D6, and COMT were the top genes. Additionally, we found 1230 significant GWAS-based SNPs, among them 329 CpG-SNPs related to 48 genes with at least one CpG site disruption/formation. The gene with the most CpG-SNPs was TET2, followed by JMJD1C and HDAC9. Importantly, we detected some synonymous variants in the Epigenetically Modifiable Accessible Region (EMAR), which can provide insights into undiscovered roles of these SNPs. We identified 173 CpG-Disruptive SNPs, 155 CpG-Forming SNPs, and just 1 CpG SNP with both impacts. Conclusions: We introduce the CpG-PGx SNP for the first time, suggest three major genes playing crucial roles in Pharmacoepigenomics (PEpGx), propose CYP2D6 as the heart of PEpGx, and identify TET2 as the gene with the highest possibility of having CpG-PGx SNPs.
We believe that this approach will help the scientific community to utilize “CpG-PGx SNP” to unravel complex disease-driven genetic and epigenetic interactions, yielding therapeutic opportunities.
1. Introduction
Numerous traits have been effectively linked to specific areas of the genome by genome-wide association studies (GWAS) []. The process by which variations impact the phenotype that they are linked to is still unclear for many of these observations []. The majority of trait-associated variations found by GWAS are thought to function by changing gene expression rather than changing the protein coding, and are found in regulatory areas of the genome []. This hypothesis is supported by the discovery of overlaps between GWAS risk variants and genomic loci influencing markers of genome regulation (like histone modifications) and enrichment of expression quantitative trait loci (eQTLs) at identified GWAS risk loci [,,,]. Therefore, combining GWAS with gene expression data is one plausible way to improve knowledge of the processes related to GWAS findings.
DNA methylation is a key process in gene regulation. As such, it is an essential intermediary molecular trait that connects genes to other macro-level phenotypes and may contribute to missing heritability []. Despite their physiological importance, the genetic drivers of DNA methylation patterns remain poorly understood. There is evidence that genetic variation at certain loci correlates with the quantitative characteristic of DNA methylation [,,,]. Additionally, previous studies discovered that genetic variants at CpG sites (meSNPs) can possibly disrupt the substrate of methylation reactions and thus, severely alter the methylation status at a single CpG site [,].
While methylation-associated single-nucleotide polymorphisms (meSNPs) have been identified in various studies, it remains unclear if meSNPs constitute a major class of methylation quantitative trait loci (meQTLs), or if they significantly influence the methylation status of nearby CpG sites [,,,]. Most meQTL studies to date have been limited by relatively small sample sizes and the use of low-resolution methylation microarrays, in which meSNPs are sparsely represented. Furthermore, many current meQTL analyses deliberately exclude probes overlapping with sequence variants to avoid confounding due to disrupted probe hybridization [,].
Pharmacoepigenetics explores the complex relationship between epigenetic modifications and pharmacological responses, emphasizing how drugs can both alter and be affected by epigenetic mechanisms [,]. Gaining insight into these epigenetic changes is essential in pharmacology, as it could help optimize drug efficacy, reduce adverse reactions, and drive forward the progress of personalized medicine. As a multidisciplinary and continuously evolving field, pharmacoepigenetics merges pharmacology, epigenetics, and other life sciences to shape innovative therapeutic strategies and uncover new drug targets []. Alongside pharmacogenomics—a pharmacological sub-discipline focused on genetic variability in drug response—Pharmacoepigenomics has emerged as a key area of interest. It concentrates on epigenetic therapy, the impact of epigenetic regulation on pharmacokinetics, and its implications in adverse drug reactions [].
Finally, both pharmacoepigenetics and pharmacogenomics play a vital role in advancing personalized medicine by shedding light on the intricate relationships between genes, epigenetic mechanisms, and drug responses []. Here, we introduce an innovative idea to test the Genomic-Epigenomic-Phenomic-Pharmacogenomics (G-E-Ph-PGx) axis by employing CpG-PGx SNPs (whose PGx roles are known) and possible CpG-PGx SNPs.
2. Materials and Methods
2.1. General Design
First, we considered the four major epigenetic processes: methylation, demethylation, acetylation, and deacetylation. We then searched for the best-scored genes in each process according to the GeneCards relevance score (access date: 1 July 2025, https://www.genecards.org/) [] and selected the 25 best-scored genes for each of the 4 processes. Second, we checked each of the resulting 100 top genes in PharmGKB (access date: 1 July 2025, now ClinPGx, available at: https://www.clinpgx.org/) [] to see whether it had at least one significant PGx annotation. Third, each PGx variant was checked for possible categorization as a CpG-PGx SNP. To accomplish this, we used the Ensembl genome browser with reference genome GRCh38; more specifically, we manually searched the rsID of each SNP in Ensembl (Ensembl release 114, corresponding to Ensembl Genomes release 61). We checked the major allele together with the nucleotides immediately before and after it to determine its position relative to a CpG dinucleotide, and then checked the minor allele in the same way. If the major allele lay within a CpG dinucleotide that the minor allele would abolish, the SNP was considered a CpG-disrupting SNP; if the minor allele created a new CpG dinucleotide, the SNP was considered a CpG-forming SNP. Ensembl was also used to determine SNP functions, including exonic (missense or synonymous), intronic (regulatory), upstream (5′UTR), and downstream (3′UTR). Notably, the classification of the regions containing the SNPs, such as Epigenetically Modifiable Accessible Region (EMAR), Promoter, Enhancers, and CTCF Binding Sites (CBS), was obtained directly from Ensembl (access date: 1 July 2025, https://www.ensembl.org/index.html); these regions were displayed automatically when zooming in the sequence viewer.
To identify new CpG-PGx SNPs with our hypothesis-generating approach, we searched the GWAS catalog (access date: 1 July 2025, https://www.ebi.ac.uk/gwas/home) [] for the genes that had no significant PGx annotation. Finally, we checked the candidate SNPs (selected by best p-values) for possible categorization as CpG SNPs. These CpG SNPs are designated in Section 3 as new CpG-PGx SNPs. The whole process followed a hierarchical flow that captured any potential role of SNPs with PGx annotations, whether in epigenetic processes or as pharmacoepigenetic factors (Figure 1).
Figure 1.
Pharmacoepigenomics summarized in a hierarchical flow. The main epigenetic processes (*) are methylation, demethylation, acetylation, and deacetylation. Numbers 1 and 2 refer to the two plausible routes for delineating a CpG-PGx SNP; route 1 is more straightforward than route 2, but route 2 can also introduce new CpG-PGx SNP(s) that route 1 cannot. GeneCards, PharmGKB, and the GWAS catalog are employed in this design.
Using GeneCards, we included the 100 best-scored genes across 4 epigenetic processes: methylation, demethylation, acetylation, and deacetylation (the top 25 genes of each). All included genes were protein-coding, because the drug interactions observed in real-world findings predominantly involve proteins. This restriction provided higher confidence in the new CpG-PGx SNP(s) that we introduce for future confirmation. We built our strategy on PharmGKB, a well-known dataset whose main pillars are CPIC and DPWG. Finally, to further support the new CpG-PGx SNPs, we queried the GWAS catalog for each gene of interest and refined the candidate SNPs based on its classified data.
2.2. Statistical Analysis
Following our analysis strategy, we prioritized and filtered genes, PGx annotations, and CpG-SNPs according to several statistical scores. First, we used GeneCards data to determine the top genes in 4 epigenetic processes (methylation, demethylation, acetylation, and deacetylation) based on the Relevance score, which is computed with Elasticsearch 7.11. The theory behind relevance scoring is that Lucene (and thus Elasticsearch) uses the Boolean model to find matching documents and a formula termed “the practical scoring function” to compute relevance. This formula borrows concepts from term frequency/inverse document frequency and the vector space model, and adds more modern features such as field-length normalization, a coordination factor, and term/query clause boosting. Importantly, supplementary boosting is applied to annotations including the Symbol, Aliases, and Descriptions; Accessions for the major bioinformatics databases (NCBI, Ensembl, SwissProt); Molecular function(s); Gene Summaries; Variants with Clinical Significance; and Elite disorders. We also employed PharmGKB, a pharmacogenomics resource that incorporates clinical data, including clinical guidelines and medication labels, potentially clinically actionable gene–drug associations, and genotype–phenotype relationships. PharmGKB collects, curates, and disseminates knowledge about the effect of human genetic variation on drug response. It does so by annotating genetic variants and gene–drug–disease relationships through literature review; summarizing important pharmacogenomic genes, associations between genetic variants and drugs, and drug pathways; and curating FDA drug labels that contain pharmacogenomic data. The main filtering step in PharmGKB retained only PGx annotations with a significant p-value (p < 0.05).
Finally, we mined the genes of the primary list that remained after step 1 in the GWAS catalog and adjusted for the False Discovery Rate (FDR). Moreover, we applied the critical threshold of p < 5 × 10⁻⁸. We thus included only the most significant GWAS-based SNPs in the current study, to strengthen the validity of our predictions and keep the candidates close to what future real-world findings may confirm.
3. Results
As described in the Materials and Methods section, the aim of this study is to provide new prospects for personalized medicine by advancing PGx approaches. We believe that SNPs, as the smallest genetic building blocks, can have a major impact by playing multiple roles and can induce important changes through additive functions (SNP-SNP interactions) []. Because Pharmacoepigenomics can be elucidated through CpG-SNPs that also have PGx roles, we examined the primary genes (100 genes across 4 major epigenetic processes) in greater detail.
Initially, we obtained only the best-scored protein-coding genes from GeneCards for each of the 4 epigenetic processes, and then searched each of them in PharmGKB. We thereby separated the primary genes with at least 1 significant PGx annotation from the genes with no PGx annotation archived in PharmGKB. This separation corresponds to the two possible routes for finding CpG-PGx SNPs shown in Figure 1. Note that a unique SNP may have one or more PGx annotations, where a PGx annotation refers to a Variant-Drug-Association.
3.1. Potential CpG-PGx SNPs
Table 1 summarizes the primary genes with epigenetic impact that have at least one significant PGx annotation in PharmGKB. Of the 100 primary genes, 22 unique genes had significant PGx annotation(s); notably, TP53, HDAC1, KAT2B, and SIRT1 appeared in more than one process. TP53 is involved in methylation, acetylation, and deacetylation, whereas HDAC1, KAT2B, and SIRT1 are involved in acetylation and deacetylation. The top Pharmacogene in Table 1 was CYP2C19 with 949 significant PGx annotations, followed by CYP2D6 and CYP2B6 with 733 and 383 significant annotations, respectively. Interestingly, COMT ranked seventh with 121 PGx annotations. Next, we searched each annotation for a potential CpG-PGx SNP; Table 1 reports this in a separate column. By number of CpG-PGx SNPs, the top gene was CYP2B6 (23), followed by CYP2C19 (21) and CYP2D6 (18). Remarkably, all three genes are involved in the demethylation process, while COMT (11 CpG-PGx SNPs) was the top-scored gene among those involved in the methylation process. Overall, 16 of the 22 genes had potential CpG-PGx SNPs (Table 1).
Table 1.
Primary genes with significant PGx annotation(s) based on the PharmGKB database and their potential CpG-PGx SNP(s).
Thirdly, we examined the function of each CpG-PGx SNP (Missense, Synonymous, Intronic, Splicing, 3′UTR, 5′UTR, or located in a regulatory region, e.g., Enhancer). To do this, we checked the major and minor alleles of each CpG-PGx SNP in the Ensembl Genome Browser to determine whether the allelic change forms or disrupts a CpG site. This check is essential before a CpG-PGx SNP is proposed for further investigation. Specifically, CpG site formation is essentially hidden in the Genome Browser, because the displayed sequence carries the major allele and therefore shows no CpG at that position. It occurs when the minor allele is a C that replaces X in an XpG dinucleotide (where X can be A, T, or G) or a G that replaces Y in a CpY dinucleotide (where Y can be A, C, or T). Conversely, CpG site disruption results from a SNP at either the C or the G of a CpG dinucleotide, which yields an ApG, TpG, GpG, CpA, CpT, or CpC dinucleotide. Table 2 lists each CpG-PGx SNP (by known rsID) with its related gene, function, and CpG site status. In total, we found 99 CpG-PGx SNPs: 61 were missense, 25 Intronic, 4 3′UTR, 4 in a regulatory feature, 3 Synonymous, 1 Splicing, and 1 Frameshift (Table 2). CYP2D6 showed the widest range of variant types, including missense, intronic, splicing, frameshift, and missense/inframeshift variants.
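The dinucleotide rule described above can be written as a short check. The following Python sketch is illustrative only: the function name `classify_cpg_snp` and its four single-base arguments (the base on each side of the SNP plus the major and minor alleles) are assumptions made here and are not part of the manual Ensembl workflow used in the study. The sequence contexts in the usage lines are hypothetical and are not taken from Table 2.

```python
def classify_cpg_snp(left: str, major: str, minor: str, right: str) -> str:
    """Classify a SNP by its effect on CpG dinucleotides.

    left/right are the bases immediately 5' and 3' of the SNP; major/minor
    are the two alleles. All four bases must be read on the same strand.
    Returns 'disruptive', 'forming', 'both', or 'none'.
    """
    left, major, minor, right = (b.upper() for b in (left, major, minor, right))
    for base in (left, major, minor, right):
        if base not in {"A", "C", "G", "T"}:
            raise ValueError(f"unexpected base: {base!r}")

    def cpg_sites(allele: str) -> set:
        """Return which neighbouring CpG sites exist when this allele is present."""
        sites = set()
        if left + allele == "CG":   # allele is the G of an upstream CpG
            sites.add("upstream")
        if allele + right == "CG":  # allele is the C of a downstream CpG
            sites.add("downstream")
        return sites

    lost = cpg_sites(major) - cpg_sites(minor)    # CpG present only with the major allele
    gained = cpg_sites(minor) - cpg_sites(major)  # CpG present only with the minor allele
    if lost and gained:
        return "both"
    if lost:
        return "disruptive"
    if gained:
        return "forming"
    return "none"


# Hypothetical sequence contexts (assumptions for illustration).
print(classify_cpg_snp("C", "G", "A", "T"))  # CpG becomes CpA
print(classify_cpg_snp("A", "T", "C", "G"))  # TpG becomes CpG
print(classify_cpg_snp("C", "G", "C", "G"))  # one CpG lost, a neighbouring CpG gained
```

Because the reverse complement of CG is again CG, the result does not depend on which strand is read, provided the flanking bases and both alleles are taken from the same strand.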
Table 2.
Details of CpG-PGx SNPs found in 22 primary genes, highlighting their functions and CpG site formation/disruption.
3.2. The Heart of Pharmacoepigenomics?
Based on well-established data in the PGx literature, CYP2D6 is involved in the metabolism of almost 25% of commonly used drugs [], and here, following intense investigation, we suggest it as the heart of Pharmacoepigenomics, although this should be investigated further. Our findings, however, identified CYP2B6 as the top gene by number of CpG-PGx SNPs. To compare the three best-scored genes (CYP2B6, CYP2C19, and CYP2D6) more precisely, we considered seven factors: relevance score (from GeneCards), number of significant PGx annotations (from PharmGKB), number of CpG-PGx SNPs (Table 1), number of variant types (based on the SNP functions in Table 2), number of CpG site formations, number of CpG site disruptions, and number of PubMed-indexed papers with the gene in their titles (Table 3). CYP2D6 was best on 4 of the 7 factors: the highest relevance score (10.74248), the most variant types (5), the highest number of CpG site formations (10), and the most PubMed-indexed papers (2658 papers with the gene in their titles). Accordingly, we suggest that the other two genes (CYP2C19 and CYP2B6) are potential candidates for hub genes of Pharmacoepigenomics.
Table 3.
Comparison among the three best-scored genes having CpG-PGx SNPs for finding the heart of Pharmacoepigenomics.
3.3. Putative CpG-PGx SNPs
Delving into deeper layers of PEpGx, we next extracted the remaining genes, which had no significant PGx annotation. By mining the data for these genes in the GWAS catalog, we retained the significant SNPs with a p-value lower than 5 × 10⁻⁸ and a Minor Allele Frequency (MAF) higher than 0.05. Finally, we checked the resulting SNPs in the Ensembl Genomic Region browser (access date: 1 July 2025, https://www.ensembl.org/index.html?redirect=no) to determine whether they (1) form a new CpG site (CpG-Forming) or (2) disrupt an existing CpG site (CpG-Disruptive) (Figure 2).
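The two numerical filters described above amount to a simple record filter. The sketch below is a minimal illustration, not the procedure used in the study: the `GwasHit` record, the `filter_hits` helper, and the example rows are all hypothetical, and treating both bounds as strict inequalities is an assumption based on the wording "lower than" and "higher than".

```python
from dataclasses import dataclass

P_THRESHOLD = 5e-8    # significance threshold stated in the text
MAF_THRESHOLD = 0.05  # minor allele frequency cut-off stated in the text


@dataclass(frozen=True)
class GwasHit:
    """One GWAS association; the field names are illustrative assumptions."""
    rsid: str
    gene: str
    p_value: float
    maf: float


def filter_hits(hits):
    """Keep associations with p < 5e-8 and MAF > 0.05 (strict bounds assumed)."""
    return [h for h in hits if h.p_value < P_THRESHOLD and h.maf > MAF_THRESHOLD]


# Made-up records; the identifiers are placeholders, not real rsIDs or genes.
hits = [
    GwasHit("rsA", "GENE1", 3e-9, 0.31),   # passes both filters
    GwasHit("rsB", "GENE1", 2e-7, 0.40),   # fails the p-value filter
    GwasHit("rsC", "GENE2", 1e-12, 0.01),  # fails the MAF filter
]
kept = filter_hits(hits)
print([h.rsid for h in kept])
```

Whether a SNP with MAF exactly equal to 0.05 should be retained depends on how the reported frequencies were rounded, so the comparison operator is the one line worth adjusting for a real dataset.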
Figure 2.
Schematic overview of our basic methodology for finding CpG-SNPs and classifying them as CpG-forming or CpG-disrupting. In this illustration, we use rs4680, the well-known SNP of the COMT gene, to show the possibilities. In the first step, we searched each SNP of interest (here rs4680) in the Ensembl genome browser and checked whether its sequence context places it in a CpG site; if the SNP was not in a CpG site, we further checked whether the minor allele would form a new CpG site. The figure also indicates that methylation-associated enzymes act on the methyl group (CH3) of the CpG site, and that this process may influence the binding affinities of trans-regulatory elements; all of these biological processes can potentially be interrupted when an allele change (here, major to minor) disrupts the CpG site.
The final putative CpG-SNPs were checked for PGx associations to confirm whether each could be categorized as a CpG-PGx SNP. This is the second route described in Figure 1. According to the results in Table 4, for the remaining 69 genes (some of which were involved in more than one epigenetic process), we mined 1,230 significant GWAS associations (or SNPs), which revealed 329 CpG-SNPs related to 48 genes with at least one CpG site formation or disruption. The gene with the most CpG-SNPs was TET2 (42 CpG-SNPs), followed by JMJD1C (35 CpG-SNPs) and HDAC9 (26 CpG-SNPs). Interestingly, demethylation was not only the process of the top-ranked gene but was also represented in the second, third, fourth, and fifth places. The next most represented process was methylation, led by GRIN2A (13 CpG-SNPs).
Table 4.
Results of searching for CpG site formation/disruption among the remaining genes based on GWAS catalog associations.
In the next step, we separated the CpG-Disruptive SNPs from the CpG-Forming SNPs. In total, we found 173 CpG-Disruptive SNPs, 155 CpG-Forming SNPs, and just 1 CpG SNP with both disruptive and forming impact (such a SNP can lie between 2 potential CpG sites, disrupting one and forming the other as a new CpG site). The example we found was the intronic SNP rs34770920 of the ACAA2 gene. Furthermore, we found an interesting epigenetic impact in synonymous SNPs, which agrees with our result in Section 3.1. More specifically, we found some CpG-SNPs in the EMAR, such as rs10849885 (KDM2B; synonymous; MAF: 0.5; CpG-Disruptive SNP), rs1667619 (TET3; synonymous; MAF: 0.47; CpG-Disruptive SNP), rs601999 (NAGLU; synonymous; MAF: 0.29; CpG-Forming SNP), and rs591939 (NAGLU; synonymous; MAF: 0.18; CpG-Forming SNP). We hereby propose that these synonymous CpG-SNPs do not change the amino acid sequence or the protein structure (having no visible impact on the “Human Genome”), yet they still form or disrupt a CpG site and may therefore be involved in regulatory mechanisms.
Finally, in the last step, we classified both CpG-Disruptive SNPs and CpG-Forming SNPs by their MAFs. Here it is worth noting the undeniable role of statistical and epidemiological genetics, which defines the likelihood that an individual or a given population carries a SNP. In Table 5 and Table 6, all CpG-SNPs are presented from the most to the least common, along with their functions. In this regard, there were nine CpG-Disruptive SNPs, including one with the MAF of 0.5 (the highest prevalence) and four CpG-Disruptive SNPs with the MAF of 0.05 (the lowest prevalence).
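The ordering used for these tables (separate disruptive and forming lists, each sorted from most to least common) can be sketched as follows. The four records are the synonymous EMAR CpG-SNPs quoted in the text with their reported MAFs; the tuple layout and the `rank_by_maf` helper are assumptions introduced for illustration only.

```python
from collections import defaultdict

# (rsID, gene, CpG effect, MAF) as quoted in the text for synonymous EMAR SNPs
snps = [
    ("rs601999", "NAGLU", "forming", 0.29),
    ("rs10849885", "KDM2B", "disruptive", 0.50),
    ("rs591939", "NAGLU", "forming", 0.18),
    ("rs1667619", "TET3", "disruptive", 0.47),
]


def rank_by_maf(records):
    """Group records by CpG effect and sort each group by descending MAF."""
    grouped = defaultdict(list)
    for rsid, gene, effect, maf in records:
        grouped[effect].append((rsid, gene, maf))
    return {
        effect: sorted(rows, key=lambda row: row[2], reverse=True)
        for effect, rows in grouped.items()
    }


ranked = rank_by_maf(snps)
for effect, rows in ranked.items():
    print(effect)
    for rsid, gene, maf in rows:
        print(f"  {rsid}\t{gene}\tMAF={maf:.2f}")
```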
Table 5.
List of possible CpG-SNPs with the highest priorities leading to disruption of a CpG site based on the remaining genes of GWAS mining.
Table 6.
List of possible CpG-SNPs with the maximum priorities leading to the formation of a CpG site (novel CpG site) according to the remaining genes of GWAS mining.
The most prevalent CpG-Disruptive SNPs were rs2984348 (HDAC8; Enhancer), rs13245206 (HDAC9; Intronic), rs10237149 (HDAC9; Intronic), rs6951745 (HDAC9; Intronic), rs10849885 (KDM2B; Synonymous/EMAR), rs12001316 (KDM4C; Intronic), rs3814177 (TET1; 3′UTR), rs9884984 (TET2; Intronic), and rs6533183 (TET2; Intronic) (Table 5 and Supplementary Table S1).
Accordingly, in the group of CpG-Forming SNPs, we found 13 CpG-SNPs with the highest prevalence (MAF = 0.5) and 3 CpG-SNPs with the lowest prevalence (MAF = 0.05). The most prevalent CpG-Forming SNPs included rs1931537 (AR; 3′UTR), rs2116942 (DNMT1; Missense), rs1935 (JMJD1C; Missense), rs7962128 (KDM2B; 3′UTR), rs6489811 (KDM2B; Intronic), rs2613766 (KDM4B; Intronic), rs7042372 (KDM4C; Intronic/EMAR/Enhancer), rs960658 (KDM4C; Intronic), rs7037266 (KDM4C; Intronic), rs5969750 (RBBP7; 3′UTR), rs7670522 (TET2; 3′UTR), rs9884296 (TET2; Intronic), and rs5952279 (KDM6A; Intronic) (Table 6 and Supplementary Table S2).
4. Discussion
To the best of our knowledge, this is the first hypothesis-generating approach to introduce the CpG-PGx SNP as a multi-dimensional candidate in a Genomic-Epigenomic-Phenomic-Pharmacogenomics (G-E-Ph-PGx) axis. G-E-Ph-PGx suggests a complicated network of regulatory-functional interactions that starts from the smallest genetic block (the SNP) and extends to the broader cellular and molecular interplay leading to known and unknown phenotypes, which, in turn, are linked to pharmacological interactions and treatments. Briefly, G-E-Ph-PGx represents a new aspect of personalized medicine based on the disruption or formation of a CpG site by an allele change in a SNP. This phenomenon may help explain the trans-regulation processes in which these CpG sites present or remove possible epigenetic tags for Methylation/Demethylation reactions.
We designed a logical and comprehensive analysis strategy based on well-known and documented lists of genes in all four classical epigenetic processes: methylation, demethylation, acetylation, and deacetylation. In the current investigation, we mined CpG sites for the genes involved in methylation/demethylation as well as for those involved in acetylation and deacetylation. We selected 100 genes and, after removing duplications (some genes, like TP53, were present in more than one epigenetic process), 91 unique genes remained. We followed two routes: searching for and introducing potential CpG-PGx SNPs, and identifying possible CpG-SNPs that may later be confirmed as CpG-PGx SNPs. We found 3 major genes with the highest number of potential CpG-PGx SNPs: CYP2B6, CYP2C19, and CYP2D6. Among them, we propose CYP2D6 as the heart of Pharmacoepigenomics. Finally, after a deep search based on GWAS data, we found TET2 to be the top-scored candidate for future PGx confirmation according to its number of possible CpG-SNPs.
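The de-duplication step (100 list entries reduced to 91 unique genes) can be illustrated with a small sketch. The lists below are deliberately truncated and contain only genes whose process membership is stated in the text; the full 25-gene lists are not reproduced here, and the `gene_membership` helper is a hypothetical name introduced for this example.

```python
# Truncated, illustrative lists: only genes whose process is named in the text.
process_genes = {
    "methylation": ["TP53", "COMT", "GRIN2A"],
    "demethylation": ["CYP2B6", "CYP2C19", "CYP2D6"],
    "acetylation": ["TP53", "HDAC1", "KAT2B", "SIRT1"],
    "deacetylation": ["TP53", "HDAC1", "KAT2B", "SIRT1"],
}


def gene_membership(lists_by_process):
    """Map each unique gene to the processes in which it appears."""
    membership = {}
    for process, genes in lists_by_process.items():
        for gene in genes:
            membership.setdefault(gene, []).append(process)
    return membership


membership = gene_membership(process_genes)
total_entries = sum(len(genes) for genes in process_genes.values())
shared = {gene: procs for gene, procs in membership.items() if len(procs) > 1}

print(f"{total_entries} list entries, {len(membership)} unique genes")
for gene, procs in sorted(shared.items()):
    print(gene, "appears in", ", ".join(procs))
```

Keeping the per-gene process membership, rather than collapsing the lists into a plain set, preserves the information needed to report which genes span several epigenetic processes.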
Some studies refer to CpG-SNP(s) directly in their titles (26 papers in PubMed) or in their abstracts (20 papers in PubMed). The PubMed-indexed papers with CpG-SNPs in their titles can be divided into three main categories: neuropsychological disorders, such as suicidal behavior in subjects with schizophrenia [], psychosis [], and major depressive disorder []; metabolic disorders, including type 2 diabetes [,] and obesity [,]; and cancer biology [].
Pharmacoepigenetics and Pharmacoepigenomics revealed a better resulting outcome compared with CpG-SNPs in the literature. We found 24 papers in PubMed with the pharmacoepigenetics or Pharmacoepigenomics linked in their titles. Interestingly, similar to the aforementioned 3 major categories, these papers focused on the same categories. Montagna was one of the first scientists who discussed the epigenetic and pharmacoepigenetic processes in primary headaches and pain []. Leach et al. reviewed pharmacoepigenetics in heart failure and cardiovascular disease (CVD) and concluded that, because epigenetics has a vital role in shaping phenotypic variation in health and disease, understanding and manipulating the epigenome has a massive capacity for the treatment and prevention of common human diseases []. In the context of cancer, Candelaria et al., with an emphasis on gemcitabine, reviewed an update of genetic and epigenetic bases that might account for inter-individual variations in therapeutic results []. Accordingly, Nasr et al. studied pharmacoepigenetics in breast cancer [], Fornaro et al. reviewed pharmacoepigenetics in gastrointestinal cancer [], and Gutierrez-Camino et al. reported on pharmacoepigenetics in childhood acute lymphoblastic leukemia []. In a meta-analysis, Chu and Yang systematically studied the population diversity impact of DNA methylation on the treatment response and drug ADME in various tissues and cancer types. They concluded that ethnicity should be cautiously considered for future pharmacoepigenetics explorations []. Notably, Nuotio et al. performed a genome-wide methylation analysis of responsiveness to four classes of antihypertensive drugs in the pharmacoepigenetics of hypertension [].
The last and most important topic in pharmacoepigenetics is psychological and behavioral phenotypes, such as generalized anxiety disorder [], Alzheimer’s disease [], depression [], and opioid addiction [].
Epigenetic variants have been found near genes and gene regulators that control drug metabolism, suggesting a role for epigenetic mechanisms in modulating pharmacokinetics and pharmacodynamics [,,]. Pharmacoepigenetics studies how epigenetic variability affects variability in drug response []. Of note, the idea of Smith et al. is consistent with our standpoint. They stated that, first, variation in epigenetic markers can be detected; second, key epigenetic biomarker(s) can be chosen in regions of variance; and third, these biomarker(s) can be mapped to a drug response phenotype []. This sequence agrees with our initial idea of a G-E-Ph-PGx axis.
Because TET2 was the top gene in our analysis, it is worth noting that it is a key player in epigenetics, hematopoiesis, and cancer biology. Its full name is Tet methylcytosine dioxygenase 2, and it is located at chromosome 4q24. TET2 belongs to the TET family of enzymes, which convert 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and thereby play a role in DNA demethylation and epigenetic regulation. Specifically, TET2 is involved in the regulation of gene expression, stem cell differentiation (especially in hematopoiesis, the formation of blood cells), immune system regulation, and epigenetic reprogramming during development. Interestingly, mutations in TET2 are somatic (acquired) and are commonly found in (1) myeloid malignancies such as myelodysplastic syndromes (MDS), acute myeloid leukemia (AML), chronic myelomonocytic leukemia (CMML), and myeloproliferative neoplasms (MPNs); and (2) lymphoid cancers such as Angioimmunoblastic T-cell lymphoma (AITL) and Peripheral T-cell lymphoma (PTCL). TET2 mutations are also among the most common in Clonal Hematopoiesis of Indeterminate Potential (CHIP), a condition in which aging individuals develop hematopoietic clones without full-blown cancer but with an increased risk of cardiovascular disease and leukemia. Clinically, TET2 mutations may signal different outcomes depending on the disease context. TET2-mutant cancers may respond differently to hypomethylating agents (such as azacitidine or decitabine). Vitamin C (ascorbate) has been studied preclinically as a way to enhance TET activity and DNA demethylation in TET2-deficient cells. TET2 mutations often co-occur with others (e.g., ASXL1, DNMT3A, IDH2), affecting disease progression and treatment [,].
Limitations
The current study has some limitations that should be considered in similar future investigations. First, we used GeneCards data, which may be updated as novel findings appear in the literature. Second, the number of included genes was limited, and future investigations could generate an expanded primary gene list. Importantly, in vitro, in vivo, and clinical validations are vital for testing our hypothesis-generating approach. More specifically, in vitro validation can rely on expression and regulatory experiments; in vivo validation can be designed around knockdown/knockout of CpG-PGx SNPs in an animal of interest (mouse, rat, rabbit) while monitoring the drug’s effects on the living body; and clinical validation can be carried out in individuals receiving specific drugs who undergo epigenomic or epigenetic molecular testing of the suggested CpG-PGx SNP(s). All of these validations can be extended to trans-regulation interactions of both potential CpG-PGx SNPs and possible CpG-PGx SNPs; for example, a CpG-forming SNP should be verified for its new positive/negative binding affinities. We believe it is plausible that CpG islands could help predict CpG-PGx SNPs in future investigations. Both clinical and real-world confirmations are highly recommended for validating our findings.
5. Conclusions
In conclusion, pharmacoepigenetics can provide novel insights into PGx approaches and describe complicated mechanisms involved in personalized treatment options. CpG-PGx SNPs, as a conceptual framework that invites further empirical testing, may represent potential biomarkers in PGx and epigenomics, which requires further confirmation by real-world clinical findings. Based on our data, we recommend that the scientific community intensively investigate the top-scored genes reported in the current study, such as CYP2B6, CYP2D6, CYP2C19, and TET2, in psychiatric and other related phenotypes. Additionally, we identified some synonymous PGx SNPs that may be involved in CpG site disruption/formation, offering novel clues to their impact on PGx (potential CpG-PGx SNPs). We further found other synonymous CpG-SNPs in the EMAR, which supports our primary results and highlights the undiscovered roles of synonymous SNPs in regulatory mechanisms rather than in functional alterations of protein structure.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jpm15120579/s1, Table S1: Supplementary list of possible CpG-SNPs leading to disruption of a CpG site, based on the remaining genes from GWAS mining; Table S2: Complementary list of possible CpG-SNPs with the maximum priorities leading to formation of a CpG site (novel CpG site), based on the remaining genes from GWAS mining.
Author Contributions
Conceptualization, A.S. and K.B.; methodology, A.S.; software, A.S.; validation, A.S. and K.B.; formal analysis, A.S.; investigation, A.S.; resources, A.S.; data curation, A.S.; writing – original draft preparation, A.S., K.B. and K.-U.L.; writing – review and editing, A.S., K.B., K.-U.L., I.E., B.S.F., D.B., A.P., P.K.T., R.K.A.F., S.L.S., E.L.G., M.P.L., A.P.L.L. and M.S.G.; visualization, A.S.; supervision, A.S. and K.B.; project administration, A.S. All authors have read and agreed to the published version of the manuscript.
Funding
R21 DA045640/DA/NIDA NIH HHS/United States, I01 CX002099/CX/CSRD VA/United States, R33 DA045640/DA/NIDA NIH HHS/United States, R41 MD012318/MD/NIMHD NIH HHS/United States, I01 CX000479/CX/CSRD VA/United States.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Any further data will be made available by the corresponding author upon reasonable request via email (alirezasharafshah@yahoo.com).
Conflicts of Interest
Kenneth Blum is the inventor of both GARS and KB220, which have been assigned to TranspliceGen Holdings, Inc., Texas, USA. There are no other conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| GWAS | Genome-Wide Association Studies |
| PGx | Pharmacogenomics |
| PEpGx | Pharmacoepigenomics |
| SNP | Single Nucleotide Polymorphism |
| EMAR | Epigenetically Modifiable Accessible Region |
| meQTLs | methylation quantitative trait loci |
| G-E-Ph-PGx | Genomic-Epigenomic-Phenomic-Pharmacogenomics |
| FDR | False Discovery Rate |
| MAF | Minor Allele Frequency |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).