DNA Modifications: Function and Applications in Normal and Disease States

Epigenetics refers to a variety of processes that have heritable effects on gene expression programs without changes in DNA sequence. Key players in epigenetic control are chemical modifications to DNA, histone, and non-histone chromosomal proteins, which establish a complex regulatory network that controls genome function. Methylation of DNA at the fifth position of cytosine in CpG dinucleotides (5-methylcytosine, 5mC), which is carried out by DNA methyltransferases, is commonly associated with gene silencing. However, high resolution mapping of DNA methylation has revealed that 5mC is enriched in exonic nucleosomes and at intron-exon junctions, suggesting a role of DNA methylation in the relationship between elongation and RNA splicing. Recent studies have increased our knowledge of another modification of DNA, 5-hydroxymethylcytosine (5hmC), which is a product of the ten-eleven translocation (TET) proteins converting 5mC to 5hmC. In this review, we will highlight current studies on the role of 5mC and 5hmC in regulating gene expression (using some aspects of brain development as examples). Further the roles of these modifications in detection of pathological states (type 2 diabetes, Rett syndrome, fetal alcohol spectrum disorders and teratogen exposure) will be discussed.


Introduction
Some date the discovery of DNA to 145 years ago, although the more widely recognized discovery came 70 years ago through the work of Avery, MacLeod, McCarty, Watson, and Crick [1]. Since then, over a million works have been published on the subject of DNA, which is the instruction manual of life [2]. DNA encodes our genes (also known as the genotype) and was completely sequenced by the "Human Genome Project" as of 2003 [3]. Genes, however, are not always expressed: DNA is not always transcribed into ribonucleic acid (RNA), and RNA is not always translated into proteins. This complex network of gene regulation can partly be explained by epigenetics. The term was first coined in 1942 by Conrad Waddington, who defined it as "causal interactions between genes and their products which bring the phenotype into being". Epigenetics represents the many inheritable chemical "marks" found on and surrounding the genome that influence gene expression in a controlled and selective manner. DNA methylation and histone post-translational modifications (PTMs) are two epigenetic mechanisms involved in the regulation of gene expression [4]. DNA methylation was initially found in prokaryotic DNA, where it has been extensively studied since the 1960s. In 1977, 5-methylcytosine (5mC) was identified in eukaryotic DNA by Razin and Cedar [5][6][7]. Since their discovery, close to 33,000 works have been published on DNA methylation, forever changing genetic and chromatin-based research [8].
Epigenetic changes play a major role in biological processes at the level of chromatin structure and organization. DNA is packaged in the nucleus as a 10 nm diameter nucleosomal fiber (chromatin) and in higher-order chromatin structures, with the basic repeating unit being the nucleosome, in which DNA is wrapped around a core of histone proteins. Access to DNA by various cellular machineries is granted by a series of epigenetic processes that change the higher-order structure of chromatin. These epigenetic processes often entail dynamic chemical modifications to DNA or to the histones. Chemical modifications of histones include enzymatic methylation, acetylation, phosphorylation, ubiquitination, and sumoylation [9]. The addition of an acetyl group on a histone (predominantly on the N-terminal tail) is performed by lysine acetyltransferases (KATs), whereas its removal is performed by histone deacetylases (HDACs). Acetylation of histones generally allows for an "open" state of chromatin (termed euchromatin), exposing DNA and allowing transcription to occur. Methylation of histones may be either an activating or a silencing mark, depending on the specific amino acid affected. Interestingly, there is "cross-talk" among the histone PTMs and the DNA methylation/demethylation machinery, which is mediated by proteins that bind to the proteins involved in these modifications. For example, histone H3 trimethylated at lysine 4 (H3K4me3; an active mark) located at the 5' end of genes binds and recruits the tumor suppressor inhibitor of growth protein 1b, which interacts with growth arrest and DNA damage protein 45a (GADD45) [10]. GADD45 mediates gene-specific DNA demethylation by recruiting DNA repair enzymes. The trimethylated H3K4 mark also recruits KATs, which results in nucleosomes with this mark having an increased steady-state level of histone acetylation and a reduced level of 5mC [11,12].
Collectively, DNA methylation is a critical player in many biological aspects, which are not limited to gene expression regulation. The knowledge on DNA methylation is rapidly advancing, and a thorough understanding of its diverse functions is required to decipher the mechanisms by which alterations in DNA methylation contribute to human development and diseases. In this review we summarize several important topics in relation to the role of DNA methylation in biology and human diseases. In the second section of the review we will discuss how DNA methyl modifications are generated and the relative locations of these modifications within the genome. In the third section, the role of DNA methylation and the role of two important DNA methylation-associated epigenetic modulators, MeCP2 and CCCTC-binding factor (CTCF), in different biological processes will be discussed. In the fourth section, we will discuss the different types of methyl binding proteins (MBPs), their role in DNA methylation-related biological processes, and how alterations in DNA methylation and MBPs contribute to human diseases. This section will be followed by a detailed discussion on the effects of teratogens on DNA methylation in the brain, as well as the association of DNA methylation and neurodevelopmental disorders. In the final section, we will briefly evaluate the potential of DNA methylation as a biomarker for human diseases.

Five-Methylcytosine: "The Fifth Base"
DNA methylation is the focus of this review and represents "the inheritable changes in gene expression that do not involve DNA sequence changes" [4]. DNA methylation plays biological roles in genomic imprinting, reprogramming and stability, cellular differentiation, X-chromosome inactivation (XCI), transposon silencing, RNA splicing, and DNA repair [13][14][15]. The family of enzymes responsible for the covalent addition of the methyl group on the fifth carbon of cytosine (considered the "Fifth Base") is the DNA methyltransferases (DNMTs), with the methyl group donor generally being S-adenosyl methionine (SAM) (Figure 1). DNMT1, which is found in undifferentiated and differentiated cells, contributes to the maintenance of DNA methylation during cellular replication, when both strands of DNA are separated and copied. DNMT1 preferentially binds to hemi-methylated DNA and copies the DNA methylation of the parental strand to the newly replicated strand [16][17][18]. The de novo DNMTs are DNMT3A, DNMT3B and co-factor DNMT3L (DNA methyltransferase-like protein), which methylate DNA during embryogenesis and in differentiated cells, and are highly expressed in embryonic stem cells (ESCs). DNMT3B is essential for early embryogenesis, while DNMT3A is needed late in embryogenesis and is found in differentiated cells. DNMT3L lacks a catalytic domain and is proposed to play the role of co-factor, enabling the de novo methylation function of DNMT3A and DNMT3B [16]. Because epigenetic marks are inheritable, the classic DNA methylation model labels DNMT1 as the enzyme which transfers the established DNA methylation patterns from the parental strand onto the daughter strands, therefore allowing the inheritance to occur and be subsequently maintained [18]. When DNMT1 is absent from the nucleus and does not carry out this process, the resulting loss of methylation is termed "passive demethylation" [15,19]. However, this model has recently been contested [20].
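The classic maintenance model and passive demethylation can be illustrated with a toy simulation. The following is a minimal sketch, not a biochemical model: it assumes perfect copying by DNMT1 when present, no de novo methylation, and a hypothetical three-site CpG region. The names `replicate`, `divide`, and `methylated_fraction` are illustrative only.

```python
from typing import List, Tuple

Strand = List[bool]              # True = 5mC at that CpG site (toy representation)
Duplex = Tuple[Strand, Strand]   # the two strands of one DNA duplex


def replicate(duplex: Duplex, dnmt1_active: bool) -> List[Duplex]:
    """Replicate one duplex into two daughters (assumed idealized behavior)."""
    daughters = []
    for parental in duplex:
        # The nascent strand is synthesized without methyl marks.
        nascent = [False] * len(parental)
        if dnmt1_active:
            # Classic model: DNMT1 copies the parental pattern onto
            # the nascent strand of the hemi-methylated duplex.
            nascent = list(parental)
        daughters.append((list(parental), nascent))
    return daughters


def divide(population: List[Duplex], generations: int, dnmt1_active: bool) -> List[Duplex]:
    """Run several rounds of replication over a population of duplexes."""
    for _ in range(generations):
        population = [d for duplex in population for d in replicate(duplex, dnmt1_active)]
    return population


def methylated_fraction(population: List[Duplex]) -> float:
    """Fraction of all CpG positions (both strands) that carry 5mC."""
    marks = [m for duplex in population for strand in duplex for m in strand]
    return sum(marks) / len(marks)


start = [([True, False, True], [True, False, True])]
with_dnmt1 = methylated_fraction(divide(start, 3, dnmt1_active=True))
without_dnmt1 = methylated_fraction(divide(start, 3, dnmt1_active=False))
```

Under these assumptions, the methylated fraction is preserved across divisions when DNMT1 is active and halves with every division when it is not, which is the dilution that the term "passive demethylation" describes.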
DNMT3A/B are gradual-acting and highly expressed in somatic cells while their co-factor DNMT3L is not. This suggests that DNMT3L drives the de novo function of DNMT3A/B during embryogenesis only, while in somatic cells, where DNMT3L is not expressed, DNMT3B instead stimulates DNMT3A. Once stimulated, DNMT3A/B anchor themselves to nucleosomes with DNA methylation, which is said to promote protein stability. Hypomethylation of nucleosomes disables the anchoring of DNMT3A/B, with the free DNMT3A/B being rapidly degraded. This is said to protect the genome from random de novo DNA methylation. The selective binding of DNMT3A/B suggests a regulatory role in the maintenance of DNA methylation. DNMT1 will copy the methylation patterns onto rapidly replicated hemi-methylated DNA. If the copy is incomplete, nucleosome-anchored DNMT3A/B will "proof-read" and complete the pattern, thus propagating the inheritable methylated state [20][21][22].
Figure 1. Cytosines are converted to 5-methylcytosine (5mC) by DNA methyltransferase (DNMT) enzymes through transfer of a methyl group from S-adenosyl methionine (SAM). Ten-eleven translocation (TET) enzymes catalyze the oxidation of 5mC to 5-hydroxymethylcytosine (5hmC) through a chemical reaction which involves alpha-ketoglutarate (αKG), oxygen (O2), adenosine triphosphate (ATP) and Fe2+. Similar reactions further oxidize 5hmC into 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC).
DNMT1 is aided in its function by multiple proteins. Ubiquitin-like PHD and RING finger domains 1 (UHRF1) associates with DNMT1 at the replication fork. More specifically, proliferating cell nuclear antigen (PCNA) localizes DNMT1, and UHRF1 aids by binding hemi-methylated DNA and is then displaced in order for DNMT1 to specifically bind. This is accomplished by UHRF1's many binding domains: a ubiquitin-like domain, an SRA domain (SET and RING-associated domain), a RING domain with E3 ligase activity, a Tandem Tudor domain, a PHD domain, and an ADD-like domain.
The SRA domain preferentially recognizes hemi-methylated DNA and binds securely via its base flipping mechanism, with the target nucleotide being 5mC. Tandem Tudor and PHD collectively bind histone PTMs (H3K9me3, H3K9me2, and H3K4me0) [9,20,23]. This suggests that DNA methylation and histone methylation work together to ensure proper de novo DNA methylation and that UHRF1 faithfully ensures the transmission of DNA methylation during mitosis [9]. This is further supported by the fact that DNMTs are known to interact with histone methyltransferases, such as SUV39h1, Setdb1 and G9a [23]. However, UHRF1's mechanism has not been fully investigated, and others postulate that UHRF1's RING domain ubiquitinates H3K23, which serves as a mark for DNMT1 to be recruited [23,24]. In addition, DNMT3L's role as a co-factor also suggests the existence of a relationship between DNA methylation and histones. DNMT3A, DNMT3B, and co-factor DNMT3L all contain an ADD (ATRX-DNMT3-DNMT3L) domain, which is known to bind methylated or unmodified histones via the PHD domain within ADD. More specifically, DNMT3A and DNMT3L are capable of recognizing and binding unmethylated H3K4 and trimethylated H3K9. If H3K4 is trimethylated (H3K4me3), the PHD domain of ADD will not be able to bind [9]. This too supports the coupling of DNA methylation and histone methylation, which has been recently summarized in Rose and Klose's 2014 review [23]. Therefore, when considering DNA methylation, it is critical to know that a variety of accessory proteins and modifying enzymes are operational in establishing and maintaining DNA methylation. Histone PTMs also play a role by blocking or promoting DNA methylation via specific histone acetylation or methylation marks, and are themselves regulated by KATs and HDACs.
DNA methylation occurs at approximately 70%-80% of CpG dinucleotides. Concentrated regions of CpG dinucleotides (CpG islands), which tend to be located within gene promoters, near transcription start sites, and in enhancer regions, are generally not methylated [13,17,23]. The exception is during development and within certain tissues, where a number of CpG islands are methylated, thereby turning genes off [13,16]. An example of this is CpG islands in somatic cells, which have been found to be methylated at a paternal (allele-specific imprinted genes) and maternal (X chromosome inactivation) level [16]. However, in mouse ESCs, approximately 800 CpG islands were found which contain 5hmC. Examples of genes with 5hmC-modified CpG islands include Zfp64 and Ecat1 [25]. Enzymatic proteins containing CXXC (Cys-X-X-Cys chromatin-associated binding domain) domains (e.g., DNMT1) bind unmethylated CpGs, while those with a methyl-CpG-binding domain (MBD; e.g., methyl-binding proteins, which are discussed in Section 3.1.2) bind mainly to methylated CpGs [9]. Non-CpG methylation, in which the guanine (G) following the cytosine is replaced with adenine (A), thymine (T) or cytosine (C), has also been identified in undifferentiated human embryonic stem cells (hESCs). 5mCpG remains the most prevalent in both undifferentiated and differentiated cell lineages; however, 5mCpA appears to be more prevalent in undifferentiated hESCs, decreasing as differentiation occurs. It is thereby thought to play a major role in cellular development [13]. However, 5mCpA has also been found in oocytes and in adult brain [16]. It has also been associated with de novo DNMT3 as the enzyme responsible for its methyl group addition, making the dynamics of DNA methylation even more complex [14].
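The distinction between CpG and non-CpG contexts (CpA, CpT, CpC) can be made concrete by tallying the base that follows each cytosine. The following is a minimal sketch: the function name `cytosine_contexts` and the example sequence are hypothetical, only the given strand is scanned, and the counts describe potential methylation contexts in the sequence, not measured methylation.

```python
from collections import Counter


def cytosine_contexts(seq: str) -> Counter:
    """Count CpG, CpA, CpT and CpC dinucleotides on the given strand."""
    seq = seq.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - 1):
        # Only cytosines followed by a standard base are classified.
        if seq[i] == "C" and seq[i + 1] in "ACGT":
            counts["Cp" + seq[i + 1]] += 1
    return counts


# Hypothetical example sequence, for illustration only.
contexts = cytosine_contexts("ACGCATCCGT")
```

A strand-aware analysis would also scan the reverse complement; that step is left out here to keep the sketch short.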

Five-Hydroxymethylcytosine: "The Sixth Base"
Apart from 5mC, other DNA methylation marks exist and are rapidly becoming targets in "active demethylation" research in order to uncover their biological function [15,16,26]. "Active demethylation" represents the enzymatic removal or modification of the methyl group of 5mC [19]. The methyl group on the fifth carbon of the cytosine residue within the CpG can be oxidized by the ten-eleven translocation (TET) dioxygenase family, utilizing molecular oxygen as a substrate and therefore creating the "sixth base": 5-hydroxymethylcytosine (5hmC) (Figure 1). 5hmC is abundant in the central nervous system (CNS) and in embryos of both humans and rodents, and is generally found near transcription start sites, making it essential for proper development [15,16]. Moreover, higher 5hmC enrichment was reported within gene bodies (intragenic regions) of actively transcribed genes in mouse ESCs [27] and in enhancers in human ESCs [28]. The TET family of enzymes comprises TET1, TET2, and TET3, of which TET1 and TET3 are known to bind CpG sequences via their CXXC domain [16,19]. The TET family has also been found to further oxidize 5hmC into 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) with the help of ATP (Figure 1) [16,29]. Deamination of 5hmC by the activation-induced cytidine deaminase/apolipoprotein B mRNA-editing, enzyme-catalytic, polypeptide (AID/APOBEC) family of deaminases can also occur, forming 5-hydroxymethyluracil (5hmU). This led to the hypothesis that 5hmC is an active demethylation mark, acting as an "initiation step" for base excision repair (BER) systems, which return 5mC to a regular unmethylated cytosine residue and therefore promote activation.
Once 5fC and 5caC are formed, they destabilize the N-glycosidic bond and subsequently encourage thymine DNA glycosylase (TDG) and methylated DNA binding domain-containing protein 4 (MBD4) glycosylases to initiate BER by removing the modified base and forming an apurinic/apyrimidinic site (AP site). In the case of 5hmU, the AID/APOBEC family of deaminases triggers BER and an AP site is formed as well. AP sites are toxic and need to be replaced with a base. AP endonuclease 1 (APEX1) will cleave the AP site and allow DNA polymerase to re-insert the appropriate base, in this case cytosine. In ESCs, 5hmC levels exceed those of 5fC and 5caC, implying that TET expression is strategically controlled [15,19,[30][31][32]. However, researchers are still investigating the BER-mediated active demethylation mechanism with the hopes of finding increasing evidence to support it.
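The proposed active demethylation route can be summarized as a small state machine. The following is a minimal sketch that encodes only the steps named in the text; the step labels (for example "BER glycosylase" for the removal of 5hmU) and the function name `apply_steps` are illustrative assumptions, and the table says nothing about reaction rates or about which steps actually occur in a given cell type.

```python
# (current base, acting enzyme or step) -> resulting base or site
PATHWAY = {
    ("C", "DNMT"): "5mC",
    ("5mC", "TET"): "5hmC",
    ("5hmC", "TET"): "5fC",
    ("5fC", "TET"): "5caC",
    ("5hmC", "AID/APOBEC"): "5hmU",
    ("5fC", "TDG/MBD4"): "AP site",
    ("5caC", "TDG/MBD4"): "AP site",
    ("5hmU", "BER glycosylase"): "AP site",   # label assumed for illustration
    ("AP site", "APEX1 + DNA polymerase"): "C",
}


def apply_steps(start: str, steps: list) -> str:
    """Follow a sequence of enzymatic steps, rejecting undefined transitions."""
    state = start
    for step in steps:
        key = (state, step)
        if key not in PATHWAY:
            raise ValueError(f"no transition defined for {step!r} acting on {state!r}")
        state = PATHWAY[key]
    return state


# One full cycle: methylation, three oxidations, excision, and repair.
full_cycle = apply_steps(
    "C", ["DNMT", "TET", "TET", "TET", "TDG/MBD4", "APEX1 + DNA polymerase"]
)
deaminated = apply_steps("5mC", ["TET", "AID/APOBEC"])
```

Writing the pathway as a lookup table makes it explicit that every route back to unmodified cytosine passes through an AP site, which is why the BER machinery is central to the hypothesis.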
TET activity has also been tied to ascorbic acid (vitamin C), which acts as a cofactor in at least eight different enzyme reactions (including two related to carnitine synthesis) and is an antioxidant. Vitamin C has been linked to DNA demethylation in mouse ESCs. The catalytic site of TET requires Fe2+ and 2-oxoglutarate to oxidize 5mC. Vitamin C is postulated to be an additional co-factor for TET, possibly playing a role as a regulator and enhancer of TET activity. In the presence of vitamin C, TET primarily increases global oxidation of 5mC and promotes active demethylation [32][33][34].

Role of DNA Methylation in Biological Processes
As a major epigenetic mechanism, DNA methylation is involved in a vast array of biological processes. Examples of these processes include gene expression (transcriptional activation and repression), RNA splicing, genome organization, imprinting, and X chromosome inactivation (XCI), which are described in detail below.

Gene Expression
Before the discovery of secondary modifications of 5mC such as 5hmC, 5fC, and 5caC, the conventional role of DNA methylation in gene expression was believed to be repressive when 5mC is present at promoter regions [35]. The association of 5mC with alternative splicing also challenges the general repressive role of 5mC. Moreover, the association of 5hmC methyl marks with active chromatin regions [27,36] has challenged the general concept. DNA methylation can mediate transcriptional repression through three broad mechanisms: (1) direct hindrance of transcriptional activation; (2) recruitment of repressive protein complexes; and (3) cross-talk with histone PTMs. These mechanisms gain further complexity by the involvement of increased nucleosome compaction, prevention of the binding of transcription factors, recruitment of methyl binding proteins (MBPs), and prevention of the binding of chromatin remodeling complexes.

Direct Hindrance of Transcriptional Activation
DNA methylation (5mC) at promoter regions of genes may result in transcriptional repression through the prevention of the binding of transcription factors due to steric hindrance and increased nucleosome compaction. However, 5hmC favors transcriptional activation and thus questions the concept of the hindrance of transcriptional activation through DNA methylation. Many transcription factor binding sites harbor CpGs within them (e.g., CREB/ATF, E2F, c-MYC and NF-κB) and methylation at these CpGs hinders their binding [37]. Moreover, methylation of the CpGs adjacent to the transcription factor binding sites can also prevent their binding. For example, while the methylation of the CpGs within the Sp1/Sp3 transcription factor binding site does not affect its binding, methylation of the CpGs adjacent to the Sp1/Sp3 binding site prevents binding [38].
A correlation between DNA methylation and nucleosome dynamics has been demonstrated in the literature. DNA methylation influences many aspects of nucleosome dynamics including nucleosome positioning, stability [39] and structure [40]. The nucleosomal DNA is methylated with a 10 bp periodicity. Methylation of nucleosomal DNA is much higher than that of the flanking DNA; therefore, nucleosome positioning influences the methylation at the flanking DNA sequences [41]. In the case of the nucleosome structure, increased methylation leads to topological changes and more compact/tighter wrapping of the nucleosomal DNA around the nucleosome [40]. DNA methylation mediates more compact and rigid nucleosome positioning and thereby leads to a closed chromatin structure [42].
For these reasons, treatment with DNA demethylating agents or DNA methylation inhibitors such as 5-aza-2′-deoxycytidine (Decitabine), 5-azacytidine, and Zebularine results in a more open chromatin structure leading to transcriptional activation. Interestingly, depletion of DNA methylation by DNA demethylating agents has also been shown to result in increased localization of histone variants such as H2A.Z, which causes nucleosome disassembly and supports transcriptional activation [43].

Recruitment of Protein Complexes
The recruitment of DNA binding protein complexes including MBPs, co-repressor complexes and chromatin remodeling factors is a well-known mechanism for gene repression, mediated through DNA methylation. Once recruited to chromatin, these protein complexes are capable of mediating gene repression independently or in concert with cross-talk between histone PTMs. The recruitment of MBPs, such as MeCP2 and other proteins which contain an MBD, is a well-established mechanism of DNA methylation-mediated transcriptional repression or silencing [35,44]. The MBPs can recruit co-repressor complexes, which contain repressor proteins such as HDACs and SIN3. However, the role of MBPs in methylation-mediated transcriptional regulation is rather paradoxical because of the dual nature of the binding of these proteins to methylated DNA. For instance, 5hmC methylation was previously shown to prevent the binding of MeCP2 and was implicated in promoting the role of MeCP2 in transcriptional repression [45]. However, recent studies show that MeCP2 also binds to 5hmC at transcriptionally active chromatin domains [36] and that MeCP2 is an interacting protein partner of TET1 protein [46], suggesting its role in transcriptional activation. The precise role of MeCP2 binding to 5hmC was challenged by a recent study showing the repression of the GAD1 and RELN genes in association with the binding of MeCP2 to 5hmC at their promoters in the cerebellum of autistic patients [47].

Cross-Talks with Histone PTMs
The cross-talks between DNA methylation and histone PTMs can occur in two ways [35]. First, the DNA methylation established by DNMTs and/or TET proteins can lead to the recruitment of MBPs and other transcription regulatory proteins. These proteins can recruit the "writers" of histone PTMs followed by the recruitment of "readers" and/or "erasers". Secondly, histone PTMs can directly or indirectly recruit the methyl writers (such as DNMTs) to establish DNA methylation [35].
Despite the significant advances made with regard to deciphering the role of DNA methylation in regulating gene expression, the presence of multiple methyl modifications, methyl binding proteins and interacting partners has complicated the perspective. Hence, the precise role of each methyl modification is still controversial and requires further analysis.

RNA Splicing
The role of promoter DNA methylation in gene expression regulation is well established. Recent knowledge concerning intragenic (gene body) methylation and its role in alternative splicing has caused a paradigm shift in the conventional role of DNA methylation in transcription. Many recent studies have demonstrated the enrichment of DNA methyl marks within exons in contrast to the nearby intronic regions [48][49][50]. Moreover, there are subtle differences in the CpG density as well as methylation density between splice donor and acceptor sites. In this case, the donor sites have higher CpG density but are hypomethylated while the acceptor sites have less abundant CpGs and are hypermethylated [50]. The involvement of DNA methylation in splicing was further explained by its role in exon-intron recognition. Higher DNA methylation levels are observed in alternate exons [51] and spliced exons [49].
A handful of regulatory proteins whose chromatin binding is methylation-dependent have been linked to methylation-dependent alternative splicing.

Role of MeCP2
In human fetal lung fibroblast cells (IMR-90), MeCP2 binding to methylated DNA within alternate exons contributes to the regulation of alternative splicing. DNA methylation at alternate exons drives the binding of MeCP2, while disrupted DNA methylation patterns cause ablation of MeCP2 recruitment to these exons and subsequently result in elevated histone acetylation and alterations in exon skipping events [51]. MeCP2 recruits HDAC complexes, which reduce the steady-state level of histone acetylation, resulting in an elongation rate that favors alternate exon inclusion. When HDAC activity is inhibited, MeCP2 levels are knocked down, or DNA methylation is inhibited, alternate exon exclusion will occur. The authors also demonstrated that MeCP2 is involved in the recognition of exons in splicing mechanisms [51]. The interaction of MeCP2 with spliceosome complex protein PRPF3 [52] and RNA binding protein YB-1 [53], as well as the altered alternative splicing events occurring in a MeCP2-mutant (Rett Syndrome) mouse model [53], further support the role of MeCP2 in splicing.

Role of CTCF
Using CD44 gene splicing as a model for alternative splicing, CCCTC-binding factor (CTCF; a known insulator protein or a boundary element) was shown to be important in methylation-dependent co-transcriptional alternative splicing [54,55]. According to this model, hypomethylation of the CTCF binding site within exon 5 of the CD44 gene recruits CTCF onto exon 5, pauses RNA polymerase II (RNAPII), and subsequently causes inclusion of exon 5. In contrast, hypermethylation of the CTCF binding site prevents CTCF binding, which ultimately causes fast movement of RNAPII and exclusion of the weak exon 5. Coupling of RNAPII pausing to splicing through CTCF has also been demonstrated in the Myb gene [56]. CTCF is capable of mediating long-range interactions between intronic elements, promoter and upstream promoter enhancer elements, which leads to RNAPII pausing. Moreover, a high RNAPII pausing index was found to be highly correlated with the recruitment of CTCF to upstream promoter elements and 5′ untranslated regions (UTRs) [57].
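The chain of events in the CD44 model can be written as a short decision function. The following is a minimal sketch that reduces the model to a binary methylation state; the function name `cd44_exon5_outcome` and the dictionary keys are hypothetical, and the real process is quantitative (a shift in inclusion frequency) rather than all-or-nothing.

```python
def cd44_exon5_outcome(ctcf_site_methylated: bool) -> dict:
    """Trace the CD44 exon 5 model from methylation state to splicing outcome."""
    # A methylated binding site prevents CTCF from binding.
    ctcf_bound = not ctcf_site_methylated
    # Bound CTCF pauses RNAPII; without it, RNAPII moves quickly through the exon.
    rnapii_paused = ctcf_bound
    # Pausing favors inclusion of the weak exon 5.
    exon5 = "included" if rnapii_paused else "excluded"
    return {"ctcf_bound": ctcf_bound, "rnapii_paused": rnapii_paused, "exon5": exon5}


hypomethylated = cd44_exon5_outcome(ctcf_site_methylated=False)
hypermethylated = cd44_exon5_outcome(ctcf_site_methylated=True)
```

The sketch shows that in this model methylation acts on splicing only indirectly, by controlling CTCF occupancy and with it RNAPII pausing.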

Genome Organization
DNA methylation and histone PTMs function together to maintain the structure of chromatin, which is ultimately represented in gene expression patterns within a cell. Through dynamics of DNA methylation and histone PTMs, either open or closed chromatin structures are obtained to aid transcriptional activation or repression, respectively. Therefore, DNA methylation plays a critical role in maintaining genomic organization. The role of DNA methylation in genomic organization can be further subdivided into different aspects such as modulation of chromatin architecture, maintenance of genomic stability and reduction of transcriptional noise (variability in gene expression) caused by spurious transcription. Hence, the context of DNA methylation as well as the context-dependent recruitment of methylation-related proteins, such as MBPs and CTCF, are important in maintaining the genomic organization.

Modulation of Higher Order Chromatin Structure
The role of DNA methylation in regulating chromatin structure via euchromatin or heterochromatin is well established and extensively studied. Different methyl modifications contribute to chromatin structure in different ways. Euchromatin regions are generally enriched with 5hmC while heterochromatin domains are devoid of 5hmC in mouse ESCs [58]. Similarly, 5caC is accumulated in euchromatin regions in ovarian follicular cells [59]. Moreover, both 5fC and 5caC are found in repetitive microsatellite loci in mouse ESCs [60]. Once again, these studies highlight how DNA methylation is context-dependent.
MBPs function as chromatin architectural proteins and therefore contribute to genome organization [35]. The chromatin architectural role of MeCP2 has been extensively studied. MeCP2 is highly abundant in neurons with levels similar to histone proteins [61], and competes with histone H1 to bind to linker DNA [62]. Absence of MeCP2 or mutations that cause disruption of the MeCP2-chromatin bridge alter formation of higher order chromatin structures [63]. MeCP2 contributes to the formation and maintenance of the higher order structures through the formation of loops and bridges [64,65].

CTCF and Genome Organization
CTCF also contributes to genome organization as a chromatin architectural protein. It has been referred to as the master genome organizer because of its diverse functions (discussed in detail in Section 4.2.1) [66]. CTCF mediates long-distance interactions between different chromatin loci through the formation of chromatin loops [67,68]. Chromatin immunoprecipitation (ChIP), chromosome conformation capture studies, and chromatin interaction analysis by paired-end-tag sequencing (ChIA-PET) have revealed interactomes (protein-protein interactions) of CTCF within the genome which represent both inter- and intra-chromosomal interactions mediated through CTCF binding to chromatin [69,70]. Therefore, CTCF is involved in the formation of a genomic web of interactions.

Nuclear Architecture
Another function of DNA methylation and associated proteins is in nuclear organization, which is an indirect representation of the genomic organization. Alterations in the nuclear architecture were observed in the absence of DNA methylation [71]. Nuclear architecture can be visualized as chromocenters, chromosome territories, and the size and shape of nuclei. Alterations in DNA methylation itself or in the MBPs cause changes in the aforementioned features. For example, loss of MeCP2 in brain or overexpression of MeCP2 in neurons causes changes in the number and size of chromocenters, as well as the size of the nuclei and nucleoli [72]. Impaired DNA methylation patterns cause reorganization of chromocenter territories [73].

Transcription Noise
Recent studies show that gene body methylation (exons and introns) affects the distribution of DNA methylation, histone PTMs and RNAPII recruitment, and subsequently affects molecular processes such as transcription and RNA splicing. Gene body methylation also reduces the transcriptional "noise" caused by spurious transcription [74]. However, this concept, which was based on the notion that DNA methylation is generally repressive, has been challenged by new understanding of the distribution of 5hmC within gene bodies [27]. Recruitment of MeCP2 to methylated repetitive elements was also shown to reduce transcriptional noise [61], and thus MeCP2 is being considered as a genome-wide epigenetic modulator rather than a specific transcriptional regulator [75]. Because MeCP2 is able to bind to both 5mC and 5hmC [36], it is theoretically capable of regulating transcriptional noise regardless of the methyl modification. Moreover, DNA methylation contributes to genomic organization through the suppression of the proliferation of transposable elements [76][77][78].

Imprinting
DNA methylation is an essential process in primordial germ cells (PGCs). It establishes genomic imprinting and occurs during PGC reprogramming [79]. Several lines of recent evidence show that, during this process, 5mC is converted to 5hmC (which is functionally the same as demethylation) [80,81]. These studies highlight the roles of both 5mC and 5hmC methylation in genomic imprinting. Genomic imprinting involves silencing of alleles, allowing for preferential gene expression exclusively determined by the parent-of-origin and not the DNA nucleotide sequence itself. It has been established that alleles are silenced by selectively methylating the imprint control region (ICR) within a gene. This blocks the ICR's binding site potential and downstream-enhancer mechanisms. Differentially methylated regions (DMRs), which are also inherited from either the paternal or maternal chromosome, are located near imprinted genes as well, which in turn affects gene expression. Each paternal and maternal allele contains different DMRs, allowing for regulatory proteins like CTCF to bind DNA and interact with both the ICR and DMRs, allowing the formation of a "chromatin loop" and therefore protecting the allele from methylation [4]. The majority of the imprinted genes are clustered in imprinted domains and these genes harbor DMRs, which are characterized by parent-of-origin-specific DNA methylation profiles [82]. Well-known imprinted domains carrying specific DMRs include IGF2/H19, DLK1/MEG3 and SGCE/PEG10. These DNA methylation profiles within the imprinted genes serve as a good biosensor for early exposure to environmental factors and insults, as a measurement of the epigenetic memory of early events [82][83][84]. It also seems that changes in the DNA methylome at the imprinted genes might render susceptibility to imprinting-related diseases [85,86].
Several DNA methylation-related epigenetic factors are involved in the regulation of imprinted genes [84]. For example, the paternal H19 requires the epigenetic modifiers DNMT1 and DNMT3A. MeCP2 is known to regulate Dlx5 and Ube3a [87,88]. CTCF functions as a boundary element or insulator for the IGF2/H19 locus and is able to bind to the unmethylated DMR of the maternally inherited IGF2/H19 locus. This binding creates an insulator region for the Igf2 promoter and prevents the interaction between the Igf2 promoter and its enhancers [89]. Moreover, CTCF interacts with cohesin and is bound to a subset of imprinted genes such as Magel2/Peg12, to which CTCF binds in an allele-specific manner [90]. Whether the binding of epigenetic factors such as CTCF and MeCP2 to DMRs is dependent on the type of methyl modification is still unclear.
Another key player in imprinting is the more recently discovered maternal effect gene ZFP57. This gene codes for a zinc-finger binding protein, which is part of the Krüppel-associated box (KRAB) repressor protein family. ZFP57 is thought to recruit DNMT1 to the ICRs, thereby explaining mechanistically how DNMTs choose which allele is to be methylated. ZFP57 interacts with the co-factor KAP1/TRIM28, which functions as a scaffold and contains several domains with epigenetic regulatory roles: a heterochromatin protein 1 (HP1) binding motif, a plant homeodomain (PHD), and a bromodomain (BRM) [91,92]. These domains are known to bind modified histones. They in turn recruit chromatin modifying enzymes and factors including the nucleosome remodeling deacetylase (NuRD) complex, HP1, histone-lysine N-methyltransferase (SETDB1), UHRF1 and DNMTs [23,91,92]. Taken together, this paints a rather complex picture, depicting direct involvement of histone PTMs and chromatin-remodelling machinery in mediating DNA methylation at imprinted alleles. In particular, methylated histones (H3K9me3 and H4K20me3) are thought to be associated with ICRs. H3K9 is trimethylated by Suv39h1 or Suv39h2, and DNA methyl binding proteins like MeCP2 are known to associate with Suv39h1/2. MeCP2 brings in DNMT1, thereby inducing CpG methylation and establishing heterochromatin [23,92].

X-chromosome Inactivation
Apart from genomic imprinting, X-chromosome inactivation (XCI) is another mechanism that drives monoallelic gene expression. The X chromosome is home to roughly 1,000 genes [93]. In human females, having both X chromosomes expressed would double the dose of those genes, not only creating an imbalance between the sexes, but also potentially being toxic to the cell [94,95]. During the late-blastocyst stage, one of the two X chromosomes in every cell is randomly chosen to undergo permanent silencing, termed XCI [96]. This allows both males (XY) and females (XX) to have equal X-linked gene dosages, termed dosage compensation [93,94]. DNA methylation is believed to be important in the inactivation process as well as in maintenance of the inactivated X-chromosome [97]. A basic difference between the active X-chromosome (Xa) and the inactive X-chromosome (Xi) is the level of DNA methylation. Xi has higher levels of DNA methylation, which maintain a compact chromatin structure. Moreover, the genes found within Xi tend to have methylated CpG islands, in contrast to the unmethylated CpG islands in Xa [98].
The human X chromosome contains an X inactivation center (XIC), which harbors two genes essential for XCI, XIST and TSIX. Both genes encode noncoding RNAs (ncRNAs). XIST ncRNA is over 19 kb and remains in the nucleus. TSIX ncRNA is greater than 30 kb and is complementary to XIST. Although the choice of which X chromosome to inactivate is random, the inactivation process itself is not. In mammals, the initiation of XCI starts with the downregulation of trans-TSIX and the upregulation of cis-XIST on the soon-to-be inactive X chromosome (Xi). The TSIX levels remain stable on the active X chromosome (Xa) and inhibit the transcription of XIST. XIST ncRNA then spreads across the Xi, coating it, and recruits the polycomb repressive complex 2 (PRC2), which binds XIST and in turn recruits PRC1 and DNMTs to methylate the CpG islands of gene promoters. This gives Xi an overall condensed (heterochromatic) status that is identifiable at the histologic level and known as the "Barr body". However, not all genes on the X chromosome are silenced. It is estimated that 15%-20% of X-linked genes escape silencing (the pseudo-autosomal region, PAR, included), and roughly 10% of genes on Xi are only partially silenced [93,99]. There is also evidence suggesting direct contact between enhancer of zeste homologue 2 (EZH2) of PRC2 and DNMT3A/B; however, contradictory evidence exists as well [100,101].
The Xist promoter itself is regulated by promoter methylation, and the Tsix noncoding RNA was shown to activate the DNMTs to mediate DNA methylation at the Xist locus, leading to repression of the Xist gene [102,103]. Moreover, Tsix has been shown to function together with CTCF to determine which X-chromosome will be inactivated [104]. Even though imprinting and XCI are theoretically different processes, they are linked to each other. Two types of XCI occur during mouse embryonic development: imprinted XCI and random XCI. During mouse embryogenesis, the paternal X-chromosome is inactivated through the expression of paternal Xist and binding of this Xist transcript to the X-chromosome. This inactivation of the paternal X-chromosome is called imprinted XCI. However, unlike in random XCI, it seems that DNA methylation does not play a significant role in imprinted XCI, because during this stage of embryogenesis (morula stage) the genome undergoes DNA demethylation [105].

Methyl Binding Proteins: "The Methyl Readers"
As described previously, DNA methylation at specific loci is established by DNMTs and/or TET proteins. However, in order to interpret the methylation information established by these enzymes, a "methyl reader" is required, which can recognize and bind to methylated DNA and recruit other protein complexes to it. Methyl binding proteins (MBPs) are characterized by their conserved methyl-CpG-binding domain (MBD), which allows them to recognize and bind to methylated DNA (see Figure 2A). The proteins that belong to the MBP family include methyl-CpG-binding protein 2 (MeCP2), MBD1, MBD2, MBD3 and MBD4 [106]. Despite the presence of the MBD, MBD3 is unable to bind to methylated DNA, presumably because of the presence of two extra amino acids (His-30 and Phe-34) within the domain [107]. Nevertheless, MBD3 can still function as a transcriptional regulator similar to other MBPs through interactions with HDAC1 and MTA2 found in the NuRD/Mi2 complex [107]. Apart from the proteins with an MBD, Kaiso proteins, which lack an MBD, are also able to bind to methylated DNA through their zinc (Zn) finger domains [108]. The overall structure of the MBPs is critical to their biological functions. For example, MeCP2 (considered to be the prototype protein of the MBP family) is comprised of two major domains: the MBD and the transcription repression domain (TRD). The MBD and TRD mediate binding to methylated DNA and transcriptional regulation, respectively. MeCP2 is a multifunctional epigenetic modulator and its major functions include transcriptional regulation (both transcriptional activation and repression), chromatin structure modulation and RNA splicing [109]. The diverse functions of MeCP2 depend on protein-protein and protein-DNA interactions, which are mediated through different domains of the MeCP2 protein (see Figure 2B,C).
The requirements for binding to methylated DNA differ from one MBP to another and are DNA sequence-dependent. Such sequence-dependent binding requirements should contribute to the unique binding to target gene promoters [110]. MeCP2 protein requires a single CpG with at least four adjacent A/T pairs [111]. MBD1 binds to a single CpG site in the context of TCpGG, CCpGA, GCpGG and CCpGC with varying degrees of efficiency [112]. The binding of MBD2 to methylated DNA is determined by the presence of either CpGG or CpGC sequences within the target gene promoter, with a higher affinity towards CpGG [113]. In contrast, the Kaiso proteins prefer binding to two symmetrical methylated CpGs in the context of CGCG [108]. Not only the underlying sequence, but also the type of methylation (5mC, 5hmC, 5fC or 5caC) determines the binding of MBPs to chromatin. Recent studies have determined the diverse patterns of MBP binding to methylated CpGs in different cell/tissue types and at different genomic loci (Table 1). Among the MBPs, MeCP2 has been shown to bind to both 5mC and 5hmC with high affinity and was shown to be a major 5hmC binding protein in the brain [36].
Table 1 note: SIN3A, NCOR2 (corepressors); CREB1 (coactivator); SRSF2/3, PRP31 (splicing factors/mRNA processors); DNMT1 (DNA methyltransferase); EHMT1 (histone methyltransferase); CTCF (insulator/transcription factor/multifunctional protein).
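The sequence preferences listed above can be illustrated with a small scanner. This is only a sketch under stated assumptions: every CpG is treated as methylated, "adjacent" A/T pairs are taken to mean a run of four or more A/T bases within a hypothetical 12 bp window on either side of the CpG, and the function name `candidate_readers` is invented for illustration. Binding affinities and the type of methyl modification are not modelled.

```python
import re

# Sequence contexts quoted in the text; the CpG itself is assumed to be methylated.
MBD1_CONTEXTS = ("TCGG", "CCGA", "GCGG", "CCGC")
MBD2_CONTEXTS = ("CGG", "CGC")
AT_RUN = re.compile(r"[AT]{4,}")
FLANK = 12  # hypothetical window (bp) searched for an A/T run near the CpG


def candidate_readers(seq: str) -> dict[int, list[str]]:
    """Map each CpG (0-based index of its C) to MBPs whose stated context matches."""
    seq = seq.upper()
    hits = {}
    for match in re.finditer(r"(?=CG)", seq):
        i = match.start()
        readers = []
        # MeCP2: a run of at least four A/T bases near the CpG (window is an assumption).
        if AT_RUN.search(seq[max(0, i - FLANK): i + 2 + FLANK]):
            readers.append("MeCP2")
        # MBD1: one base on each side of the CpG must match a listed context.
        if i >= 1 and seq[i - 1: i + 3] in MBD1_CONTEXTS:
            readers.append("MBD1")
        # MBD2: CpGG or CpGC.
        if seq[i: i + 3] in MBD2_CONTEXTS:
            readers.append("MBD2")
        # Kaiso: two consecutive CpGs (CGCG), reported at the first CpG.
        if seq[i: i + 4] == "CGCG":
            readers.append("Kaiso")
        hits[i] = readers
    return hits


if __name__ == "__main__":
    for position, readers in candidate_readers("AAAATCGGCCGCGTT").items():
        print(position, readers)
```

Such a scan only flags sequence compatibility; as noted above, actual occupancy also depends on the modification present and on the cell type.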
Even though MBPs are characterized by their binding to methylated DNA, these proteins can also bind to unmethylated DNA. For instance, MeCP2 is able to bind to unmethylated DNA [117,118], which is mediated through MBD [62,119] as well as inter domain (ID), TRD and C-terminal domain (CTD) [119]. The CXXC3 zinc finger domain of MBD1 drives its binding to unmethylated CpGs [120]. A recent study also demonstrated the affinity of MBD3 towards both methylated and unmethylated CpGs, similar to MBD2 [121].
As discussed in Section 2, the biological functions of MBPs are diverse and include, but are not limited to, modulation of chromatin architecture, gene expression regulation and RNA splicing. Apart from that, MBPs also play a role in maintenance of DNA methylation. For example MBD4 is involved in maintenance of methylation and suppression of the mutations occurring at CpG sites [122]. Moreover, MeCP2 interacts with DNMT1 to perform DNA methylation maintenance [123].

DNA Methylation and MBPs in Human Diseases
Alterations of DNA methylation patterns are known to be associated with a variety of human diseases. DNA methylation is highly susceptible to environmental cues and insults such as exposure to toxins, teratogens (see Section 5), diet (nutrient availability), and mental state (stress). It is a dynamic and heritable epigenetic modification, and thus alterations of methylation can be transmitted through generations, which is usually referred to as "transgenerational epigenetic effects" [124]. Moreover, single nucleotide polymorphisms (SNPs) are found in DMRs, and many of these SNPs (CpG-SNPs) are associated with human diseases such as type 2 diabetes [125]. CpG-SNPs contribute to disease by either introducing or disrupting a CpG, thereby affecting DNA methylation patterns, gene expression and splicing. Furthermore, CpG-SNPs at the Prodynorphin gene were found to be associated with alcohol dependence [126]. DNA methylation at the promoter of specific genes, or gene body methylation affecting splicing, can contribute to human diseases. The involvement of DNA modifications in brain development and neurological disorders has been studied extensively and is described in Section 5.
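How a CpG-SNP introduces or disrupts a CpG can be made concrete with a short sketch. It assumes the variant is a single-base substitution described on one strand with one flanking base on each side; the function names are hypothetical and no real variant data are used.

```python
def cpg_positions(triplet: str) -> set[int]:
    """Return the start indices of CpG dinucleotides within a short sequence."""
    return {i for i in range(len(triplet) - 1) if triplet[i:i + 2] == "CG"}


def classify_cpg_snp(left: str, ref: str, alt: str, right: str) -> str:
    """Classify a single-base substitution by its effect on the local CpG content.

    left/right are the flanking sequences; only the bases directly next to the
    variant matter, because a CpG spans two neighbouring positions.
    """
    before = cpg_positions((left[-1:] + ref + right[:1]).upper())
    after = cpg_positions((left[-1:] + alt + right[:1]).upper())
    gained, lost = after - before, before - after
    if gained and lost:
        return "shifts CpG"      # one CpG lost and a neighbouring one created
    if gained:
        return "creates CpG"
    if lost:
        return "disrupts CpG"
    return "no CpG change"


if __name__ == "__main__":
    print(classify_cpg_snp("A", "T", "C", "G"))  # T>C in front of a G
    print(classify_cpg_snp("C", "G", "A", "T"))  # G>A directly after a C
```

A variant that creates a CpG adds a potential methylation site, whereas one that disrupts a CpG removes it; whether the site is actually methylated has to be measured separately.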
A major mechanism by which DNA methylation is involved in human diseases is through the involvement of MBPs. MBPs can cause human diseases in multiple ways (see Figure 3). These include any process that affects their expression and functions, such as alterations in (1) transcriptional regulation (transcription, splicing, posttranscriptional regulation); (2) protein expression and trafficking to different functional cellular compartments; (3) expression in specific cell types; (4) localization (nuclear, cytoplasmic, nucleoli); (5) mutations causing impaired expression, localization or functions or binding to interacting proteins/DNA; (6) posttranslational modifications causing impaired function; and (7) binding to chromatin and target genes. The presence of such mechanisms in several human diseases, including cancer, diabetes as a metabolic disorder, imprinting disorders, and immune system-related disorders, is described in detail below.

Cancer
Cancer was long regarded as a genetic disease caused by spontaneous mutations in the genome, but recent findings have shown that alterations in the epigenome are also important [127,128]. Correct methylation patterns are necessary for the appropriate binding of MBPs [129]. The precise roles of MBPs differ between the different types of cancers.
In the absence of MeCP2, a two-fold increase in the expression of genes that play a role in prostate cancer derived PC3 cell line survival, proliferation, migration, and invasion (e.g., BAK1, Cant1, Cav1, CCND1, CD164, Enolase 2, RASSF1A and NTN4) is seen [128]. In myeloma, regulation of SPAN-XB gene expression by binding of the MeCP2 protein to the promoter dictates cell proliferation [130].
MBD2 mRNA levels may directly or indirectly play a role in the production and maintenance of DNA methylation of genes important in gastric carcinogenesis [131]. Three SNPs in the MBD2 gene are associated with reduced risk of breast cancer [132]. Binding of the MBD2 protein with CCAAT/enhancer binding protein alpha (CEBPA) transcription factor enhances or suppresses expression of genes involved in liver cancer [133]. In uterine serous carcinoma (USC), deletion of a small segment of chromosome 19, which includes the MBD3 gene, is a frequent occurrence [134]. In gastric carcinoma, decreased MBD3 mRNA levels are associated with dissociation of the NuRD complex and aberrant DNA methylation [131,135,136]. A polymorphism in MBD4 Glu364Lys seems to influence protein stability as it is associated with increased susceptibility to cervical cancer [137].
Kaiso protein nuclear localization and binding to the promoter of E-cadherin leads to suppression of E-cadherin protein (involved in cell junctions), which in turn may be involved in breast cancer development [138,139]. Association of Kaiso proteins and p120-catenin (p120ctn) in the cytoplasm of lung cancer cells is associated with lung cancer phenotypes. Dai et al. further suggest that abnormal p120ctn expression and cytoplasmic expression of Kaiso are also observed in lung cancer phenotypes. Kaiso proteins may have an oncogenic function in the cytoplasm of lung cancer cells, and their localization is affected by their binding partner p120ctn [140].

Diabetes Mellitus
Diabetes is one of the most severe metabolic disorders. It is caused by the lack of insulin or an insensitivity to insulin, which ultimately leads to high blood glucose levels and chronic damage in a wide range of organ systems. Among the diverse factors contributing to the pathogenesis of diabetes, epigenetic modifications have attracted attention because they can connect the pathology to environmental cues [141]. The methionine-homocysteine cycle, which generates methyl donors, is a major pathway important for DNA methylation (see Figure 4). The cysteine biosynthetic pathway is affected in diabetic individuals leading to higher levels of homocysteine in direct correlation with increased blood glucose levels [142]. Other intermediates of the methionine-homocysteine cycle are also altered in diabetes. These include folate, SAM, and S-adenosylhomocysteine [143][144][145]. Imbalance in the methyl donor levels can contribute to altered DNA methylation as well as histone methylation. The CpG-SNPs causing alterations in DNA methylation are also common in diabetes [125]. One example is the IL2RA gene promoter, which is regulated by DNA methylation [146]. Several genes implicated in the pathogenesis of diabetes are deregulated by altered DNA methylation. For example, decreased levels of PPARGC1A gene expression in diabetics are associated with increased DNA methylation at its promoter [147]. Examples of other diabetes-related genes regulated by DNA methylation include GLP1R [148], PDX-1 [149], and CTGF [150]. Both human and mouse insulin genes are regulated by promoter methylation and MeCP2 binding [151]. Hyperinsulinemia is another metabolic condition associated with diabetes or mistakenly diagnosed as diabetes. Mice lacking MeCP2 expression develop insulin resistance and hyperinsulinemia [152]. Several girls with Rett syndrome, a neurological disorder primarily caused by mutations in the MECP2 gene [44,109], have developed type 1 diabetes [153][154][155]. 
Because several diabetes-related genes are regulated by DNA methylation, it is highly likely that other MBPs might contribute to the pathogenesis of diabetes or related metabolic disorders.

Imprinted Disorders
Imprinting disorders such as Prader-Willi syndrome (PWS)/Angelman syndrome (AS) (chromosome region 15q11-13), Beckwith-Wiedemann syndrome, and Silver-Russell syndrome [156] are associated with a mutation or deletion of unique parent-of-origin DMRs in chromosome regions 15q11-13, 11p15.5, and 11p15 or maternal uniparental disomy of chromosome 7, respectively [84,157,158]. Changes in MBPs might also be associated with dysregulation of imprinted genes. Decreased expression of MeCP2 protein (misregulated by miR-483-5p) and of MeCP2 protein partners such as HDAC4 and TBL1X is observed in cells collected from Beckwith-Wiedemann syndrome patients [159]. The imprinted locus H19 is a bona fide target of MeCP2 [160,161]. Moreover, MeCP2 regulates other imprinted genes such as UBE3A [162,163] and DLX5 [164]. Clinical overlap between Angelman and Rett syndromes highlights the potential involvement of MeCP2 in such imprinting disorders [87].
Figure 4. One-carbon metabolism and the effects of alcohol (EtOH) and nutrition throughout the metabolic pathway. Ethanol is known to increase histone 3 (H3) acetylation, increase and decrease H3 methylation levels, and alter mitogen-activated protein (MAP) kinases. Ethanol can also have an effect on the biosynthesis of S-adenosylmethionine (SAM), which is a methyl donor for DNA methyltransferases (DNMTs) and histone methyltransferases (HMTs), thereby having an effect on DNA methylation (generally hypomethylation). It also plays roles in other metabolic pathways (e.g., polyamine synthesis, transsulfuration, transmethylation, DNA synthesis) [165]. Overall, the biochemical cycle is complex and involves many metabolites and compounds. Green boxes and arrows demonstrate the resulting cellular functions.

Immune System-Related Disorders
The diverse types of terminally differentiated immune cells are generated through the differentiation of hematopoietic stem cells (HSCs). HSC differentiation requires tightly controlled gene expression patterns, which are mostly acquired through epigenetic regulation [166][167][168]. Many of the hematopoiesis-related genes are regulated by promoter methylation; examples include interleukins, CD5, CD29, PU1 and Pax5 [169]. Altered expression and mutations of DNMTs have been linked to impaired hematopoiesis, which can lead to immune disorders ranging from allergies to autoimmune disease [169]. Impaired immune system functions were reported in MBP-related disorders. For instance, patients and mouse models with MECP2 duplication syndrome show poor T-cell responses, a high rate of infections, and altered levels of IFN-γ [170,171]. Polymorphisms of the MECP2 gene and MeCP2 overexpression were found to be risk factors for systemic lupus erythematosus [172,173]. MECP2 SNPs are also connected to the pathogenesis of primary Sjögren's syndrome [174]. MBD2 is considered to be a candidate therapeutic target for autoimmune disorders [175], due to its presumed role in genome-wide DNA demethylation in systemic lupus erythematosus [176]. Similarly, MBD4 expression is elevated in systemic lupus erythematosus patients [177].

Cardiovascular Diseases and Cerebral Ischemia
The impact of epigenetic factors on cardiovascular disease pathology has been shown in relation to DNA methylation and microRNAs. The potential of using these epigenetic marks as biomarkers for cardiovascular disease prediction has also been proposed [178]. In vascular disease conditions, global DNA hypomethylation and increased S-adenosylhomocysteine have been observed [179]. Moreover, hypomethylation of long interspersed nuclear elements (LINEs) in blood samples was associated with an elevated risk of stroke and ischemic heart disease [180]. Several genes have been suggested to be misregulated by DNA methylation, which could lead to cardiovascular diseases. For example, gene-specific DNA hypomethylation of 15-lipoxygenase (ALOX15) and monocarboxylate transporter (MCT3) is associated with an increasing degree of atherosclerosis [178].
Cerebral ischemia or brain ischemia is a condition caused by a loss of blood supply to the brain, which leads to cerebral infarction, stroke, hemorrhage and death of brain tissue. In a mouse model of ischemic brain injury, increased DNA methylation and elevated methyl-group incorporation were observed. Moreover, loss of function of DNA methyltransferases led to increased resistance to brain damage caused by ischemia. In the same mouse model, 5-Aza-mediated inhibition of DNA methylation suggested neuroprotection [181]. Similarly, reduced levels, but not a complete loss, of Dnmt in mouse post-mitotic neurons were associated with neuroprotection from ischemia [182]. Similar to cardiac ischemia, global hypomethylation, as represented by long interspersed nuclear elements, was linked to increased risk of cerebrovascular ischemia in males but not in females [183]. Together, these reports suggest a link between DNA methylation and cerebral ischemia, and potential pharmacological applications of epigenetic modifiers such as DNA demethylating agents to confer protection from ischemia [184]. Illustrating the contribution of MeCP2 in brain ischemia, its protein levels were shown to be upregulated, presumably through loss of miR-132 activity, during ischemic preconditioning in mouse cortex [185].
Table 1 summarizes examples of other proteins that bind to different methyl modifications. The evidence on the binding of specific proteins to methyl marks, including corepressors such as SIN3A and NCOR2; coactivators such as CREB1; proteins involved in RNA splicing such as SRSF2/3 and PRP31; DNA methyltransferases such as DNMT1; lysine methyltransferases such as EHMT1; and multifunctional epigenetic factors such as CTCF, further demonstrates the diverse functions of DNA methylation in multiple biological processes.
Interestingly, the diverse nature of the regulatory proteins bound to 5caC (CTCF, EHMT1, NCOR2, PRP31 and DNMT1) hints at a potential role of 5caC in chromatin organization, transcriptional regulation and RNA splicing (Table 1).

Other DNA Binding Proteins Affected by DNA Methylation and Their Role in Human Diseases
While the above mentioned proteins are bound to specific methyl modifications, binding to DNA for some of these proteins is prevented by other forms of methyl modifications found within their binding sites or adjacent to the binding sites. One protein belonging to this category and linked to many disease states is CTCF.

CTCF
The recruitment of CTCF to intron/exon boundaries or to exons has been shown to be ablated by 5mC and to contribute to the regulation of DNA methylation-dependent alternative splicing [55]. However, it is still unclear whether 5hmC supports the binding of CTCF, and the evidence regarding the potential binding of CTCF to 5hmC is conflicting. An enrichment of 5hmC was observed at the CTCF binding sites in enhancer regions within human ESCs [28]. In mouse spermatogonia, the enrichment of 5hmC at the promoter sequences bound by CTCF was lower, while the CTCF-bound intronic sequences were enriched with 5hmC [186]. In mouse ESCs, CTCF was shown to be bound to 5caC [114]. Therefore, it is possible that, depending on the cell type or the genomic locus (enhancers, promoters, exons or introns), the binding of CTCF to methyl marks might differ, as might the functions associated with its binding (see Figure 5).
CTCF was first reported as a transcriptional repressor for the chicken C-Myc gene [187]. Since then, its known functions have expanded to those of a multifunctional epigenetic modulator, which is involved in transcriptional activation and repression, barrier or boundary element (insulator) activity, modulation of chromatin structure, and regulation of RNAPII-mediated transcriptional elongation and co-transcriptional splicing (see Figure 5). It has also been shown to be a critical factor in genomic imprinting through binding to the IGF2/H19 imprinting control region [188]. Earlier it was thought that the multifunctional behavior of this single protein is determined by its interacting protein partners [189]. Some of the known protein partners of CTCF are key players in specific cellular processes such as higher order chromatin structure formation (cohesin) [190], transcription and elongation (RNAPII) [191], RNA splicing (RNAPII) [55] and XCI (YY1) [192]. On the other hand, considering the diverse nature of CTCF binding to different methyl modifications, it seems that the type of methyl modification is also a contributing factor to the multifunctional behavior of CTCF.
Because CTCF is a multifunctional epigenetic factor, alterations in its expression, binding to chromatin, and functions are associated with several human diseases. In breast cancer cells, both CTCF mutations [193] and altered expression of CTCF [194] are reported. Its role has also been shown in many other cancer types such as prostate cancer [195], lung cancer [196], and colorectal cancer [197]. Due to its critical role in genomic imprinting, CTCF is linked to imprinting disorders such as Beckwith-Wiedemann syndrome. In Beckwith-Wiedemann syndrome, demethylation of the IGF2/H19 imprinting control region leads to the binding of CTCF, which subsequently blocks the connection between IGF2 and its enhancer and promotes activation of maternal H19 [198]. CTCF is also involved in misregulation of other methylation-related epigenetic factors such as MeCP2. In autistic patient brains, the spreading of heterochromatin regions into the MECP2 promoter and its subsequent silencing are caused by increased promoter methylation and prevention of CTCF binding [199].

Normal Early Brain Development
During normal human embryo development, near the start of the third week of gestation, the midline and anterior-posterior axes are formed, guided by the movement of the mesodermal cells along the midline (primitive streak) in the caudal region. The primitive streak gives rise to the primitive pit, which merges with the mesodermal cells that form the notochord. The ectoderm dorsal to the notochord becomes the neural plate. The notochord initiates the differentiation of ectodermal cells into neuronal precursor cells via a gradient of signaling molecules, including retinoic acid (i.e., vitamin A) and various peptide hormones: Wnt, FGF, TGF (BMP), sonic hedgehog (SHH) and Nodal. The neural plate, containing neural progenitor cells, elongates and starts to fold, forming a neural groove. The neural folds eventually meet and a tubular structure (the neural tube) forms by the start of the fourth week of gestation. By embryonic days 25 and 27, the anterior and posterior neuropores close, respectively. Neural crest cells, dorsal to the neural tube, migrate and differentiate into the sensory and sympathetic ganglia cells, adrenal neurosecretory cells and the enteric nervous system cells. The neural floorplate determines the polarity of the neural tube and subsequently gives rise to the motor neurons. The remaining ectodermal cells differentiate into the epidermis and the neural tube gives rise to the entire CNS [200][201][202].
Production of long projection neurons begins during the 6th week of gestation and finishes towards mid-gestation. Major structures of the brain, induced via various additional signaling molecules, are largely defined by the end of the embryonic period (8 weeks). During the early fetal period, extending to mid-gestation, the cortical neurons are generated. During the mid to late fetal period, maximal around mid-gestation and tailing off to roughly the 34th week of gestation, precursors of inhibitory interneurons and glial cells are produced. An exception is the cerebellum, where internal granular neurons are generated up to approximately 9 months after full term birth. Following generation, the cells migrate to their destined site and begin to differentiate. Differentiation begins in the fetal period but continues well after birth with the brain weight doubling in the first year of life, reaching 90% of adult weight by age six, and reaching full adult weight in early teen years [200,201].
Inadequate nutrition or the introduction of teratogens can affect normal embryonic or fetal brain development by altering the levels of signalling molecules and can lead to brain malformations. Early life experiences and the growth environment also influence postnatal brain development.

DNA Methylation and Brain Development
In mammals, DNA methylation reprogramming occurs soon after fertilization when the paternal and maternal genomes, packaged in their pronuclei, undergo demethylation. The paternal genome undergoes active demethylation before replication occurs, and the maternal genome undergoes passive demethylation, the exception being paternal and maternal imprinted genes. This is postulated to avoid the transfer of environmentally-acquired DNA methylation from the parents to the offspring. During implantation, for the zygote to obtain a totipotency status, widespread de novo remethylation occurs, creating distinct DNA methylation patterns in the blastocyst [203,204]. DNA methylation is orderly, termed the DNA methylation program (DMP). During neurulation, both 5mC and 5hmC are present in the neural tube and neural crest. In the mouse neuroepithelium during the proliferation and migration phases, 5mC appears first, followed by 5hmC and other active demethylation marks (5fC, 5caC). 5mC, MBD1, and DNMT1 appear in a high to low concentration gradient from ventral to dorsal, which coincides with the start of differentiation and maturation of neurons and neural crest cells. The appearance of 5hmC marks the differentiation of the neuroepithelium, coinciding with a loss of Oct4 (a marker of undifferentiated cells) and the appearance of markers for neurogenesis (Map2) and proliferation (Crabp1). The DMP continues postnatally, with regional variations in 5mC/5hmC levels [205,206].
During early hippocampus development and neurogenesis (E15 to P7 in mice), each region of the hippocampus has its own DNA methylation program (DMP). Immature cells express 5mC while mature cells express 5hmC [207]. As the brain matures, a reduction in 5mC/5hmC coincides with chromatin compartmentalization; 5mC tends to be found in heterochromatin while 5hmC is found in euchromatin. The demethylation marks 5fC and 5caC are also present throughout mouse brain development from embryonic day 10 to postnatal day 45, but in much lower concentrations than 5mC and 5hmC [206].
Apart from CpG dinucleotide methylation, there is also CpH (H = A, T or C) methylation (non-CpG; see Section 2.1). Lister et al. [208] found evidence to support that CpH methylation is preferentially bound and driven by DNMT3A. It accumulates in neurons, but not in glia, in the postnatal (2 years after birth) and early adolescent human brain. This accumulation coincides with brain maturation and synaptogenesis, and CpH methylation eventually becomes the main form of DNA methylation in human adult neurons of the frontal cortex. Both 5hmC and CpH methylation are found to be abundant in the rodent and human brain. During early postnatal mouse brain development, 5hmC levels rise and 5hmC is restricted to CpGs. CpH methylation levels increase in differentiating cells. Genes specific to neuronal and synaptic development are found to be CpH hypermethylated in glial cells (thereby suppressing expression) and both CpG and CpH hypomethylated in neurons. This suggests a potential role of CpH methylation in transcriptional repression of neuron-specific genes in the glial genome. Conversely, genes associated with oligodendrocyte function are CpH hypomethylated in glia and CpH hypermethylated in neurons. This further supports the idea that DNA methylation regulates brain maturation [208]. Guo et al. [209] studied CpH methylation in the adult mouse dentate gyrus of the hippocampus in greater detail. In vivo, both CpH and CpG were found to be hypomethylated at neuronal transcription factor binding sites and transcription start sites, and regions of low CpG density contained CpH methylation. CpH methylation was also found to be more abundant in the adult brain than in the fetal brain, supporting its role in post-natal brain maturation. MeCP2 and Dnmt3a were found to bind methylated CpHs in vivo. Both Dnmt1 and Dnmt3a are highly expressed in post-mitotic neurons. When CpH is demethylated, it can only be remethylated de novo by Dnmt3a [209].
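The CpG/CpH distinction used above can be sketched as a small calculation of methylation level per sequence context. The input format (index, methylated reads, total reads for forward-strand cytosines), the function names and the numbers in the usage example are illustrative assumptions, not data or methods from the studies cited.

```python
from collections import defaultdict


def cytosine_context(seq: str, i: int):
    """Return 'CpG' or 'CpH' (H = A, T or C) for the cytosine at index i, else None."""
    seq = seq.upper()
    if seq[i] != "C" or i + 1 >= len(seq):
        return None  # not a cytosine, or no downstream base to define the context
    nxt = seq[i + 1]
    if nxt == "G":
        return "CpG"
    if nxt in ("A", "T", "C"):
        return "CpH"
    return None  # ambiguous base such as N


def methylation_by_context(seq: str, calls) -> dict[str, float]:
    """Pool read counts per context and return the methylated fraction for each.

    calls: iterable of (index, methylated_reads, total_reads); hypothetical format.
    """
    methylated = defaultdict(int)
    total = defaultdict(int)
    for i, m, n in calls:
        context = cytosine_context(seq, i)
        if context is None or n == 0:
            continue
        methylated[context] += m
        total[context] += n
    return {context: methylated[context] / total[context] for context in total}


if __name__ == "__main__":
    # Made-up counts for illustration only.
    example_calls = [(1, 8, 10), (4, 1, 10), (7, 3, 10)]
    print(methylation_by_context("ACGTCATCC", example_calls))
```

Reporting the two contexts separately matters because, as described above, CpH methylation can differ between neurons and glia even where CpG methylation does not.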
Overall, DNA methylation and its binding partners, along with histone PTMs and chromatin organization drive early brain development, post-natal brain differentiation, and adult brain maturation. Aberrant DNA methylation can lead to impaired function and improper development. Furthermore, DNA methylation studies not only require CpG methylation analysis, but non-CpG methylation as well.

Effects of Teratogens on Normal Brain Development
Alterations to the cellular environment can be recorded in the epigenome. Adverse factors include teratogens such as recreational drugs (alcohol, tobacco, illicit drugs), toxins (lead, mercury), pollutants (BPA, PCB, PAH), poor nutrition (folic acid, choline, retinoic acid), maternal disease (diabetes), and even stress (through cortisol). Any of these factors could potentially alter the DNA methylome of the developing embryo or fetus. In terms of brain development, such alterations could lead to faulty organization of the nervous system or delayed neurogenesis. The most obvious and common effects on the newborn child are low birth weight and intrauterine growth restriction. Macroscopic congenital anomalies are the extreme. More subtle, but lifelong, disabilities include diabetes, obesity, asthma, autism, depression, and fetal alcohol spectrum disorder (FASD) [205,210,211].

Alcohol (Ethanol)
Alcohol is metabolized into acetaldehyde by alcohol dehydrogenase (ADH) in the liver, and by catalase and CYP2E1 in the brain. Acetaldehyde is further metabolized into acetate in all cells by acetaldehyde dehydrogenase (ALDH) (see Figure 6) [212]. During pregnancy, ethanol readily crosses the placenta, and is found at high concentrations in the amniotic fluid as well as in the fetal serum. Adverse consequences include growth restriction, congenital anomalies, and FASD [213].
The literature concerning epigenetic effects of alcohol is large. Ethanol is known to increase histone H3 acetylation, and to increase or decrease H3 methylation levels. Ethanol can also alter the biosynthesis of SAM, the methyl donor for DNMTs and lysine methyltransferases, thereby having an effect on DNA methylation levels (generally hypomethylation). SAM also plays roles in other metabolic pathways (e.g., polyamine synthesis, transsulfuration, transmethylation, DNA synthesis) (see Figure 4) [165]. Both SAM and methionine are potential therapeutic targets for rescuing DNA methylation levels affected by alcohol and by other teratogens as well [165,214,215].
Liu et al. [216] studied the effects of alcohol on global DNA methylation in whole mouse embryos during neural tube formation using MeDIP-chip and the Sequenom MassARRAY EpiTYPER methylation detector. The alcohol-treated embryos showed growth restriction and developmental abnormalities in the heart, posterior neural tube, brain regions, and limbs. CpG island content (density profiles) was classified into three groups: high CpG (HCP) (mainly associated with housekeeping genes), intermediate CpG (ICP) and low CpG (LCP) (mainly associated with tissue-specific and olfactory genes). HCP content accounted for more than two-thirds of the genome-wide CpG content. Among the alcohol-treated embryos, those with closed neural tube had more alterations in methylation levels than did those with open neural tube (abnormal). Expression microarray showed that alcohol treatment caused alterations in methylation and expression of genes responsible for chromatin remodeling, neuronal morphogenesis, synaptic plasticity and neural development. Overall, these data suggest that alcohol strongly affects DNA methylation levels (both hypo- and hypermethylation), which correlate with neural tube defects. The alcohol-affected gene promoters identified in this study were also found to be affected in nine other developmental syndromes that share phenotypic features with FASD [216].
Zhou et al. [217] performed a similar study using cultured adult rat dorsal root ganglion neural stem cells (DRG-NSC). Alcohol exposure delayed differentiation by changing the methylation status of genes necessary for that process (e.g., Igf2, Sox7, Cutl2), and prevented the diversification of DNA methylation (the programmed increase or decrease in DNA methylation of moderately methylated genes). The genes whose methylation patterns were altered by alcohol were related to neuronal receptors, neural development, synaptic transmission and olfactory (sensory) perception, all of which were similarly found to be affected by alcohol by Liu et al. [216] and were also validated by Sequenom MassARRAY technology [217].
Laufer et al. [220] set out to study the possible mechanism behind gene expression changes in a mouse model of FASD by looking at DNA cytosine methylation, microRNAs (miRNAs) and CTCF binding sites in the adult mouse brain. Genome-wide DNA methylation studies (MeDIP-chip and microarray with Ingenuity Pathway Analysis) revealed a large number (~1,000) of differentially methylated gene promoters in the alcohol-exposed mice. Most were genes related to cell development and maintenance, cell death, and brain development. Almost half of the genes known to be imprinted in the mouse genome showed changes in methylation. In utero alcohol exposure also had an effect on miRNA expression, and the effect differed according to the timing of alcohol exposure (gestational day 8/11 vs. gestational day 14/16). The affected miRNAs identified by an array were also found to be trimester dependent. Three of the imprinted regions with modified methylation have genes that are believed to be important in other neurodevelopmental diseases [220].
Figure 6. Alcohol (EtOH) metabolism and the effects of its metabolites on various metabolic and cellular processes. Various ethnic groups and populations have copy number differences and polymorphisms in the ADH gene, which in turn have an effect on how they metabolize alcohol [218]. High concentrations of alcohol (ethanol) modify neurotransmitters (e.g., gamma-aminobutyric acid (GABA)A, N-methyl-D-aspartate (NMDA), NMDA-receptor (NMDAR) subunits) [165,212], while acetaldehyde can bind lysine residues in proteins as well as dopamine, which results in salsolinol and isosalsolinol [212]. Although at first a stimulant, at higher quantities ethanol is also a depressant. This indicates that ethanol and its metabolites are essentially neurotoxic, and over time can lead to a variety of health problems, including poor nutrition, addiction, mood disorders, cirrhosis of the liver and cancer [212,218,219]. Green boxes and arrows indicate the resulting cellular functions.
Most prenatal alcohol exposure studies focus on the mother; however, pre-conception alcohol habits in the father can also have an effect on DNA methylation. Bielawski et al. [221] treated adult male Sprague-Dawley rats with ethanol three times a week for 9 weeks and mated them with non-alcohol-treated females. Sperm was extracted from alcohol-treated males after mating and subjected to RNA extraction, followed by reverse transcription-polymerase chain reaction (RT-PCR) with primers for DNMT. DNMT mRNA levels were significantly lower in the sperm of alcohol-treated males than in that of non-treated control males. This suggests that chronic alcohol consumption results in altered DNA methylation levels in sperm, which can have an effect on paternal genomic imprinting and can eventually be inherited by the offspring. However, one limitation of the study is that the authors did not distinguish between the different DNMTs [221].
Ouko et al. [222] studied the effects of paternal alcohol consumption on CpG methylation of the imprinted H19 and IG DMRs (discussed in Section 3.3) in sperm. In normal males, both H19 and IG DMRs are maternally expressed due to paternal H19 and IG DMRs being more than 99% methylated in mature sperm. They are essential for normal neuro-behavioral development, with H19 and IGF2 being expressed at the same time. The H19 DMR contains a CTCF binding domain that remains methylated, which in turn blocks CTCF binding. The IG DMR contains two genes (DLK1 and GTL2), which play roles in the regulation of gene expression and in cellular differentiation. The authors postulated that paternal alcohol consumption could lead to hypomethylation of these DMRs via alcohol-induced demethylation and set out to test their hypothesis in human males of mean age 25. Based on a survey questionnaire, men were classified into three categories: heavy, moderate, and non-drinkers. Semen (N = 16) was collected and subjected to DNA extraction, bisulfite modification and sequencing with primers for H19 and IG DMRs. A total of 22 CpG sites were examined for the H19 DMR and 10 CpG sites for the IG DMR. Hypomethylation of both DMRs was seen in both drinking groups when compared to the non-drinking group. For both DMRs, CpG site-specific hypomethylation was observed and was enhanced in the drinking groups. Greater levels of demethylation were seen among the heavy drinkers than among the moderate drinkers, when compared to non-drinkers. The H19 DMR exhibited a greater decrease in methylation than did the IG DMR, indicating that H19 is more sensitive to alcohol-related demethylation [222].
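As an illustration of how site-specific methylation is typically derived from bisulfite sequencing, the following minimal Python sketch computes percent methylation at each CpG site from a set of sequenced clones. After bisulfite conversion, a cytosine that is still read as C is scored as methylated and one read as T is scored as unmethylated. The clone strings, group names, and function names are hypothetical and are not data or code from Ouko et al. [222]; the sketch only shows the arithmetic behind a per-site comparison between groups.

```python
def site_methylation(reads):
    """Return percent methylation at each CpG position.

    Each read is a string with one character per CpG site:
    'C' = cytosine protected from bisulfite conversion (methylated),
    'T' = converted cytosine (unmethylated), '-' = site not covered.
    """
    n_sites = len(reads[0])
    percents = []
    for i in range(n_sites):
        calls = [r[i] for r in reads if r[i] in "CT"]
        if not calls:
            percents.append(None)  # no coverage at this site
            continue
        percents.append(100.0 * calls.count("C") / len(calls))
    return percents


def mean_methylation(reads):
    """Average percent methylation over all covered CpG sites."""
    values = [p for p in site_methylation(reads) if p is not None]
    return sum(values) / len(values)


# Hypothetical clones covering four CpG sites of a DMR (illustrative only).
non_drinker = ["CCCC", "CCCC", "CCCC", "CCC-"]
heavy_drinker = ["CCTC", "CTTC", "CCCC", "CTC-"]

print(site_methylation(heavy_drinker))
print(mean_methylation(non_drinker) - mean_methylation(heavy_drinker))
```

In a real analysis, the per-site values would be compared between groups with an appropriate statistical test rather than by inspection of means.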
Stouder et al. [223] studied transgenerational effects in mice exposed to low-dose alcohol in utero. The methylation patterns of two paternally imprinted genes (H19 and Gtl2) and three maternally imprinted genes (Peg1, Snrpn and Peg3) were assessed in various tissues (including sperm) across two generations. Among the five genes selected, only H19 methylation was affected [223]. As seen in Ouko et al. [222], hypomethylation of H19 CpGs was detected, with specific CpG sites within the CTCF 2 binding site showing large decreases in methylation [223]. Knezovich et al. reported a similar reduction in H19 methylation in the sperm of alcohol-exposed male mice [224]. Among the F2 male offspring, no change in H19 CpG methylation levels was noted in somatic tissues, which does not agree with the findings of Knezovich et al. However, in the F2 newborn whole brain (but not hippocampus) a decrease in H19 CpG methylation was seen [223]. Overall, these studies show the potential use of H19 CpG methylation as a biomarker for paternal alcohol exposure.
For more details regarding imprinted genes, including H19, in the brain, please refer to the review by Kernohan and Bérubé [84].

Tobacco (Nicotine)
Maternal tobacco smoking exposes the developing fetus to over 4,000 toxic compounds of which ~30 have been linked to adverse health consequences. Nicotine is known to cross the placenta and is found in higher concentrations in fetal serum than in maternal serum. Nicotine exposure has an effect on the growth of the fetus through vasoconstriction, which reduces blood flow in the placenta and umbilical cord. Nicotine exposure also has an effect on brain development [213].
Maccani et al. [225] studied the effects of maternal smoking on CpG methylation in human placentas. They hypothesized that maternal smoking during pregnancy was associated with CpG DNA methylation changes in the placenta, and that these changes were associated with preterm birth (<37 weeks gestation). Placentas were sampled 2 cm from the umbilical cord insertion site in smokers (n = 22) and non-smokers (n = 184). The authors found that the methylation patterns of 1918 CpG loci were associated with maternal smoking. They highlighted seven CpG loci that resided in the RUNX3 gene, which codes for a transcription factor, is said to function as a tumor suppressor, and is known to be associated with asthma and airway hyper-responsiveness. Pyrosequencing confirmed hypermethylation of two of the seven CpG loci highlighted (cg04757093 and cg00117172), and statistical analyses showed an association between two of the seven loci and preterm birth (cg04757093 and cg14182690). Therefore, the authors concluded that hypermethylation of CpG locus cg04757093 was significantly associated with maternal smoking during pregnancy and preterm birth. However, the exact role of RUNX3 in the placenta is unknown. Limitations of this study include a small sample size (N = 22), unknown dosage of exposure, and the sensitivity of the DNA methylation profiling technique used [225].
Joubert et al. [226] performed a similar study, examining CpG methylation patterns in 1,062 newborns born to smoking mothers (MoBa cohort). The association between cotinine, a nicotine metabolite, in the mothers' blood and CpG methylation in the newborns' cord blood was analyzed. The authors identified 26 differentially methylated CpGs, mapped to 10 genes, that were significantly associated with maternal plasma cotinine. The genes with the most CpGs identified were GFI1, with eight CpGs, and AHRR, MYO1G and CYP1A1, each with four CpGs. The results were replicated in a second study sample (N = 18) (NEST cohort). The most significant finding in both cohorts was that an increase in cotinine levels (or reported maternal smoking) was associated with a decrease in CpG methylation at cg05575921 in the AHRR gene. AHRR codes for a repressor protein involved in the aryl hydrocarbon receptor-signaling cascade, which mediates dioxin toxicity. It is also said to be involved in cell growth and differentiation [226].
Li et al. [227] looked at the effects of nicotine in utero on the offspring's susceptibility to hypoxic-ischemic encephalopathy (HIE) as well as on the Angiotensin II (AngII) receptors AT1R and AT2R. AngII, through its receptors, helps regulate the cardiovascular system as well as physiological responses. Disruption of this signaling can lead to cardiovascular disease, hypertension, diabetes, and stroke. Pregnant rats were given nicotine injections throughout the pregnancy, and offspring at fetal day 21 (F21/E21) and post-natal day 10 (P10) were sacrificed for brain removal. Compared with saline controls, prenatal nicotine exposure was found to have negative effects on growth (body and brain size) in both F21 and P10 offspring and to increase the brain's vulnerability to HI-induced infarct. In P10 prenatal nicotine-exposed pups, CpG methylation analyses at the AT2R promoter revealed significant hypermethylation upstream of a TATA-box in males but not in females.
This suggests that prenatal nicotine exposure alters CpG methylation of the AngII receptor genes in the developing brain in a sex-specific manner, more so for AT2R than for AT1R, which in turn affects their expression levels. AT2R reduction increases susceptibility to hypoxic-ischemic brain damage, which could explain some of the adverse effects of prenatal nicotine exposure [227].
Breton et al. [228] studied global and promoter CpG island methylation in buccal cells of 5-6 year old children. Tobacco smoke-exposed children had hypomethylation of AluYb8 (a repeated DNA sequence found in >2,000 loci in the human genome), with no significant difference for LINE1 methylation. Bisulfite pyrosequencing showed that the gene promoters with the most significant differences were AXL (a receptor tyrosine kinase) and PTPRO (a receptor protein tyrosine phosphatase) [228].
Toledo-Rodriguez et al. [229] obtained genomic DNA from blood leukocytes of adolescents whose mothers had smoked during pregnancy and used bisulfite sequencing to study brain derived neurotrophic factor (BDNF), the gene product of which is an important regulator of brain cell growth. They demonstrated increased DNA methylation in exon 6 of the BDNF gene. The authors speculate that cigarette smoking during pregnancy could have an adverse effect on brain development [229]. However, the authors did not control for sex [229]. This is a notable limitation, since it is well documented that sex differences exist not only in the brain (e.g., male brains are larger than female brains; morphological and neurochemical differences), but also at an epigenetic level (e.g., in neonatal brain, males have higher H3 acetylation than females) [230,231]. It must also be noted that such human studies are complicated by the likelihood that pregnant women who smoke tobacco also tend to smoke in the home after the child is born.
For further details about the effects of maternal nicotine on human brain development, please see the 2011 review by Morris et al., and for human and rodent brain please see the 2013 review by Pagani et al. [231,232].

Cocaine
Cocaine easily crosses both the placenta and the blood-brain barrier, thereby having direct effects on the fetal brain [213]. Novikova et al. [233] exposed pregnant CD1 mice to cocaine twice daily, sacrificed the pups at postnatal days 3 and 30, harvested the brains, and studied global and CpG island DNA methylation levels as well as Dnmt gene expression levels. A 30% decrease in global DNA methylation levels was found in the Day 3 cocaine-exposed offspring. Of the CpG islands affected by cocaine, 34% were hypermethylated (five neural genes, 24 housekeeping genes) and 66% were hypomethylated (seven neural genes, 57 housekeeping genes). Bisulfite sequencing was used to specifically identify CpG islands associated with hypomethylated and hypermethylated gene promoters (see reference for full list). Among the P30 cocaine-exposed offspring, there was a 35% decrease in global DNA methylation levels and an increase in both Dnmt1 and Dnmt3a expression levels. Approximately 67% of the CpG islands showed stable alteration when the two ages were compared, leading the authors to conclude that maternal cocaine exposure persistently alters DNA methylation in the offspring [233].
Anier et al. [234] also looked at DNA methylation machinery (Dnmt, Mecp2) in the nucleus accumbens (NAc) and hippocampus of adult mice exposed to acute and repeated cocaine treatments. Acute cocaine treatment had no effect on Dnmt1 mRNA levels; however, there was an upregulation of Dnmt3a/b. This led to the hypothesis that an upregulation of Dnmt3a/b would result in hypermethylation of certain CpG island-associated gene promoters that have been found to be associated with cocaine (Bdnf, PP1c, FosB and A2AR). PP1c was found to be hypermethylated at the CpG island-associated promoter region 24 h after acute and repeated cocaine treatment, and MeCP2 binding to the PP1c promoter increased. FosB was found to be hypomethylated at the CpG island-associated promoter region 1.5 h after acute and repeated cocaine treatment, and MeCP2 binding at the FosB promoter decreased. Therefore, cocaine treatment in the adult mouse NAc results in alterations in gene expression by changing DNA methylation levels [234].
Tian et al. [214] studied global DNA methylation levels in adult mouse brains after cocaine exposure using liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI/MS/MS) [214]. LC-ESI/MS/MS is a highly sensitive method for distinguishing 5mdC (deoxyribonucleoside) from 5mC (ribonucleoside) [235]. Global DNA methylation levels decreased in the prefrontal cortex but not in the nucleus accumbens in the cocaine group. They also looked at components of the DNA methylation machinery (Dnmt3a/b and MeCP2) at the mRNA and protein level by RT-PCR and Western blot, respectively. Both mRNA and protein expression levels of Dnmt3b were downregulated, and could be rescued by treatment with methionine, a methyl donor [235].

Other Drugs
One of marijuana's main psychoactive compounds, tetrahydrocannabinol (THC), rapidly crosses the placenta and is concentrated in the fetus for up to 30 days, thereby prolonging gestational exposure [213]. To our knowledge, there have been no papers published on the effects of in utero exposure to cannabis (THC) on DNA methylation and the resulting effects on the offspring. However, others have studied the effects of marijuana on dopamine receptor D2 (DRD2) in human fetal brains as well as in rat post-natal (comparable to human fetal brains) and adult brains [37]. For further details about the effects of maternal cannabis on brain development, please see the 2011 review by Morris et al. [32].
Like cocaine, methamphetamine (METH) readily crosses both the blood-brain barrier and the placenta, thereby having direct effects on the developing brain [213]. Itzhak et al. [236] took adolescent female and male mice, exposed them to METH into adulthood and the mating period with continued exposure during gestation. Hippocampi of the offspring underwent DNA methylation analysis using MeDIP-chip and bisulfite sequencing. METH caused DNA hypermethylation in promoters of genes essential to synaptic plasticity, learning, and memory. Enrichment analysis with histone PTM libraries also showed hypermethylation in promoters that possess active histone PTMs (H3K4me3) and demethylation in promoters that possess silencing histone PTMs (H3K27me3) [236].
Semi-synthetic opioids (heroin), which are metabolized into opiates (morphine and codeine), rapidly cross the placenta, with levels of opiates equalizing between the mother and fetus [213]. Using high-performance liquid chromatography with ultraviolet detection (HPLC-UV), Fragou et al. found no significant change in global DNA methylation in rodent brains exposed to heroin or cocaine [237]. It is not clear why the cocaine result differs from the experiment by Itzhak, but the difference could be due to improper hydrolysis of DNA or perhaps RNA contamination during DNA isolation [235]. MeDIP-chip with bisulfite sequencing is now considered the more appropriate method for DNA methylation studies [238].

MeCP2 in Neurodevelopmental/Neurological Disorders
Among all the MBPs, MeCP2 is considered the best-studied example of the involvement of epigenetics in neurological disorders [239,240]. The role of MeCP2 in multiple cellular processes and its relation to human diseases were discussed above. Even though MeCP2 is ubiquitously expressed, its highest expression is reported in the brain [241], and the majority of the MeCP2-deficient or MeCP2 overexpression disorders show neurological phenotypes. Within the brain, MeCP2 expression is detected in many cell types including neurons, astrocytes, and oligodendrocytes [242][243][244]; alterations in expression within all these cell types have been linked to the development of severe neurological complications [109]. The MeCP2-associated brain disorders can be broadly categorized as neurodevelopmental and neurodegenerative disorders. These disorders can be caused by or associated with loss-of-function mutations of MECP2, altered expression (either upregulation or downregulation), and polymorphisms or sequence variants. Examples of MeCP2-associated disorders are discussed in detail below.

Rett Syndrome
Mutations causing loss of function or loss of expression of MECP2 are the primary cause of Rett syndrome [245,246]. The majority of Rett syndrome-causing mutations are clustered in the MBD and TRD domains, while some mutations are also found in the C-terminus and N-terminus. These mutations have been shown to cause loss of binding to chromatin, loss of interactions with protein partners including the complexes for transcriptional repression, loss of transcriptional repression activity, and loss of MeCP2 PTMs (phosphorylation) required for neuronal activity-dependent regulation of target gene expression [109]. The latest studies on MeCP2 mutations show that the major cause of Rett syndrome is mediated through the loss of the DNA-MeCP2 bridge [247,248]. Known MECP2 mutations can abolish MeCP2 binding to 5mC or 5hmC [109]. Therefore, it is evident that the loss of interaction with methylated DNA and of the methylation-dependent functions of MeCP2 contributes to the pathogenesis of Rett syndrome.

Autism Spectrum Disorders
Illustrating the role of MeCP2 as well as DNA methylation in autism, increased promoter methylation and loss of the binding of CTCF were shown to be correlated with significantly reduced MeCP2 expression in autistic brains [199,249]. MeCP2 overexpression has also been shown to be a cause of autism [250]. Apart from the altered MeCP2 expression, other variations to MeCP2 were also reported in autistic patients. Both loss of function MECP2 mutations [251][252][253][254] and sequence variants are found to be associated with this devastating disease [255,256]. Even though MECP2 duplication syndrome is a different disorder, it shares some of the autistic phenotypes [257,258]. MeCP2 has been linked to autism spectrum disorders through its target genes such as RELN and UBE3A which are epigenetically regulated and are misregulated in autism spectrum disorders [88].

Fetal Alcohol Spectrum Disorders
Prenatal exposure of the embryo to alcohol leads to a spectrum of neurological complications, which are collectively referred to as FASD. The role of DNA methylation in the pathogenesis of FASD and ethanol teratogenesis was discussed in Section 5.3. The first link between FASD and MeCP2 was reported in a patient with a MECP2 mutation in the AT-hook domain within the TRD; this girl showed phenotypes of both Rett syndrome and FASD [259]. The AT-hook domain of MeCP2 is known to mediate MeCP2 binding to DNA [260]. Since then, the potential involvement of MeCP2 in FASD pathogenesis has been studied in many in vivo and in vitro FASD models [207,[261][262][263][264]. Alterations in expression and distribution of MeCP2 have been reported in these FASD models in response to ethanol exposure. Ethanol seems to alter the expression of MeCP2 in a cell type-specific as well as brain region-specific manner. For example, ethanol downregulated MeCP2 in the cortex and striatum of ethanol-exposed offspring [261,265]. Reduced MeCP2 levels were also reported in the ethanol-exposed hypothalamus, which was corrected through choline treatments [266]. In contrast, in the ethanol-exposed hippocampus, ethanol upregulated MeCP2 significantly [264,267]. In the ethanol-exposed mouse dentate gyrus, the expression of MeCP2 was reduced and the distribution of MeCP2 was altered [207].

Other
Apart from the above-discussed neurological conditions, MeCP2 is implicated in many other neurological, neurodevelopmental, and neurodegenerative disorders. Alzheimer's disease is an example of the association of MeCP2 with neurodegenerative disorders. MeCP2 can contribute to the pathogenesis of Alzheimer's disease through sequence variants of MECP2 [268] as well as through regulation of its target gene BDNF, which is a gene of intense interest in Alzheimer's disease [268,269]. Mental retardation is a major phenotype seen in many MeCP2-related disorders including Rett syndrome and autism. MECP2 mutations are also found in X-linked mental retardation [270] and AS [271]. Reduced levels of MeCP2 expression were also reported in cases of Down syndrome and PWS [249].

DNA Methylation as a Biomarker for Human Diseases
A biomarker is defined as a measurable indicator of a biological state or disease. DNA methylation can be considered a hallmark of many human diseases and, thus, can be utilized as a potential biomarker. The application of DNA methylation to identify diseases has already been reported in diabetes, neurological disorders, and cardiovascular disorders, and is extensively applied in cases of cancer [272]. DNA methylation is advantageous compared to expression-based biomarkers for several reasons: it can be easily amplified by PCR-based methods from small numbers of cells, and it can be detected in samples collected by non-invasive methods (saliva, blood, urine, semen) [272]. However, differential methylation profiles in different cell types make it challenging to use as a biomarker, since heterogeneity of the sample might provide biased or inaccurate results. In order to address this issue, several bioinformatics tools have been developed to deconvolute the methylation data obtained from a heterogeneous population into individual cell types [273,274].
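The idea behind such deconvolution can be illustrated with a deliberately simplified Python sketch. It assumes that the sample contains only two cell types, that a reference methylation profile is available for each, and that the methylation level measured at every CpG is a linear mixture of the two references. Under those assumptions the fraction of one cell type has a closed-form least-squares estimate. The function name and all numerical values are hypothetical; the published tools [273,274] address many cell types and more realistic noise, and this sketch is not a description of their algorithms.

```python
def estimate_fraction(mixed, ref_a, ref_b):
    """Least-squares estimate of the fraction of cell type A in a mixture.

    Model (an assumption of this sketch): at every CpG,
    mixed = f * ref_a + (1 - f) * ref_b, with 0 <= f <= 1.
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(mixed, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference profiles are identical; f is not identifiable")
    # Clip to the biologically meaningful range.
    return min(1.0, max(0.0, num / den))


# Hypothetical methylation levels (0 to 1) at four CpGs.
ref_a = [0.9, 0.1, 0.8, 0.2]        # reference profile, cell type A
ref_b = [0.1, 0.7, 0.3, 0.6]        # reference profile, cell type B
mixed = [0.3, 0.55, 0.425, 0.5]     # sample built as 25% A + 75% B

print(estimate_fraction(mixed, ref_a, ref_b))
```

Only CpGs at which the two reference profiles differ carry information about the mixture, which is why cell type-specific differentially methylated sites are the natural input for this kind of estimate.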

Type 2 Diabetes and the DNA Methylome
Type 2 diabetes mellitus (T2DM), the most common form of diabetes, has become a major concern worldwide. A recent study in England has reported that the percentage of the population with pre-diabetes has dramatically increased over the past few years [275]. Similar trends are observed worldwide. In Canada, there are more than 60,000 new cases of T2DM yearly. Lifestyle, diet, environment, and genetics contribute to the onset of T2DM. Genome-wide association studies have identified T2DM risk genes, which have roles in pancreatic beta-cell mass and/or function [276]. Several studies have reported that epigenetic changes are also involved, with alterations in DNA methylation patterns having been reported in the beta cells of the pancreas and in the white blood cells of patients with type 2 diabetes [277,278].
The first detailed DNA methylation profiling in the pancreatic islets of T2DM patients (five) and non-T2DM healthy donors (eleven) was reported by Volkmar et al. [276]. Using the Illumina Infinium HumanMethylation27 BeadChip array, Francois Fuks' team identified 276 CpG loci associated with the promoters of 254 genes that had differential DNA methylation in the diabetic islets, with the majority (96%) showing decreased methylation in the T2DM sample [276]. The differentially methylated genes have roles in beta-cell function, cell death, and adaptation to metabolic stress. Global changes in DNA methylation were not observed; rather, the changes were gene specific. Importantly, these differentially methylated CpGs did not overlap with known SNPs. Several of these differentially methylated sites were validated by bisulfite pyrosequencing. In contrast to pancreatic beta-cells, minimal alterations in methylation patterns were observed in the peripheral blood leukocytes from T2DM patients and controls.
The reduced methylation of the upstream promoter region CpG in the T2DM beta-cells often correlated with increased expression of the associated gene; however, as noted by the authors, there is not a simple relationship between reduced promoter DNA methylation and increased transcriptional activity. Further analyses of the differentially methylated upstream promoter regions identified the GATA transcription factor-binding motif.
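For readers unfamiliar with array-based methylation data, the following minimal Python sketch shows how a methylation level is commonly summarized for a single CpG probe on Illumina Infinium arrays: the beta value, defined as the methylated signal divided by the sum of the methylated and unmethylated signals plus a small constant offset (100 by convention). The group comparison, the 0.05 threshold, the function names, and all numbers are hypothetical simplifications. The studies discussed here relied on formal statistical testing, which this sketch does not reproduce.

```python
def beta_value(meth, unmeth, offset=100):
    """Beta value for one probe: M / (M + U + offset), ranging from 0 to 1."""
    return meth / (meth + unmeth + offset)


def delta_beta(case_betas, control_betas):
    """Difference in mean beta value (case minus control) for one CpG."""
    mean_case = sum(case_betas) / len(case_betas)
    mean_control = sum(control_betas) / len(control_betas)
    return mean_case - mean_control


def classify(delta, threshold=0.05):
    """Label a CpG by direction of change; the threshold is an arbitrary assumption."""
    if delta <= -threshold:
        return "hypomethylated"
    if delta >= threshold:
        return "hypermethylated"
    return "unchanged"


# Hypothetical beta values for one promoter CpG in islets from each group.
t2dm = [0.40, 0.50]
control = [0.70, 0.80]

print(classify(delta_beta(t2dm, control)))
```

Because beta values are bounded between 0 and 1, a difference of a few percentage points at a single CpG can be biologically meaningful without any detectable change in global methylation, which is consistent with the gene-specific pattern described above.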
In a more recent analysis of the DNA methylome of pancreatic beta cells from T2DM patients, Dayeh and colleagues used the Illumina Infinium HumanMethylation450 BeadChip [277]. As in the Fuks study, global changes in DNA methylation were not observed (pancreatic islets from fifteen T2DM and thirty-four non-diabetic donors). The authors identified 1,649 differentially methylated CpG sites, with 1,008 located in or near 853 unique genes, some of which have functions in pancreatic islets, exocytosis, and apoptosis, and 561 being intergenic. Of the 276 differentially methylated CpG sites identified in the Fuks study, 71 sites were found to be differentially methylated in this more recent study. The majority (97%) of the differentially methylated sites were methylated at a lower level in the T2DM islets. Chromosomes 1 and 2 had more differentially methylated sites than the other chromosomes, with the differentially methylated sites being underrepresented on Chromosome 19. Of the 853 differentially methylated genes, 102 genes were differentially expressed in T2DM compared to non-diabetic islets. As in the Fuks study, the majority (75%) of these genes had decreased methylation and increased expression in the T2DM islets. The differentially methylated genes have roles in cancer, axon guidance, MAPK signaling, focal adhesion, and actin cytoskeleton. An overrepresentation of CpG sites in the 5′ UTR was observed among the genes showing this inverse relationship. Interestingly, there was an overrepresentation of differentially methylated CpG sites in the gene body; these differentially methylated sites could possibly alter splicing of the transcripts produced by these genes in T2DM relative to non-diabetic islets.
Differential methylation was observed for 17 T2DM candidate/risk genes, including IRS1, FTO, and TCF7L2. The authors proposed that disease susceptibility might be influenced by a combination of epigenetic and genetic events. In a previous study, these authors showed that around 50% of the SNPs associated with T2DM were CpG-SNPs, which could delete or introduce methylated sites; this observation highlights the importance of knowing whether a SNP is responsible for a change in the methylation status of a CpG [125,279].
In contrast to the results reported by the Fuks study, Toperoff et al. identified DNA methylation variations that served as an early marker of T2DM in peripheral blood leukocytes [278]. The authors used a two-step approach: first, a comparison of pooled DNA from T2DM patients with that from non-diabetic controls identified differentially methylated regions; second, a deep sequencing approach was used to explore specific regions of interest. Pooled DNA from peripheral white blood cells of T2DM patients (710) and non-diabetic controls (459) was analyzed by digestion with methylation-sensitive restriction enzymes followed by hybridization to Affymetrix SNP6 microarrays. Linkage disequilibrium blocks containing T2DM-associated sequence polymorphisms were enriched in the differentially methylated regions. Further analyses of these regions by bisulfite conversion and deep sequencing identified differentially methylated CpGs. One of these differentially methylated sites was found in the intron of the FTO gene, which has a T2DM-associated polymorphism (A is the risk allele versus G) positioned 11 bp upstream from the differentially methylated CpG. The risk allele was found to be significantly hypermethylated compared to the other allele [278,279]. However, this risk allele was hypomethylated in DNA from T2DM patients.
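As a rough illustration of how an allele-specific comparison at a single CpG can be expressed, the Python sketch below computes the methylation level at a site as the fraction of bisulfite sequencing reads that retain a cytosine (methylated cytosines are protected from conversion, whereas unmethylated cytosines are read as thymine). The read counts are hypothetical placeholders, not data from Toperoff et al., and the function name and the assumption that reads have already been assigned to the A or G allele are ours.

```python
def methylation_fraction(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of reads at one CpG that retained C after bisulfite conversion."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no reads cover this CpG")
    return methylated_reads / total


# Hypothetical (methylated, unmethylated) read counts at the CpG,
# split by the allele carried on the same read.
counts_by_allele = {"A": (80, 20), "G": (60, 40)}

levels = {
    allele: methylation_fraction(meth, unmeth)
    for allele, (meth, unmeth) in counts_by_allele.items()
}
difference = levels["A"] - levels["G"]

for allele, level in levels.items():
    print(f"allele {allele}: {level:.2f}")
print(f"difference (A minus G): {difference:.2f}")
```

In practice such a difference would be assessed with an appropriate statistical test and across many individuals; the sketch only shows the quantity being compared.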
Interestingly, the DMRs were often co-localized with enhancers and binding sites of methylation-sensitive transcription factors such as USF1/2, MYCN and E2F. The USF1/2 transcription factor has a role in controlling glucose-lipid metabolism in response to insulin and is involved in beta-cell development.
Together, these studies suggest that DMRs in the islets may play a role in the pathogenesis of T2DM. Some of these sites may predispose the individual to T2DM, while others may be a consequence of altered gene expression in the T2DM islets. The differentially methylated sites found in peripheral blood leukocytes, particularly those that alter transcription factor binding to enhancer regions, may also predispose the individual to T2DM. It will be interesting to determine whether these sites are inherited and represent a transgenerational epigenetic modification.

DNA Methylation as a Biomarker for Neurological Disorders
The neurological disorders that show disease-specific DNA methylation patterns and/or changes in methylation-related genes were discussed in previous sections. In such cases, the use of DNA methylation and/or related proteins as biomarkers could be achievable [280,281]. Twin studies are powerful tools for studying the application of DNA methylation as a biomarker: despite their mostly identical genetic basis, twins might display differences in DNA methylation. Easy access to the cells or tissue to be analyzed is one critical requirement for a good biomarker. Except for brain tumors or autopsy studies, obtaining brain samples for biomarker analysis is challenging and in most cases not realistic. Therefore, blood-based DNA methylation was suggested as a non-invasive surrogate biomarker for neurological disorders [281]. In line with this, methylated DNA immunoprecipitation for the DMRs on chromosome 21 was successfully used to detect fetal trisomy 21 in maternal peripheral blood samples [282]. Since some neurological and psychiatric disorders (including depression, schizophrenia, and Parkinson disease) also demonstrate disease-specific methylation patterns, it was suggested that blood-based non-invasive biomarker analysis can be applied in these situations [281].
Moreover, since MECP2 mutations are the primary cause of Rett syndrome, MECP2 mutation status is already used as a biomarker to detect Rett syndrome. However, such biomarkers must be used carefully, because neurological disorders are highly heterogeneous. For instance, there are Rett syndrome cases that lack MECP2 mutations, while the presence of MECP2 mutations is reported in non-Rett syndrome cases [109]. Hence, the use of an epigenetic factor like DNA methylation, which is susceptible to environmental changes, should be carried out with caution and in concert with other analyses to confirm the diagnosis.
In addition, placental expression of many imprinted genes (for example, IGF2, MEG3, and PEG3), which are known to be regulated by DNA methylation-mediated epigenetic mechanisms, was associated with infant neurodevelopmental outcomes [283]. In another study, it was shown that increased promoter methylation of the 11-beta hydroxysteroid dehydrogenase gene (HSD11B2) was associated with lower birth weight and lower quality of movement scores, which are considered predictors of early/infant neurobehavioral outcomes [284]. These examples further strengthen the notion that DNA methylation could be used as a biomarker for neurological disorders. Studies also suggest that not only DNA methylation but also other epigenetic modifications could be used as biomarkers for human diseases. One such epigenetic mechanism involves microRNAs, the expression of which has been linked to early neurobehavioral outcomes. Placental expression of several microRNAs, including miR-16, miR-146a, and miR-182, was correlated with neurobehavioral outcomes such as attention and movement scores [285].

of Tri-Council Stipends (GETS), Canada Research Chairs to James R. Davie and Marc R. Del Bigio. Vichithra R.B. Liyanage and Nanditha Murugeshan are recipients of MHRC-UMGF (Vichithra R.B. Liyanage) and MHRC-MICH-UMGF (Nanditha Murugeshan) studentship awards. The authors acknowledge the strong support of the Manitoba Institute of Child Health and CancerCare Manitoba Foundation.

Authors' Contributions
Each author contributed to the writing of specific sections, with Jessica Jarmasz and Vichithra Liyanage equally contributing the majority of the text in this review.