Fighting Against Promoter DNA Hyper-Methylation: Protective Histone Modification Profiles of Stress-Resistant Intestinal Stem Cells

Aberrant DNA methylation in stem cells is a hallmark of aging and tumor development. Recently, we have suggested that promoter DNA hyper-methylation originates in DNA repair and that even successful DNA repair might confer this kind of epigenetic long-term change. Here, we ask for interrelations between promoter DNA methylation and histone modification changes observed in the intestine weeks after irradiation and/or following Msh2 loss. We focus on H3K4me3 recruitment to the promoter of H3K27me3 target genes. By RNA- and histone ChIP-sequencing, we demonstrate that this recruitment occurs without changes of the average gene transcription and does not involve H3K9me3. Applying a mathematical model of epigenetic regulation of transcription, we show that the recruitment can be explained by stronger DNA binding of H3K4me3 and H3K27me3 histone methyl-transferases as a consequence of lower DNA methylation. This scenario implicates stable transcription despite of H3K4me3 recruitment, in agreement with our RNA-seq data. Following several kinds of stress, including moderate irradiation, stress-sensitive intestinal stem cell (ISCs) are known to become replaced by more resistant populations. Our simulation results suggest that the stress-resistant ISCs are largely protected against promoter hyper-methylation of H3K27me3 target genes.

genotype and treatment. Each list contains the peaks that are either consistently detected in both replicates or have a high reliability (MACS fold enrichment > 5.0) Very few of them are located in blacklisted regions ( [7], ≤ 2%). We marked these peaks but did not remove them from the lists. Subsequently, we analyzed whether peaks being located within the promoter region (defined as transcriptional start site (TSS) +/-1000 bases) or gene bodies (defined as region between the TSS and the last base of the gene). The gene reference list containing 31592 RefSeq genes was taken from UCSC Table Browser. To avoid gender-specific artifacts, we excluded genes and peaks of the X-and Y-chromosome from analysis. If a peak had a minimum overlap of 5% with a promoter region and/or a gene body, it was considered as gene associated and the respective gene to carry the modification (present: 1, absent: 0). Total numbers of peaks and numbers of gene associated peaks are shown in Fig. S1B. Effects of genomic stress on H3K9me3 target genes are quantified in Fig. S1C.
Promoter CpG, GpC density. A core promoter was defined as the 1kb broad region [TSS-500bp, TSS+500bp] consistent with our 5hmC analysis (Supplementary IV). CpG and GpC pairs were quantified counting the fraction of C and G bases that is downstream followed by a G or C base, respectively.

RNA-seq, data pre-processing and analysis.
Whole tissue from jejunum was homogenized with the Tissue Lyser (20s at 15 Hz). RNA was prepared following manufacturer's recommendations (Qiagen QIA AllPrep kit) including a DNase I digest and sequenced on an Illumina HiSeq 4000 sequencer with 75bp paired-end, strand-specific, random priming. Raw counts were normalized to the total number of reads resulting in one reads per million (RPM) value per gene. These RPM values were normalized to the genes length (RPKMreads per kilobase million). Boxplots of RPKM values of selected gene sets are given in Fig.   S2B. The transcription profiles over all samples are then clustered into metagene profiles using SOM machine learning, with each metagene serving as a representative of a cluster of combined profiles. We used a grid of 30 x 30 metagenes and applied default parametrization as implemented in oposSOM [8]. Genes with similar transcription in all samples are assigned to the same or neighboring metagenes; the latter forming clusters. Predefined gene sets are often distributed across different clusters as e.g. the GO set 'immune response' (Fig.  S2C).
Quantitative RT-PCR. Intestinal tissue was homogenized in 1ml TRIzol® Reagent (Invitrogen, Germany) and total RNA was isolated according to the manufacturer's instructions. For cDNA synthesis, 1 µg RNA was transcribed using the SuperScript™ IV First-Strand Synthesis System (Invitrogen). Relative levels of mRNA were quantified using an Applied Biosystems 7500 Real-Time PCR System with the ΔΔCt method. Expression of specific genes was normalized to murine Rps29 and fold changes of biological replicates were averaged. All reactions were performed by using technical replicates. Primer sequences are provided in Table S1. Results are summarized in Fig. S2A.

II. Basic assumptions and equations of the epigenetic regulation model (Thalheim et al. [9])
Fig. S3: Regulatory network assumed in the model [9].

Regulation of histone modification:
The binding probabilities of the HMTs,  K (K=4, 27), are calculated assuming a positive feedback between the presence of the histone mark K and the recruitment of its HMTs. Such a feedback has been demonstrated for both H3K4me3 and H3K27me3 [10]. According to this assumption, the binding probability of the HMT of histone mark K depends on the fraction mK (K=4, 27) of modified nucleosomes of the NH cooperative nucleosomes associated with the promoter and is given by: Here, is the ground enthalpy per bound HMT complex and and are the free energies of HMT binding to DNA and to histone mark K, respectively. They are specific for histone mark K and are scaled by the Boltzmann unit. The factor K is (-1) for H3K4me3 and (+1) for H3K27me3. Assuming a dynamic equilibrium between modification and demodification,  K is also given by: where C K is the de-modification constant C K = kde/kmod; i.e. the ratio between the demodification rate kde and the modification rate kmod for the respective modification.
Analytical solutions of the histone methylation machinery (m4, m27) consistently solve Equ. S1 and S2. In dynamic simulations, individual histones are either modified with the probability kmod• K dt or de-modified with probability kdedt per time step dt.
The changes mCpG of the DNA methylation subsequent to cell division (to reach a balance between maintenance and de novo DNA methylation) are than described by: (S3) We assume that the DNMT1 binding probability to the promoter depends on the methylation level mCpG of the CpGs at this promoter: This binding is controlled by two energy constants Em,0 and Em,1 describing accessibility for and binding of DNMT1 to methylated CpGs of the promoter, respectively. According to this feedback, the DNA methylation becomes bistable for defined parameter sets. The range of bistability is controlled by: i) the energy constants (Em,0, Em,1), ii) the maximum probability of maintaining a methyl-CpG in the daughter (Dmain,0), and iii) the effective de novo DNA methylation probability Dnovo(m4, m27). The latter depends on the histone modification states of the gene under consideration [11]: where and are energies modulating the binding of de novo DNMTs to the promoter in presence of the respective modification.
The methylation states of the regulatory states of the systems (m4, m27, mCpG) are given by mCpG=0.

Regulation of gene expression:
We assumed gene transcription to depend on the histone modification levels m4 and m27 of the gene promoter. Accordingly, the transcription of the individual genes is calculated by solving: Here, Pmax is the maximum promoter activity and  the transcript degradation rate. FTF and Fauto are the regulation factors for the TF-network and for the auto-regulation of the gene, respectively. Details about the underlying regulatory principles of the TF-network, which are based on thermodynamics, can be found in [12]. Fauto=1 is set throughout the study. The constant m0 <<1 ensures that genes can be transcribed with a small rate also if the promoter is devoid of H3K4me3 (m4=0).
Analytical solutions for the transcriptional machinery T* require (dT/dt)T* =0. Thus, epigenetic states that are consistent with the transcription machinery represent a small subset of the solutions of Equations S1-3. In dynamic simulations, in addition to the fluctuations of the histone modification states described above, cells underwent a dilution of histone modification during cell division. Modified nucleosomes of the mother cell are randomly distributed onto the daughters and complemented with unmodified ones. This dilution is the main reason for instable modification states, i.e. spontaneous de-modification [13].

III. Basic properties of model genes
The regulatory states of the G1 model gene depend on its integration in the TF-network. As expected, increased transcriptional activation of G1, modelled increasing the regulation factor FTF, increases its transcription and H3K4me3 level, while the H3K27me3 and DNA methylation levels decrease (Fig. S4A). Thereby, the regulatory states in individual cells fluctuate and can considerably deviate from population averaged values. These fluctuations nearly vanish if the genes adopts a high DNA methylation state. The G1 model gene approaches such a state if its DNA methylation, due to fluctuations or external regulation, reaches values above mCpG,C (for parameter set A: 0.3<mCpG,C<0.4, Fig. S4B).

IV. Details on promoter DNA methylation scenarios
Low methylation of G1 promoters might originate in changes of either the maintenance or the de novo DNA methylation activity. However, HCG promoter genes strongly enriched in G1 genes. Accordingly, their promoters show low methylation levels. For such genes, our in silico model predicts a weak dependence of the methylation level on the activity of the maintenance DNA methyltransferase DNMT1 (Fig. S5A). Thus, we focused on changes of the de novo DNA methyltransferases DNMT3a/b; changing the de novo DNA methylation activity Dnovo. Alternatively, low methylation might originate in activated hydroxyl-methylation [11,14]. In this scenario, G1 genes have to be targets of the ten eleven translocation (TET)-pathway. TET-proteins are often recruited by HCG promoters. Thus, we expected strong recruitment to G1 gene promoters in control mice, consistent with their low DNA methylation level. Utilizing published data [15] (E-MTAB-5202), we re-analyzed hydroxyl-methylation (5hydroxymethylcytosine: 5hmC) in intestinal cells. In short, we calculated an average RPM value for the promoter of each gene using the processed RPM values as provided. A core promoter was defined as the 1kb broad region [TSS-500bp, TSS+500bp]. Thus, RPM and RPKM numbers are identically. Actually, we did not found an enrichment of 5hmc in G1 HCG compared to G2 LCG (ICG) promoters (Fig. S5B). So, low DNA methylation specifically at G1 promoters by improved TET recruitment is unlikely although we cannot exclude a contribution of this mechanism. In our in silico model, a decrease of the DNMT1 activity, modelled by decreasing Dmain,0, strongly affects the level of methylation in high methylation states. In contrast, it has only a marginal effect on the level of methylation in low methylation states (cyan box). B) Box-plot of promoter 5hmc levels of selected gene sets in ISCs. The distribution of 5hmC levels for G1 genes is similar to G2 genes and shows a significantly lower mean than a subgroup of genes (Ref) that was predicted to be regulated by the TET-pathway during ISC specification and differentiation [16]. * p<0.1 (Kolmogorov-Smirnov-test) C) The simulated regulatory states depend on the maximum de novo methylation activity Dnovo,0. For constant Dnovo,0=0.3> Dnovo,c, the probability of reaching the high methylation state (e') (compare Fig. 3A) approaches one. Starting from an H3K4me3-H3K27me3 state, the maximum of mCpG is approached with an exponential decay time of 18 (cyan line).