Flavonoid Biosynthesis Genes in Triticum aestivum L.: Methylation Patterns in Cis-Regulatory Regions of the Duplicated CHI and F3H Genes

Flavonoids are a diverse group of secondary plant metabolites that play an important role in the regulation of plant development and protection against stressors. The biosynthesis of flavonoids occurs through the activity of several enzymes, including chalcone isomerase (CHI) and flavanone 3-hydroxylase (F3H). A functional divergence between some copies of the structural TaCHI and TaF3H genes was previously shown in the allohexaploid bread wheat Triticum aestivum L. (BBAADD genome). We hypothesized that the specific nature of TaCHI and TaF3H expression may be induced by the methylation of the promoter. It was found that the predicted position of CpG islands in the promoter regions of the analyzed genes and the actual location of methylation sites did not match. We found for the first time that differences in the methylation status could affect the expression of TaCHI copies, but not the expression of TaF3Hs. At the same time, we revealed significant differences in the structure of the promoters of only the TaF3H genes, while the TaCHI promoters were highly homologous. We assume that the promoter structure in TaF3Hs primarily affects the change in the nature of gene expression. The data obtained are important for understanding the mechanisms that regulate the synthesis of flavonoids in allopolyploid wheat and show that differences in the structure of promoters have a key effect on gene expression.


Introduction
Many important crop species, including allohexaploid bread wheat (Triticum aestivum L., BBAADD genome, 2n = 6x = 42), have a complex polyploid genome. Regulation of gene expression in polyploid organisms, such as bread wheat, is complicated by the presence of homeologous copies in addition to paralogous ones [1]. This makes polyploid organisms an interesting model for studying regulation of the expression of gene copies, including the effect of DNA methylation [2][3][4][5][6].
Flavonoids, including anthocyanins, are a diverse group of phenolic plant metabolites. Flavonoids act as plant pigments that can stain tissues with various shades of reddish-purple, blue, and pink [7]. However, a significant proportion of flavonoid compounds are colorless.
It was shown that the total TaCHI expression in shoots was higher than in roots, while TaF3H-1 genes had higher expression in the anthocyanin-colored coleoptile compared with the colorless one, and the TaF3H-1 and TaCHI expression in seedlings increased under salinity stress [10,21,25]. However, the regulation mechanisms for tissue-specific expression of these gene copies have been poorly studied.
We hypothesized in this study that the specific nature of the expression manifested by individual copies of the flavonoid biosynthesis genes TaCHIs and TaF3Hs may be related to the difference in methylation patterns of the same copies in different tissues. The homeologous TaCHI-A1, TaCHI-B1, and TaCHI-D1 genes and the paralogous TaF3H-B1 and TaF3H-B2 genes were selected to test this hypothesis. We predicted possible sites of methylation and compared these with the actual location of 5mC in two organs of T. aestivum, in which a significant change in the expression level of TaCHI and TaF3H genes was previously shown. We also examined whether methylation coincides with the binding sites of key transcription factors. The obtained results provide a better understanding of the tissue-specific regulation of gene expression in T. aestivum.
Figure caption (beginning truncated): [...] TaCHI-B1 and TaCHI-D1, plus TaF3H-B1 and TaF3H-B2 genes in 'Saratovskaya 29'. The data concerning (1) TaCHIs expression were obtained using RT-PCR in [15], and (2) TaF3Hs expression were obtained using RT-PCR in [17]. The black circle means strong expression, the grey circle means weak expression, and the white circle means no expression.

Plant Material, DNA Isolation, and Sodium Bisulfite Treatment
The plant material used in this study was the allohexaploid wheat (T. aestivum) cultivar 'Saratovskaya 29'. To isolate DNA from the coleoptile and roots, 15 wheat seeds were germinated on wet filter paper in a Rubarth Apparate climate chamber (RUMED, Lostorf, Switzerland) at 20 °C with a 12-hour photoperiod under plant growth LEDs (RUMED, Lostorf, Switzerland). Total genomic DNA was isolated from the coleoptile (weak anthocyanin coloration) and roots (absence of pigmentation) (Figure 2) using the DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany) on the fifth day after germination. The EpiTect Fast Bisulfite Kit (QIAGEN, Hilden, Germany) was used to treat 1 µg of the genomic DNA from each sample with sodium bisulfite according to the manufacturer's instructions.
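The effect of the bisulfite treatment described above on a sequence can be illustrated with a minimal in silico sketch. It assumes an idealized, complete conversion read from the top strand after PCR: every unmethylated cytosine appears as thymine, and every methylated cytosine remains cytosine. The function name, the toy sequence, and the methylated position are hypothetical and are not taken from the study.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the top-strand read expected after idealized bisulfite treatment and PCR.

    Unmethylated C is read as T; a C whose 0-based index is listed in
    methylated_positions is protected and stays C. (Illustrative assumption:
    conversion is 100% efficient.)
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("T")  # unmethylated cytosine: deaminated, amplified as T
        else:
            out.append(base)  # methylated cytosine or any other base is unchanged
    return "".join(out)


# Toy example: only the cytosine at index 1 is methylated.
reference = "ACGTCCGATCAG"
converted = bisulfite_convert(reference, methylated_positions={1})
print(converted)
```

Comparing such a converted read with the untreated reference is what allows methylated cytosines to be located in the sequenced clones.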

PCR, Electrophoretic Analysis, Extraction, and Purification
DNA amplification was performed in 20 µL of PCR mixture with 70 ng of the DNA template, 1 ng of each primer, and HotStarTaq DNA Polymerase (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. After initial denaturation at 94 °C for 15 min, 35 cycles were implemented at 94 °C for 1 min, 50-55 °C for 1 min, and 72 °C for 2 min, followed by a final elongation step at 72 °C for 5 min. Electrophoretic analysis was performed in 1% agarose gel (HydraGene, Co. Ltd., Piscataway, NJ, USA) prepared in TAE buffer (40 mM Tris-HCl, pH 8.0; 20 mM sodium acetate; 1 mM EDTA) with ethidium bromide. Amplified fragments were isolated from agarose gel using the MinElute Gel Extraction Kit (QIAGEN, Hilden, Germany).

Cloning and Sequencing the Amplified PCR Fragments
The amplified PCR products were cloned for each sample using the PCR Cloning Kit (QIAGEN) and Escherichia coli (Migula 1895) Castellani and Chalmers 1919 XL-1 Blue competent cells (Evrogen, Moscow, Russia). Plasmid DNA was isolated using the "diaGene" kit (DIA-M, Moscow, Russia) according to the manufacturer's instructions. Plasmid DNA of 10 positive clones for each PCR product was amplified in both directions using M13 primers. DNA sequencing was performed using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems™, Waltham, Massachusetts, USA) and the SB RAS Genomics core facilities (Novosibirsk, Russia). All sequences obtained were deposited in GenBank (NCBI):

Prediction of CpG Islands
CpG islands are DNA regions with a high frequency of CpG dinucleotides. We predicted possible sites of CpG methylation for the TaCHI and TaF3H genes in silico (Supplementary Materials). The sequences of TaCHI genes with promoters were used for the analysis (TaCHI-A1 JN039037, TaCHI-B1 JN039038, and TaCHI-D1 JN039039). The gene promoters were aligned and compared (Supplementary Materials). The analyzed regions were approximately 1600 bp long, of which about 600 bp were highly homologous promoter sequences. The presence of two islands was predicted for all TaCHIs. In the TaCHI-A1 gene, Island 1 was in the promoter, and Island 2 spanned the end of the promoter and the 1st exon and extended to the middle of the 1st intron. Unlike in TaCHI-A1, Island 1 in the TaCHI-B1 and TaCHI-D1 genes ended at the 1st exon and Island 2 started in the 1st intron (Supplementary Materials).
The promoters and exons of the TaF3H genes (TaF3H-B1 AB223025 and TaF3H-B2 JN384122) differed in length and structure [14]. Thus, the length of the analyzed regions in TaF3H-B1 and TaF3H-B2 also differed. We used the gene sequences with promoters for the analysis, and we found four CpG islands in TaF3H-B1 and three in TaF3H-B2 (Supplementary Materials). Island 1 in TaF3H-B1 started in the middle of the promoter, while the other three islands were in the gene body. Island 1 in TaF3H-B2 started 15 bp before the 1st exon; Islands 2 and 3 corresponded to Islands 3 and 4 in TaF3H-B1, respectively. The difference in the number of islands reflects the differences in the structure of the promoters (Supplementary Materials).
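CpG island prediction of the kind described in this section typically rests on two window statistics: the GC fraction and the ratio of observed to expected CpG dinucleotides, Obs/Exp = (number of CpG × window length) / (number of C × number of G). The sketch below computes both and flags candidate windows. The thresholds (100 bp window, GC > 50%, Obs/Exp > 0.6) are widely used defaults and are assumed here for illustration; they are not stated in the text above, and the function names are hypothetical.

```python
def cpg_stats(window):
    """Return (GC fraction, observed/expected CpG ratio) for one sequence window."""
    window = window.upper()
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Obs/Exp = (CpG count * window length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def candidate_windows(seq, window=100, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Return 0-based start positions of windows passing both thresholds.

    Thresholds are assumed defaults, not values reported in the study.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc > min_gc and oe > min_obs_exp:
            hits.append(start)
    return hits


# Toy sequence: a CpG-rich stretch followed by an AT-rich stretch.
toy = "CG" * 10 + "AT" * 10
print(candidate_windows(toy, window=10, step=10))
```

A full predictor would additionally merge overlapping candidate windows into islands and enforce a minimum island length; that step is omitted here for brevity.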

Analysis of Promoters
In addition to the basic motifs, like CAAT and CGCG boxes, the analysis of promoter elements for the TaCHI and TaF3H genes revealed motifs responsible for light-dependent activation (yellow) as well as TF-dependent elements required for genes involved in the biosynthesis of flavonoid compounds (red), tissue-specific (green), and stress-specific elements (blue) (Supplementary Materials). The most representative elements were DOF, MYB, and MYC/bHLH. Among the light-dependent elements, only the TaCHI genes had SORLIP motifs (Sequences Over-Represented in Light-Induced Promoters). Among the stress-specific elements, the motifs for dehydration response were most typical for the TaCHI genes, and the reaction to heavy metals for TaF3H-B2. The TaF3H-B2 promoter also stood out for the abundance of pollen-specific motifs. The TaCHI genes had multiple root-specific motifs (Supplementary Materials).
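The kind of signal scan summarized above can be sketched as a simple search for short consensus strings in a promoter sequence. The motif table below is a deliberately simplified assumption (plain core sequences for a CAAT box, a GATA box, the ACGT core, and a G-box); it is not the motif set of the databases used in the study, and only the forward strand is searched.

```python
import re

# Simplified, illustrative consensus strings (assumed, not the study's motif set).
MOTIFS = {
    "CAAT-box": "CAAT",
    "GATA-box": "GATA",
    "ACGT core (ACE)": "ACGT",
    "G-box": "CACGTG",
}


def scan_motifs(promoter, motifs=MOTIFS):
    """Return {motif name: [0-based start positions]} on the forward strand.

    A zero-width lookahead is used so that overlapping occurrences are reported.
    """
    promoter = promoter.upper()
    hits = {}
    for name, pattern in motifs.items():
        regex = re.compile("(?=(%s))" % pattern)
        hits[name] = [m.start() for m in regex.finditer(promoter)]
    return hits


# Hypothetical promoter fragment, for illustration only.
print(scan_motifs("TTCAATGGCACGTGATA"))
```

Once motif positions are known, they can be intersected with the positions of methylated cytosines to see which regulatory elements are affected.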

DNA Methylation in the Promoters of the TaCHI and TaF3H Genes
We developed specific primers for the promoters of each analyzed gene (Table 1) in order to verify the presence of CpG islands in the cis-regulatory regions and identify differences in methylation of regulatory elements. The TaCHI-D1 gene, strongly expressed in the coleoptile and roots, was completely unmethylated both in the coleoptile and in the roots (Figure 3, Table 2).
In the TaCHI-A1 gene (Table 2), strongly expressed in the coleoptile and roots, we detected large amounts of individual DNA methylation marks both at the CpG and non-CpG sites from the beginning of Island 2 to the beginning of the 1st exon (Figure 3, Supplementary Materials). In contrast, in the TaCHI-B1 gene (Table 2), strongly expressed in the coleoptile and weakly in the roots, we found stable DNA methylation marks in the roots, but not in the coleoptile, at the beginning of the analyzed region (Figure 3). The marks did not correspond to the predicted islands (Supplementary Materials). Methylation detected in both TaCHI-A1 and TaCHI-B1 affected mostly the MYB TF recognition sites, light-induced elements, and stress-responsive ones (Table 3).
Analyzing the methylation patterns in the TaF3H genes disclosed many methylation marks in Island 1 in the promoter of the TaF3H-B1 gene (expressed in different plant parts, except roots) and few marks in Island 1 in the promoter of the TaF3H-B2 gene (particularly expressed in the roots) (Figure 4, Table 2). The detected methylation affected mostly bZIP and MYB TF recognition sites, and hormone- and stress-responsive elements (Table 3). However, for both TaF3H-B1 and TaF3H-B2, almost no differences were observed between the DNA extracted from roots and that extracted from coleoptiles.
Table 2. The number of methylation sites in the TaCHI and TaF3H genes. The table shows the ratio of the number of the identified methylated sites to the total number of potential CpG, CpHpG, and CpHpH sites in the sequenced E. coli colonies.
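The ratios reported in Table 2 (methylated sites over potential sites, by sequence context) can be derived by comparing each bisulfite-converted clone with the untreated reference. The sketch below is one way to do this under stated assumptions: clones are already aligned to the reference without gaps, only the top strand is considered, and a retained C means methylated while a T means unmethylated. The function names and toy sequences are hypothetical and are not data from the study.

```python
def cytosine_context(ref, i):
    """Classify the cytosine at ref[i] as 'CG', 'CHG', or 'CHH' (H = A, C, or T).

    Returns None when the sequence ends before the context can be resolved.
    """
    nxt = ref[i + 1:i + 3]
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) < 2:
        return None
    return "CHG" if nxt[1] == "G" else "CHH"


def methylation_ratios(ref, clones):
    """Return {context: (methylated sites, potential sites)} summed over all clones."""
    ref = ref.upper()
    counts = {ctx: [0, 0] for ctx in ("CG", "CHG", "CHH")}  # [methylated, total]
    for clone in clones:
        clone = clone.upper()
        if len(clone) != len(ref):
            raise ValueError("clone must be aligned to the reference without gaps")
        for i, base in enumerate(ref):
            if base != "C":
                continue
            ctx = cytosine_context(ref, i)
            if ctx is None or clone[i] not in "CT":
                continue  # unresolved context or a sequencing error at this site
            counts[ctx][1] += 1
            if clone[i] == "C":  # C retained after bisulfite treatment: methylated
                counts[ctx][0] += 1
    return {ctx: tuple(v) for ctx, v in counts.items()}


# Toy reference with one cytosine in each context, and two toy clones.
ref = "ACGTCAGTCTTA"
clones = ["ACGTTAGTTTTA", "ATGTTAGTCTTA"]
print(methylation_ratios(ref, clones))
```

With ten clones per PCR product, as used here, the denominator for each context would be ten times the number of potential sites in the amplified region.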

Discussion
Biological processes are governed by spatial and temporal gene expression that is determined with high accuracy. Regulation of gene expression is an important aspect in the life of all organisms; therefore, its adaptation in the context of evolution is especially significant [33][34][35]. Numerous sequential changes, including gene duplication, are a necessary part of such adaptation. For example, duplication of genes encoding TFs, followed by the structural and functional divergence of the obtained copies, makes an essential contribution to the development of transcription regulatory networks [2,35-38]. The redundancy provided by duplicated genes could contribute to the adaptation of species and genetic resistance to changes in environmental conditions [3].
Normalization of the gene dose occurs due to genetic and epigenetic changes. Methylation in the 5'-region of the gene (including the promoter and part of the transcribed region) and the 3'-region (including part of the transcribed region and 3'-flanking sequences) could inhibit gene expression [39][40][41]. It has been suggested that methylation in the promoter region inhibits the binding of regulatory proteins and, as a consequence, the transcription, while methylation within introns and exons correlates with highly expressed genes [41][42][43]. For example, among three homeologous copies of the LEAFY HULL STERILE1 (WLHS1) gene of the allohexaploid bread wheat T. aestivum, one gene lost its functionality due to a mutation in the functional domain, the other copy was not transcribed due to hypermethylation, and only the third gene retained its functionality [44].
Generally, cytosine methylation makes a universal contribution to the growth and development of eukaryotic cells through the regulation of gene activity. In this study, we sought to determine what underlies the functional divergence of TaCHI and TaF3H gene copies in T. aestivum. We assumed that the cause was the tissue-specific gene expression associated with differences in the promoter structure and the methylation pattern.
Previously, it was shown that TaCHIs were expressed in different parts of wheat independently of the coloration, even though in some intensively colored organs a certain increase in the gene expression can be observed [15]. The promoter structure of TaCHI-B1 differed only slightly from that of TaCHI-A1 or TaCHI-D1, with the presence of additional MYB-recognition elements and a G-box (ACE: ACGT-containing element) [15]. These differences, together with the identified stable methylation marks, are likely to be the cause of a decreased gene expression in roots (Figure 3b, Supplementary Materials). MYB TFs are known to serve as key regulators for the synthesis of flavonoids in various plant organs [21,38,45-48]. Methylation in TaCHI-B1 was stable and appeared in all clones at the beginning of the analyzed region. We assume that this site may be critical for a decrease in the level of gene expression, especially because this region contains MYB-binding domains.
The results obtained for the TaCHI-A1 gene imply that the presence of methylation marks in the promoter and the 1st exon in two clones did not affect the gene expression (Figure 3a). Nevertheless, Shoeva et al. (2014b) noticed that in cv. 'Saratovskaya 29' with weak anthocyanin pigmentation of the coleoptile (which coincides with the genotype studied here), the expression of TaCHI-A1 was weaker than in the near-isogenic line (NIL) 'i:S29Pp1Pp2' with strong anthocyanin pigmentation of the coleoptile. This difference could be explained by the presence of the dominant flavonoid biosynthesis regulator gene TaMyc-A1 (encoding MYC/bHLH TF) in the NIL, which led to a purple color change in many parts of the plant [49,50].
Four copies of the gene encoding F3H were previously identified in wheat. Three copies (TaF3H-A1, TaF3H-B1, and TaF3H-D1) resulted from allopolyploidization of the wheat genome. These genes are co-transcribed and localized in the syntenic regions of chromosomes 2A, 2B, and 2D, showing high similarity in the structure of the coding regions and promoters [9]. The fourth copy (TaF3H-B2) is a paralogous copy in chromosome 2B, differing in structure and transcriptional activity [14].
The presence of a paralogous copy of F3H in wheat and in some of its relatives is unique among plants. Comparison between TaF3H-B2 and TaF3H-1 copies indicates that the conserved amino acid residues necessary for the formation of the enzyme's active sites are still present in TaF3H-B2 [14,51]. However, the fact that TaF3H-B2 is not transcribed during the biosynthesis of anthocyanins, but is transcribed in wheat roots, suggests that TaF3H-B2 probably participates in the biosynthesis of other flavonoid compounds (while the TaF3H-1 genes do exactly the opposite). Khlestkina et al. (2013) and the present study demonstrated a significant difference in the structure of TaF3H-1 and TaF3H-2 promoters. For TaF3H-B1, active methylation both in the coleoptile and roots mainly affected the TF and stress-responsive elements and did not affect MYB-binding sites near the ATG start codon (Figure 4, Supplementary Materials). Moreover, only one type of light-dependent element, GATABOX, required for high-level and tissue-specific gene expression, was methylated (Table 3). We assume that, since the remaining light-dependent elements and TF binding sites were not methylated, the expression of TaF3H-B1 is possible in the coleoptile under the studied conditions.
It was previously reported that TaF3H-1 was expressed only in colored tissues and not expressed in colorless ones, such as the uncolored pericarp or roots of 'Saratovskaya 29' [52]. Other structural genes, such as TaCHS, TaCHI, TaDFR, or TaANS, were transcribed in the absence of anthocyanin pigments, but at a lower level than in intensively colored tissues. The lack of this gene's expression in the pericarp and roots coincided with the pattern of TaMyc-A1 expression, suggesting that the TaF3H-1 genes are the key structural genes that determine the origin of anthocyanin biosynthesis and the main targets of TF [50]. The features associated with the regulation of expression or substrate specificity of TaF3H-B2 have not yet been studied. The weak methylation status in both the coleoptile and roots as well as single changes in the promoter methylation pattern could not be the cause of the difference in TaF3H-B2 activity (Figure 4, Table 3). According to the results of this study, differences in the promoter structure and TF specificity should lead to tissue-specific expression of these genes.
Thus, we assume that DNA methylation, divergence of promoter sequences, and gene expression regulation together underpin the specific pattern of the duplicated TaCHI and TaF3H genes in T. aestivum.
Supplementary Materials: Supporting information can be downloaded from: https://www.mdpi.com/article/10.3390/biom12050689/s1. Supplementary Section S1: The alignment of TaCHI promoters; Supplementary Section S2: The alignment of TaF3H promoters; Supplementary Section S3: CpG island prediction in promoters of TaCHI and TaF3H genes made with MethPrimer 2.0; Supplementary Section S4: The alignment of TaCHI promoter sequences and assembled promoter sequences after bisulfite treatment; Supplementary Section S5: Results of the signal scan search request in promoters of TaCHI and TaF3H genes made with New PLACE and PlantPAN 3.0.