Genome-Wide Identification and Spatial Expression Analysis of Histone Modification Gene Families in the Rubber Dandelion Taraxacum kok-saghyz

Taraxacum kok-saghyz (Tks), also known as the Russian dandelion, is a recognized alternative source of natural rubber quite comparable, for quality and use, to the one obtained from the so-called rubber tree, Hevea brasiliensis. In addition to that, Tks roots produce several other compounds, including inulin, whose use in pharmaceutical and dietary products is quite extensive. Histone-modifying genes (HMGs) catalyze a series of post-translational modifications that affect chromatin organization and conformation, which, in turn, regulate many downstream processes, including gene expression. In this study, we present the first analysis of HMGs in Tks. Altogether, we identified 154 putative Tks homologs: 60 HMTs, 34 HDMs, 42 HATs, and 18 HDACs. Interestingly, whilst most of the classes showed similar numbers in other plant species, including M. truncatula and A. thaliana, HATs and HMT-PRMTs were indeed more abundant in Tks. Composition and structure analysis of Tks HMG proteins showed, for some classes, the presence of novel domains, suggesting a divergence from the canonical HMG model. The analysis of publicly available transcriptome datasets, combined with spatial expression of different developmental tissues, allowed us to identify several HMGs with a putative role in metabolite biosynthesis. Overall, our work describes HMG genomic organization and sets the premises for the functional characterization of epigenetic modifications in rubber-producing plants.


Introduction
The genus Taraxacum Wigg. (dandelion) is included in the large family of Asteraceae (Compositae), which counts more than 2800 species [1]. Most of these species of dandelions live in the temperate zones of the northern hemisphere, although they are native to Eurasia [2]. Among all the dandelions species, the Russian dandelion (Taraxacum kok-saghyz, Tks) can produce from its roots natural rubber (NR) of excellent quality, quite comparable to that obtained from Hevea brasiliensis, the so-called rubber tree [3]. In Tks, the length of the rubber polymer is greater than that of Hevea, with a molecular weight of approximately 2180 kDa, but its productivity per hectare is lower [4]. Nevertheless, Tks mature roots can contain up to 5% of rubber and 20% of inulin in dry weight in wild plants and, therefore, this dandelion is considered as a valid alternative natural rubber source. Tks can extensively grow both in cold and temperate areas, presents a short life cycle, and can be easily harvested [5]. Moreover, Tks is a species suitable for genetic manipulation and it could be used as a model plant to investigate the molecular pathways that regulate NR biosynthesis [6,7].
NR is a polymer composed of cis-1,4-polyisoprene and it represents a very important raw material, used to produce more than 50,000 both industrial and medical products [8].
Here, we report for the first time an extensive in silico analysis and identification of the histone-modifying genes (HMGs) in Tks. Using publicly available genome and transcriptome data, we analyzed the gene and protein structure of the identified HMGs and monitored their expression in various Tks tissues, including latex. In addition, we measured transcript levels in Tks young and adult leaves and roots and identified those per each class of enzymes that appeared to be abundant in relevant tissues.

HMG Genes Identification
In the Tks genome [9], we identified a total of 154 HMGs: 60 HMTs, 34 HDMs, 42 HATs, and 18 HDACs. We compared the total number with a close relative of Tks, Lactuca sativa (Ls) and with the reference plants Arabidopsis thaliana (At) [43] and Medicago truncatula (Mt) [44]. While some classes were similar in numbers, others showed divergent expansion among different species. PRMTs were indeed more abundant in Tks, indicating an expansion of this class of proteins similarly to Litchi chinensis [45]. Altogether, HAGs were more abundant in both Tks and Ls compared to At, confirming the possibility for this group to expand divergently in different taxa ( Table 1). The complete list of Tks HMGs is reported in Table S1.  PF00583  38  34  3  51  HAM  PF01853  1  2  2  1  HAC  PF08214  3  14  5  11  HAF  PF09247  0  1  2  1   HDAC   HDA  PF00850  9  19  12  10  SRT  PF02146  4  3  2  2  HDT  5  3  4  3   154  181  103  206 Gene duplication events contributed to the high number of HMGs in the Tks genome. Synteny analysis on the Tks genome assembly was not effective for the identification of segmentally duplicated HMGs, while five couples of tandemly duplicated genes were identified: TksHDA5/TksHDA6, TksHAG30/TksHAG31, TksHAG36/TksHAG37, TksJMJ11/TksJMJ12, and TksSDG44/GWHPAAAA005121. In the last pair, GWHPAAAA005121 was not classified as an HMG due to the absence of a representative domain. We identified 32 additional couples of putative duplicated HMGs based on phylogenetic analysis. The Ka/Ks ratio was calculated for all the gene pairs in order to estimate the occurring evolutionary dynamics (Table S3). Most of the couples showed Ka/Ks <1, suggesting purifying or stabilizing selection. Only the TksHAG13/14 pair with a Ka/Ks ratio of 1.54 and TksPRMT7/8 with 1.05 showed values compatible with positive and neutral selection.

HMTs
Among HMTs, 46 were identified belonging to SDG and 14 to the PRMT group. SDGs are similar in number to At and the same amount in Tks and Ls (Table 1). SDGs are divided in seven classes [22,46]. Four TksSDGs cluster with Class I, E(Z)-like SDGs ( Figure 1A) and show the expected SANT-CXC-SET domains except for TksSDG8 lacking the SANT domain ( Figure 2). Five clusters with Class II, ASH1-like. TksSDG29, TksSDG16, and TksSDG19 have the same domain composition of corresponding At proteins. TksSDG15 encodes for a short (52aa) and a probably aberrant protein where only the SET domain was identified. TksSDG37 lacks the PostSET domain compared to AtSDG26. Five proteins cluster with Class III, TRX-like. TksSDG41 differs from AtSDG14 as the first PHD module is substituted by a SAND domain that is often associated to PHDs and could contribute to their function. TksSDG45 showed an additional TUDOR domain at the N-terminal. The TUDOR domain was previously identified in D. melanogaster TUDOR proteins and its function is unknown. Several human JMJ/KDM4 proteins harbour a TUDOR domain at the C-terminal [44]. TksSDG4 clusters with AtSDG25 but differs in domain composition and sequence length: GYF-SET and 1057aa the former, and SET-PostSET 1424aa the latter. This was previously observed in S. lycopersicum (Sl), where AtSDG25 and SlSDG20 showed similar differences [45]. Two Class IV SDGs exist in both Tks and At. Six Tks proteins cluster with class V subclass I and six with class V subclass II with a similar domain composition to At proteins. Eighteen SDGs can be classified as belonging to class VI/VII and show an interrupted SET domain. TksSDG40 is a short protein of 158aa, probably aberrant, and the presence of the SET domain was not confirmed by SMART analysis.
TksSDG5 and TksSDG25 encode for putative long proteins with an additional/unexpected domain composition, probably resulting from exon gain or, more likely, a defect in genome assembly/annotation.
As shown in Figure S1, three PRMTs form a separate cluster: TksPRMT2, TksPRMT7, and TksPRMT8 and contains two PRMT domains ( Figure 2). The PRMT5 domain was not confirmed for TksPRMT9 and TksPRMT11 by SMART analysis. TksPRMT6 shows four C2H2 modules at the N-terminal similarly to HMGs belonging to other groups such as AtSDG6/TksSDG24 ( Figure 2).

HDMs
In Tks, we identified 27 proteins containing the JmjC domain (PF02373) ( Table 1). Despite that nine harbour the JmjC domain alone and could be classified as JMJ-only, only TksJMJ25 and TksJMJ27 clustered with At JMJ-only proteins ( Figures 1B and 2). The other seven are closer to other JMJ groups but underwent the loss of representative domains. In addition, only four out of fourteen proteins show the presence of the RING domain typical of the KDM3 group ( Figure 2). KDM4 JMJs present N-terminal JmjN (PF02375) and JmjC, and C-terminal C5HC2 (PF02928) (subgroup I) or C2H2 (PF00096) (subgroup II). In Tks, two proteins belong to subgroup I and three to subgroup II. Four proteins cluster with KDM5 JMJs and show the expected domain composition ( Figures 1B and 2).
GCN5-type HAGs, TksHAG36, and TksHAG37 have a C-terminal Bromodomain similar to AtHAG1. Three additional proteins clustering with AtHAG1 are shorter and probably underwent loss of the Bromodomain: TksHAG8 193 aa, TksHAG33 250 aa, and TksHAG34 192 aa (Figures 3A and 4 and Table S1).      TksHAG16 clusters with AtHAG2 and harbours the Hat1 domain. TksHAG10 clusters with AtHAG3 and harbours the ELP3 domain. Two additional proteins clustering with AtHAG3, TksHAG27, and TksHAG6 show the AT1 domain alone ( Figures 3A and 4). Furthermore, other AT1 domain-containing proteins can be observed.
A first group of seven proteins forms a definite cluster that is characterized by the presence of Jas and PHD domains except for two shorter proteins, probably derived from the others, that lost specific domains: TksHAG21 and TksHAG25. In TksHAG32, the AT1 domain was not identified as below the threshold for SMART analysis ( Figures 3A and 4).
Ten proteins are characterized by the presence of an additional FR47 domain at the C-terminal. Four of them, TksHAG5, TksHAG11, TksHAG17, and TksHAG19 form a definite cluster. Instead, a second cluster includes TksHAG14, TksHAG15, TksHAG23, TksHAG29, and two additional proteins: TksHAG20 where no domain was identified by SMART, and TksHAG24, where the FR47 domain is missing and there is an N-terminal AAK domain. Interestingly, TksHAG13, containing both AT1-FR47, do not cluster with the other FR47 domain-containing proteins ( Figures 3A and 4).
Two proteins, TksHAG9 and TksHAG26 show two C-terminal BRCT domains and form a cluster with TksHAG30 and TksHAG31 that are shorter and without a BRCT domain. TksHAG12 harbours AT1 and C2 domains. All other TksHAG proteins show the AT1 domain alone (Figures 3A and 4).

HDACs
We identified 18 histone deacetylases (Table 1). Nine are HDAs, three belong to class I, RPD3-like, five to class II, HDAC1-like, and one to class III, HDAC11-like. In class I, TksHDA9, clustering with AtHDA19, is a long protein (1634 aa) showing six additional N-terminal TPR domains probably resulting from exon gain or a defect in genome assembly/annotation. In the class II, TksHDA7 similarly to MtHDA8 [44] harbours an N-terminal Znf_RBZ domain (Figures 3B and S5).
Four proteins are SRTs containing the SIR2 (PF02146) domain. Interestingly, with the exception of TksSRT4, all other SRTs have a double SIR2 domain ( Figures S5 and S6).
Five TksHDTs were identified by BLASTp analysis ( Table 1). The most conserved are TksHDT1 and TksHDT2 if compared with AtHDTs. TksHDT1 showed a C2H2 domain. In TksHDT5, C2H2 modules are six ( Figures S5 and S7). In plants, HD2-Types are related to cis-trans isomerases found in insects and yeast [30]. Similarly, their function as HDACs remains debatable.

In Silico Expression Analysis of HMG Genes in Different Organs and Developmental Stages
To shed light on the spatial expression of the HMGs in Tks, we analyzed publicly available datasets [9], based on the following tissues/organs: leaves, stems, the main root, and the lateral root in both young and mature stages (eight samples); latex and three additional samples from reproductive organs: flower, peduncle, and seeds. Expression patterns of HMGs were evaluated for each gene as log 2 fold change (Fc) compared to the average among samples. Hierarchical clustering was used to assess groups with similar expression patterns.

HMTs
Among SDGs, a first group presents a peak of expression in young/mature leaves compared to other organs. Within this group, TksSDG28 shows an additional peak in the latex. A second large group includes genes with higher expression in reproductive organs, mainly flowers and seeds. In group 3 we identified genes with higher expression in young organs and flowers. Group 4 shows higher expression in stem and roots with lower expression in the latex. The last group harbours the most expressed genes, showing an expression pattern primarily in mature roots ( Figure 5).
TksPRMTs exhibit less modulated expression patterns. Three groups can be identified: the first with higher expression in latex and reproductive organs; the second with a peak in flowers; and the third group with higher expression in stem and roots. In the first group, TksPRMT7 is the one showing the most abundant peak of expression in latex ( Figure 5).

HDMs
Among TksJMJs, a first group includes genes with higher expressions in reproductive organs. In addition to that, TksJMJ22 presents a relevant peak also in the latex, indicating a strong modulation compared to whole roots. A second group includes genes with a higher expression in stem-root and flower, and little expression in latex and leaf. The third group shows a higher expression in stem-root but, differently from most JMJs, not in flower. This group includes JMJs with high levels of expression. The fourth group shows a relevant expression in stem and flower ( Figure 5).
All HDMAs exhibit higher expression in stem, flower, and seeds, with two groups that can be distinguished by high and low expression in mature root and latex ( Figure 5).

HATs
Four groups were identified among TksHAGs: a first one with peaks in the roots (mainly in mature roots); a second group predominantly in reproductive organs; a third with the same pattern with addition of leaves; and interestingly, the fourth group is made of highly expressed genes showing a peak in the latex similarly to the single HAM gene detected by our analysis. HACs are less expressed in leaves and latex ( Figure 6).

HDACs
TksHDAs can be divided into three groups: a first one with peaks of expression in leaves and peduncle; a second group with lower levels in the roots; and a third group more relevant in young and reproductive organs. TksHDA8 shows a peculiar pattern with peaks in the main root and latex. TksSRTs were present in reproductive organs with TksSRT4 and TksSRT2 having an additional peak in mature leaf and in the latex, respectively. HDTs, instead, show peaks of expression in young leaf, stem, lateral root, and in the mature stem with the exception of TksHDT5, which shows peaks in reproductive organs.

HMGs Expression
Using the absolute quantification method, we measured the expression of 13 different TksHMGs belonging to four families in four different tissues: developing leaves (DL), fully developed leaves (FL), root tips (RT), and mature roots (R). Figure 7 shows the spatial gene expression pattern of four HMTs (TksSDG3, TksSDG4, TksSDG24, and TksPRMT9), five HDMs (TksJMJ14, TksJMJ23, TksJMJ25, TksHDMA4, and TksHDMA6), one HAT (TksHAG10), and three HDACs (TksHDA7, TksHDA8, and TksHDA9). Five genes (TksHDMA4, TksHDA9, TksHDMA6, TksJMJ23, and TksHAG10) have at least one tissue expression level higher than 1 femtogram (fg)/µL (Figure 7a). The most expressed gene is TksHDMA4, with an average tissue quantification of 6 fg/µL but no significative modulation between the different tissues. Among the remaining eight genes, TksSDG3 and TksJMJ25 expressions never exceed 0.1 fg/µL (Figure 7b). With an average tissue quantification of 0.05 and 0.01 fg/µL, respectively, they are the lowest expressed HMG genes. Overall, there is no difference of global HMG gene expression between the four different tissues analyzed, with an average expression of 1.38 and 1.29 fg/µL in fully developed leaves (FL) and the root (R), respectively. TksJMJ25 and TksHDA8 expression increased significantly, 319 and 174%, respectively, in FL in comparison to DL. TksHAG10, TksSDG24, and TksPMRT9 are up-regulated in a less strong way. The remaining eight genes confirm an FL up-regulation with respect to DL; however, below a 50% change. TksSDG4, TksHDA8, and TksSDG3 expression increased clearly with a 276, 249, and 104%, respectively, in R compared to RT. TksHDA7 and TksHAG10 are up-regulated by 85 and 56%. As for leaves, all eight remaining genes are up-regulated in R tissue in comparison to RT with a percentage below 50%.

HMG Members in Taraxacum Kok-Saghyz
Natural rubber represents a biopolymer of major importance due to its wide properties that cannot be found in synthetic material, including resilience, elasticity, and abrasion [50]. The demands to develop new sources of natural rubbers have quickly increased over the years due to the constant reductions of petroleum-based materials and the efforts towards renewables [3]. Currently, the main resource for commercial natural rubber, He- Figure 7. qPCR expression analysis of 13 HM genes in Taraxacum kok-saghyz developing leaves (DL), fully developed leaves (FL), root tips (RT), and mature roots (R). (a) Absolute quantification (mean ± SD fg/µL) of HM genes with an average expression above 1 fg/µL; (b) Absolute quantification (mean ± SD ag/µL) of HMG genes with an average expression below 1 fg/µL.

HMG Members in Taraxacum Kok-Saghyz
Natural rubber represents a biopolymer of major importance due to its wide properties that cannot be found in synthetic material, including resilience, elasticity, and abrasion [50]. The demands to develop new sources of natural rubbers have quickly increased over the years due to the constant reductions of petroleum-based materials and the efforts towards renewables [3]. Currently, the main resource for commercial natural rubber, Hevea brasiliensis, has generated major concerns because of its high sensitivity towards pathogen infections [51]. Furthermore, latex can induce several allergic reactions due to their protein content [52]. As an alternative to H. Brasiliensis, two plant species are so far known to produce rubber in similar amounts and a high molecular weight: Parthenium argentatum Gray, also known as guayule, and Taraxacum kok-saghyz, the Russian dandelion [53]. Tks latex, for instance, is broadly investigated for non-medical applications such as tire production [54].
In Tks, rubber biosynthesis occurs through different pathways that are responsible for NR chain elongation, small rubber particles (SRPPs), and rubber elongation factors, respectively [7]. Therefore, to improve rubber quality and quantity, it becomes paramount to functionally characterize the genes involved in those pathways.
Histone modifications are well established to control gene expression [15]. Thus, they represent a promising strategy to manipulate different signalling pathways, unlike the gene-by-gene approach [55].
In this study, we presented for the first time a comprehensive analysis of the HMGs in Tks, identified by the presence of representative domains recognized through their hidden Markov profiles. HMGs sequences and/or composition were compared with the model species A. thaliana and M. truncatula, as well as the close relative L. sativa.
Overall, the identification recorded an increase in the number of HMG genes in Tks (154) compared to A. thaliana (102), which is similar to what is observed in L. sativa and M. truncatula. Due to limitations in the assembly, it was not possible to assess gene duplication.
A closer investigation revealed that such an increase did not occur on all the HMG classes, but it was indeed specific. Thus, while HDMs were similar in numbers (34 in Tks vs. 24 in Arabidopsis), HATs were predominantly higher in the former (42) than in the latter (12). This is not necessarily surprising, as an expansion of HAGs was already revealed in other crops when compared to Arabidopsis (Table 1) [44].
Among the HMTs family, analyses on Tks indicated the presence of 46 genes encoding for SDGs, a similar number to those of lettuce and Arabidopsis. Instead, the PRMTs were doubled up in Tks compared to the other species. Overall, the Tks SDGs showed a similar structure to those already annotated with SANT-CXC-SET domains except for TksSGD8 that showed a shorter structure (Figure 2). SANT domains are essential to ensure histone substrates for the SGD enzymatic activity [56], while the CXC allows binding to RNA molecules as the SET represents the signature domain to cluster the SDGs [57,58]. TksSDG5 appears to be much longer than the other SDGs, perhaps due to fusion between two different proteins ( Figure 2). The presence of a bag6-A domain could implicate a divergent function among the other SDGs, including a possible interaction with heat shock proteins [59]. A functional study is therefore required to address this question.
Deeper investigation on the PRMTs indicated that TksPRMT2, 7, and 8 contained two PRMT domains whilst the majority had only one. Structural analyses combined with domain deletion could retrieve more information on that regard.
Tks HDMs were classified in JMJs and HDMAs. Interestingly, many JMJ-only clustered with the KDM3 groups although they lacked the additional RING domain ( Figure 5). However, they all presented a very extensive introns and exons structure that is typical of the KDM3s [60].
Over 90% of Tks HATs belonged to the HAG group. Similar to previous reports [48], the majority of the members contained only the AT1 protein domain (Figure 4). Whilst TksHAG36 and 37 were identified as GCN5 likes due to the presence of the bromodomain, others included Jas and PhD domains or the FR47, or the BRCT domain in combination with AT1. This appears to be quite divergent from the canonical HAGs structure, suggesting a novel class of HMG HATs in Tks. Indeed, Jas domains are present in JASMONATE ZIMdomain (JAZ) proteins that are repressors of jasmonate (JA) signalling [61]. FR47 domain resembles the C-terminal region of the Drosophila melanogaster hypothetical protein FR47. Interestingly, a member of the HATs in Plasmodium falciparum also contains that domain, suggesting a class of acetyltransferases targeting nonhistone proteins [62].
Our approach identified 18 HDACs in Tks. The three classes showed numbers comparable to Arabidopsis ( Figure S5). Interestingly, TksHDA9, which shows similarities with AtHDA19, appears to have additional domains. AtHDA19 is a major regulator of development and growth [41,63]. Whether TksHDA9 has developed similar functions remains to be assessed.

In Silico and Spatial Analysis of HMGs in Tks Tissues Identified Putative Regulators of NR Biosynthesis
Histone modifications are well known to regulate important functions in plants [15], but very little is known about their expression patterns and behaviour in Tks. To unveil new roles for the HMGs, we took advantage of publicly available datasets in Tks and measured their mRNA levels as fold-change normalized to the average amounts. Among the different tissues analyzed, latex represented the most interesting as it is directly connected to natural rubber production. Indeed, the biosynthesis of NR occurs in the latex of laticifers, organized in rubber particles [64]. Within the SDG class VI, TksSDG28 presented a distinct expression in latex ( Figure 5). Instead, TksSDG3 and 4 showed an abundant level of transcripts in young leaves and roots. Interestingly, TksSDG4 was more abundant in roots, whilst TksSDG3 showed similar levels in both tissues. The Arabidopsis TksSDG3 homolog, SWINGER, is part of a large protein complex that includes VRN2 (VERNALIZATION 2) and VIN3 (VERNALIZATION INSENSITIVE 3), together with FERTILIZATION INDEPENDENT ENDOSPERM (FIE) and CURLY LEAF (CLF). They are responsible for establishing FLC (FLOWERING LOCUS C) repression during vernalization [65]. Notably, the TksSDG4 homolog in Arabidopsis is ATXR7 that instead promotes H3K4me3 on the FLC locus, hence its transcriptional activation [66]. A similar function is associated with AtSDG8, a homolog of TksSDG24 [67]. Whether their function is indeed conserved in Tks and is based on tissue specificity remains to be assessed.
Heatmaps for Tks PRMTs did not show major modulations among tissues, with the exception for TksPRMT9 and TksPRMT11 that portrayed an elevated expression in young leaves ( Figure 5). Real time qPCR analyses of some of the indicated genes revealed a tendency towards root samples. Interestingly, TksSDG4 was expressed primarily in mature roots and TksPMRT9 showed major peaks in fully developed leaves as well as roots, making them interesting targets for a functional approach (Figure 7b).
In silico expression of the histone demethylases revealed a predominance on the reproductive tissues ( Figure 5). However, TksJMJ14 showed a preferential expression in roots. Our qRT-PCR data showed instead that levels were different between root tips and adult roots, suggesting a modulation associated with plant development. Furthermore, it is important to highlight that, while mature roots contain laticifers already producing rubber, in the root tips the laticifers are still developing. Therefore, our qPCR analysis presents dual significance: first, it identifies putative HMGs involved in regulating the expression of genes related to rubber or laticifers production; second, the comparison between the expression in leaves and roots is pivotal to distinguish between overall growth and rubber-specific mechanisms. The analysis of other two JMJs TksJMJ23, a homolog of Arabidopsis ELF6 [67] and TksJMJ25, confirmed in silico results as none of them were modulated but instead they showed similar expression levels among the analyzed tissues (Figure 7a,b). A similar pattern was observed for the HDMAs, where only TksHDMA4 turned out to be the most abundant among the analyzed genes. Interestingly, the heatmaps showed a major expression in the latex tissue.
Heatmaps for histone acetyltransferases indicated only a few HATs were expressed in root tissues, whilst the group 3 was majorly present in young and mature leaves ( Figure 6). We focused our attention on TksHAG10 that indeed showed to be mostly abundant in roots ( Figure 7a). Interestingly, our qPCR data also showed a peak in fully developed leaves that was not observed in the datasets analysis. However, the plant material used in the datasets experiment might not fully correspond to the one obtained for expression analysis.
Among the histone deacetylases, TksHDA9 was the most expressed within the family (Figure 7a). Interestingly, TksHDA9 revealed a drop in expression between leaves and roots, while TksHDA8 was instead more abundant in mature roots. It will be interesting to assess whether TksHDA9 action is counteracted by histone acetyltransferases GCN5-likes.

Plant Material
A Tks plant belonging to the W6-35166 population obtained from USDA-ARS (

In Silico Identification and Analysis of HMG Loci
To identify HMGs loci, the hidden Markov profiles of each gene family were used as a query input for the HMM software against the protein subject datasets of Taraxacum kok-saghyz and Lactuca sativa [9]. Since for the HDT family no PFAM domain is available, we used Arabidopsis HDTs as query input for BLASTp searches against the protein datasets. The identified loci were analysed with the SMART software to characterise the domain composition [68,69]. The cases pointing to two different domains for the same protein region were resolved based on the domain that was associated to the smaller E-value. The intron/exon structures were graphically represented with GSDS 2.0 [70].

Phylogenetic Analysis
The HM protein families identified in this work from Taraxacum kok-saghyz genome annotation database [9] along with HM proteins from Arabidopsis were aligned using ClustalW program in CLC Genomics Workbench version 9.5.2 (Qiagen, Hilden, Germany). A phylogenetic tree was then constructed using a neighbor-joining algorithm and Kimura protein substitution model. Reliability of the internal branching was obtained by a bootstrap test of 1000 replicates.

Transcriptome Analysis
RNA reads included in project PRJCA000437 were downloaded from the National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB) and quality checked using Trimmomatic [77] according to Corchete et al. (2020) [78]. The reads were then aligned to the Tks indexed genome using STAR [79] with default settings. HTSeq-count script [80] was used to determine raw counts for each feature and experiment. The raw count matrix was normalised using edgeR and TMM method [78,81].

RNA Extraction, cDNA Synthesis, and Gene Expression Analysis
Total RNA from the above-mentioned plant material was extracted using innuPREP Plant RNA Kit (Analytik, Jena, Germany) according to the manufacturer's protocol. RNA quality and concentration were estimated by Nanodrop™ 1000 Spectrophotometer (Thermo Fisher Scientific Waltham, MA, USA). A total of 1 µg of total RNA was used for cDNA synthesis using SuperScript IV kit (Thermofisher Waltham, MA, USA) with oligo(dT) 20 as primers in a 20 µL final volume. Primer pairs for SYBR-Green based qPCR have been designed for each candidate gene using Geneious Prime 2022.1.1 (https://www.geneious.com). Each primer has been subjected to off-target analysis using local BLAST-2.11.0+ [72,82,83] against the Taraxacum kok-saghyz genome annotation [9] downloaded from https://ngdc.cncb.ac.cn/gwh/. Pairs with both forward and reverse primers matching an off-target transcript were rejected. Primer sequences are reported in Supplementary Table S2. qPCR was performed using an ABI Prism 7900HT instrument (Applied Biosystems, Waltham, MA, USA) and Platinum SYBR Green qPCR SuperMix-UDG with ROX (Thermofisher, Waltham, MA, USA) following manufacturer's instructions. Reactions were performed in two technical replicates on three biological replicates. The following cycling conditions were used for quantitative PCR: 5 min at 95 • C followed by 45 cycles at 95 • C for 15 s and at 58 • C for 60 s. Melting curve analysis from 60 to 90 • C was performed to monitor the specificity of the amplification. mRNA levels of each genes analysed were calculated using absolute quantification based on the standard curve method [84] as reported by D'Amelia et al., (2014) [85] expect for standard curves obtained by purified and quantified conventional PCR normalised to a concentration of 33 pg/µL and 10-fold serial dilutions (ranging from 10 2 to 10 8 ). Expression data were analysed using Tukey's pairwise test.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/plants11162077/s1, Figure S1: Phylogenetic tree of PRMT proteins of Taraxacum kok-saghyz and Arabidopsis (in bold). The numbers near the tree branches. Figure S2: Phylogenetic tree of HDMA proteins of Taraxacum kok-saghyz and Arabidopsis (in bold). The numbers near the tree branches represent bootstrap values. Figure S3: Domain composition and intron-exon structure of Taraxacum kok-saghyz HAMs and HACs. Figure S4: Phylogenetic tree of HAC proteins of Taraxacum kok-saghyz and Arabidopsis (in bold). The numbers near the tree branches represent bootstrap values. Figure S5: Domain composition and intron-exon structure of Taraxacum kok-saghyz HDAs, SRTs, and HDTs. Figure S6: Phylogenetic tree of SRT proteins of Taraxacum kok-saghyz and Arabidopsis (in bold). The numbers near the tree branches represent bootstrap values. Figure S7: Phylogenetic tree of PRMT proteins of Taraxacum kok-saghyz and Arabidopsis (in bold). The numbers near the tree branches represent bootstrap values. Table S1: List of T. kok-saghyz HMGs.