Genome-Wide Identification and Expression Profiling Analysis of the Trihelix Gene Family Under Abiotic Stresses in Medicago truncatula

The trihelix transcription factor (GT) family is widely involved in regulating plant growth and development and, most importantly, in responding to various abiotic stresses. This study is the first to report a genome-wide identification and analysis of GT family genes in Medicago truncatula. Overall, 38 trihelix genes were identified in the M. truncatula genome and classified into five subfamilies (GT-1, GT-2, SH4, GTγ and SIP1). We systematically analyzed the phylogenetic relationships, chromosomal distribution, tandem and segmental duplication events, gene structures and conserved motifs of the MtGTs. Syntenic analysis revealed that trihelix family genes in M. truncatula shared the most collinear relationships with those in soybean, followed by alfalfa, but very few with those in maize and rice. Additionally, tissue-specific expression analysis suggested that trihelix family genes play various roles in the growth and development of specific tissues in M. truncatula. Moreover, the expression of some MtGT genes, such as MtGT19, MtGT20, MtGT22, and MtGT33, was dramatically induced by drought, salt, and ABA treatments, indicating their vital roles in the response to abiotic stresses. These findings improve the overall understanding of the trihelix family and provide candidate genes for the genetic improvement of stress resistance in legumes.


Introduction
Transcription factors (TFs) are DNA-binding proteins that play pivotal roles in plant growth and development, as well as in responses to environmental stresses [1,2]. TFs regulate the expression of target genes by binding to specific cis-elements in the gene promoter region or by interacting with other TFs [3]. Members of the trihelix TF family have recently attracted increasing attention; they feature a typical helix-loop-helix-loop-helix structure in their DNA-binding domain, which binds a core sequence of 5′-G-Pu-(T/A)-A-A-(T/A)-3′ [4,5]. Because this domain specifically binds the GT elements required for the light response, the family is also called the GT family [6]. The conserved trihelix domain is similar in sequence to the individual repeats of the MYB family and is therefore generally thought to be derived from MYB-like genes [5].
The first trihelix gene to be discovered was the GT-1 transcription factor in pea (Pisum sativum) [7]. Subsequently, orthologous genes were identified in tobacco and Arabidopsis thaliana [8,9]. Early research on GT genes focused on the regulation of light responses [10-12]. In recent years, the biological functions of several trihelix family genes have been discovered, indicating that

Plant Materials and Treatments
M. truncatula (cv. Jemalong A17) seeds were sterilized in 75% ethanol for 5 min, rinsed five times with sterile water, and then placed on moistened filter paper in Petri dishes. They were subsequently cultured in a growth cabinet at 25 °C. For the tissue-specific expression analysis of MtGT genes, 7-day-old seedlings were transferred to a mixture of peat soil and vermiculite (1:1, v/v) for individual pot cultivation (18 cm inner diameter, 20 cm height). Potted seedlings were grown in a greenhouse at 25 °C with a 16/8 h light/dark photoperiod, a relative humidity of 35-40%, and a photon flux density of 450 µmol·m⁻²·s⁻¹. Roots, stems, blades, buds, flowers and seedpods were collected at the pod stage from three individual plants. For the expression analysis of MtGT genes in response to different treatments, 7-day-old seedlings were transferred to flasks containing 1/2 MS liquid medium and grown in a controlled growth chamber under a 16/8 h light/dark regime at 25 °C. Ten days later, plants with the fourth blade expanded were treated with 15% PEG6000, 200 mmol·L⁻¹ NaCl, or 1 mmol·L⁻¹ ABA. Blades were collected at 0 h, 2 h, 24 h and 72 h, in triplicate. All samples were immediately frozen in liquid nitrogen and stored at −80 °C for RNA extraction.

Chromosomal Distribution and Gene Duplication Events Analysis
The chromosomal location information for the 38 trihelix genes was obtained from the M. truncatula genome annotation file in GFF3 (general feature format) downloaded from MTGD. TBtools [37] was used to draw the chromosomal distribution map of the MtGT genes. Gene duplication events among the MtGT genes were detected using the Multiple Collinearity Scan toolkit (MCScanX) [38] with the E-value set to 10⁻⁵.
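As a rough illustration of how gene coordinates can be read from a GFF3 annotation, the following Python sketch extracts the chromosome, start, end and strand of selected genes and counts genes per chromosome. It is not the procedure used in this study (TBtools was used for that); the function name `gene_locations`, the demo records and the gene IDs are hypothetical, and the sketch assumes that gene identifiers are stored in the `ID=` attribute of `gene` features.

```python
import io
from collections import Counter


def gene_locations(gff3_lines, wanted_ids):
    """Map each wanted gene ID to (seqid, start, end, strand) from GFF3 'gene' rows."""
    locations = {}
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue  # keep only well-formed gene features
        seqid, start, end, strand, attributes = (
            fields[0], fields[3], fields[4], fields[6], fields[8]
        )
        # Column 9 holds semicolon-separated key=value pairs, e.g. "ID=geneA;Name=x".
        attr_map = dict(
            pair.split("=", 1) for pair in attributes.split(";") if "=" in pair
        )
        gene_id = attr_map.get("ID")
        if gene_id in wanted_ids:
            locations[gene_id] = (seqid, int(start), int(end), strand)
    return locations


# Hypothetical demo records; real input would be the downloaded annotation file.
DEMO_GFF3 = (
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=geneA;Name=demoA\n"
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=geneA.1;Parent=geneA\n"
    "chr5\tdemo\tgene\t2000\t3500\t.\t-\t.\tID=geneB\n"
    "chr5\tdemo\tgene\t5000\t5600\t.\t+\t.\tID=geneC\n"
)

demo_locations = gene_locations(io.StringIO(DEMO_GFF3), {"geneA", "geneB"})
genes_per_chromosome = Counter(loc[0] for loc in demo_locations.values())
print(demo_locations)
print(genes_per_chromosome)
```

With a real annotation, the same function could be called on an open file handle and a set of the 38 MtGT locus identifiers.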

Gene Structure and Conserved Motifs Analysis
The exon-intron structures of the MtGT genes were visualized from the CDS and the corresponding full-length sequences using TBtools [37]. The MEME tool (http://meme-suite.org/tools/meme) [39] was used to analyze the conserved motifs of the MtGT proteins, with the motif width set to 15-50 amino acids (aa) and the number of motifs set to 10.

Phylogenetic and Collinearity Analysis of MtGT genes
The full-length amino acid sequences of the M. truncatula and Arabidopsis trihelix family proteins were aligned using MUSCLE and visualized with Jalview 2 [40]. MEGA X [41] was used to construct an unrooted phylogenetic tree by the Neighbor-Joining (NJ) method with 1000 bootstrap replicates. The phylogenetic tree was illustrated using the online tool EvolView (http://www.evolgenius.info/evolview/) [42]. To analyze the syntenic relationships of the trihelix family genes between M. truncatula and the Arabidopsis, G. max, Z. mays, O. sativa and M. sativa genomes, MCScanX was used with default settings [38], and the results were visualized with TBtools [37].

Expression Analysis of the MtGT Genes by Real-Time qPCR
Total RNA was extracted with the Eastep® Super Total RNA Extraction Kit (Promega, Beijing, China) following the manufacturer's instructions, and cDNA was synthesized with EasyScript® All-in-One First-Strand cDNA Synthesis SuperMix for qPCR (TransGen, Beijing, China). Quantitative real-time PCR (qRT-PCR) was conducted according to the instructions for the 2×RealStar Green Fast Mixture with ROX II (GenStar, Beijing, China) on an ABI QuantStudio™ 7 Flex RT-PCR system (Applied Biosystems, Foster City, CA, USA). The gene-specific primer sequences used for qRT-PCR are provided in Table S9. MtActin was used as the internal control. Three technical replicates were run for each sample, and relative expression was calculated using the 2^(−ΔΔCT) method [44].
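The relative expression calculation can be sketched in a few lines of Python. The example below is illustrative only: the function name `relative_expression` and the CT values are hypothetical, and it assumes that technical replicates are averaged before ΔCT is computed and that an untreated sample (for example the 0 h time point) serves as the calibrator.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Return the fold change of a target gene by the 2^(-ddCT) method.

    Each argument is a list of CT values from technical replicates.
    The reference gene (the internal control) normalizes each sample.
    """
    # dCT = CT(target) - CT(reference), computed per sample
    delta_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCT = dCT(treated) - dCT(control)
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target amplifies 3 cycles earlier after treatment,
# while the reference gene is unchanged.
fold_change = relative_expression(
    ct_target_treated=[22.0, 22.0, 22.0],
    ct_ref_treated=[18.0, 18.0, 18.0],
    ct_target_control=[25.0, 25.0, 25.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(fold_change)  # 8.0, i.e. an 8-fold induction
```

The calculation assumes roughly equal amplification efficiency for the target and reference primers, which is the usual premise of this method.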

Identification of MtGT Genes in M. truncatula
In this study, 38 non-redundant trihelix genes were identified in the M. truncatula genome through two BLAST methods based on the known trihelix protein sequences of Arabidopsis, and the presence of the trihelix domain was confirmed with both the Pfam and CDD databases. The 38 MtGT genes were then named MtGT1 to MtGT38 according to their order on the chromosomes (Table 1). The predicted physical and chemical properties of the MtGT proteins, including protein length, molecular weight (MW), isoelectric point (pI), and grand average of hydropathicity (GRAVY), are shown in Table 1; the nucleotide and amino acid sequences are provided in Table S1.
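GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a protein, so positive values indicate an overall hydrophobic protein and negative values a hydrophilic one. The sketch below shows that calculation; the tool used to produce the values in Table 1 is not stated in this section, so the function name `gravy` and the demo peptide are illustrative assumptions only.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein_sequence):
    """Return the grand average of hydropathicity of a protein sequence."""
    sequence = protein_sequence.strip().upper().rstrip("*")  # drop a stop symbol
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Hypothetical demo peptide, not an MtGT sequence.
demo_gravy = gravy("MKRLEE")
print(round(demo_gravy, 3))
```

Sequences containing ambiguous residues such as X are rejected here rather than silently skipped; other tools may handle them differently.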

Phylogenetic Analysis and Classification of MtGT Genes
To investigate the molecular evolution and phylogenetic relationships of the trihelix family in M. truncatula, a multiple sequence alignment of 35 A. thaliana trihelix proteins [12,31] and 38 MtGT proteins was performed (Figure S1) and an unrooted phylogenetic tree was constructed (Figure 1). The 73 trihelix proteins were classified into five clades (GT-1, GT-2, SH4, GTγ and SIP1), consistent with previous studies on Arabidopsis [14] and other species, such as soybean [24], rice [17], tomato [21], and wheat [26]. Among these, SIP1, with 14 MtGT members, was the largest cluster, whereas GT-1 and GTγ were the smallest subfamilies, with five MtGT members each. The GT-2 and SH4 clades contained eight and six MtGT members, respectively. This distribution of genes among subgroups was similar to that in Arabidopsis and rice [12,23].

Chromosomal Distribution and Gene Duplication Events of MtGT Family
As shown in the chromosome map (Figure 2), the 38 MtGT genes were unevenly distributed across eight chromosomes. Chromosome 1 contained the most trihelix genes (11), whereas chromosomes 5 and 8 had the fewest (2). To clarify the molecular evolution of the MtGT family, gene duplication events, including tandem and segmental duplications, were analyzed. In this study, we identified three groups of MtGT genes with tandem duplication events   [25], wheat [29] and Fagopyrum tataricum [46]. Among the remaining genes, MtGT19 had the most introns (16); MtGT15, the longest trihelix gene, had 11 introns; and MtGT20 contained four introns (Figure 3b). To further analyze the diversity of the MtGT proteins, the MEME search tool was used to identify 10 conserved motifs (motif 1 to motif 10), shown in Figure 3c; the detailed sequence of each motif is provided in Table S2. Motif 1 was present in all MtGT proteins. Motif 5 was present in almost all members of the GT-1, GT-2, and SH4 clades. Among them, MtGT5, MtGT6, MtGT8, MtGT9, MtGT29, MtGT31, MtGT32, and MtGT36 each had two copies of motif 5, and most of these genes belonged to the GT-2 clade. In addition, GTγ clade genes featured motif 9 at the N-terminus, and all SIP1 subfamily genes contained motif 2 at the N-terminus and motif 6 at the C-terminus.

Evolutionary and Collinearity Analysis within MtGT Genes and Several Species
To further understand the collinearity of the M. truncatula trihelix family, we constructed five comparative syntenic diagrams between M. truncatula and five representative species, including three dicotyledonous plants (Arabidopsis, soybean, and alfalfa) and two monocotyledonous plants (rice and maize) (Figure 4). The details are provided in Table S3. The MtGT genes displayed syntenic relationships of differing degrees with the five species: they shared the most collinear relationships with soybean, followed by alfalfa, and very few with maize and rice. A total of 25 and 22 MtGT genes showed syntenic relationships with soybean and alfalfa, respectively, whereas only one and two genes had collinear relationships with rice and maize, respectively. These results suggest that the study of MtGT family genes can provide a valuable functional reference for legume crops.

Expression Patterns of MtGT Genes in Different Tissues
The tissue-specific RNA-seq expression data for MtGT genes in six tissues (blade, bud, nodule, flower, root, and seedpod) were retrieved from MTGD (http://www.medicagogenome.org/) (Table S4) and are shown as a heat map in Figure 5a. Expression data were not found for MtGT11 and MtGT30; the other 36 MtGT genes showed different expression levels across the six tissues. Some MtGT genes were highly expressed in specific tissues: MtGT8, MtGT9, and MtGT29 were expressed at relatively high levels in roots, flowers, and blades; MtGT12 was specifically expressed in flowers and seedpods; MtGT4 and MtGT35 were expressed at high levels in roots and nodules; and MtGT31 and MtGT32 were highly expressed only in nodules. However, several MtGT genes, such as MtGT1, MtGT27, MtGT20, MtGT15, MtGT22, and MtGT23, showed similar expression levels in all six tissues. In addition, the expression levels of MtGT21 and MtGT24 were very low across tissues (Figure 5a). The expression levels of nine selected MtGT genes were further verified by qRT-PCR in six tissues (root, stem, blade, flower, bud and seedpod) (Figure 5b; Table S5). The tissue expression levels of most selected MtGT genes, except MtGT4, were consistent with the RNA-seq data from MTGD, suggesting that MtGT family members play various roles in specific tissues during the growth and development of M. truncatula.

Expression Profiling Analysis of MtGT Genes in Response to Abiotic Stress
Recent studies have indicated that trihelix genes play crucial roles in plant responses to abiotic stresses. From the M. truncatula Gene Expression Atlas (https://mtgea.noble.org/v3/), we obtained gene chip expression data for 33 MtGT genes in the blades of 28-day-old seedlings under drought and salt treatments (Table S6). Through differential expression analysis of these genes, we found that 12 MtGT genes (MtGT-31 Table S7). The results showed that the expression patterns of most genes under drought and salt stress were in accordance with the gene chip data. In particular, MtGT20, MtGT22, and MtGT33 were dramatically up-regulated by both drought and salt treatments. Interestingly, MtGT33 was remarkably up-regulated in blades at 0-2 h under drought stress but down-regulated at 2-24 h, whereas MtGT20 and MtGT22 were continuously up-regulated under drought stress within 96 h. Under salt stress, however, MtGT20 and MtGT22 were remarkably up-regulated in blades at 2-24 h but down-regulated at 24-48 h, whereas MtGT33 was continuously up-regulated within 48 h. These genes may be involved in regulating the response to abiotic stress.
Figure 6. Expression profiles of the MtGT genes in response to abiotic stresses. Gene chip expression data for 33 MtGT genes in 28-day-old seedlings under (a) drought (40-45% soil water content) and (b) salt (200 mM NaCl) treatments. The relative expression levels were -log2 transformed and visualized as a heat map. The colors vary from blue to red, and the circles from small to large, to represent the scale of the relative expression levels. The expression patterns of nine MtGT genes under drought and salt treatments were validated by qRT-PCR. Expression data were normalized using MtActin as the internal control, and error bars indicate the standard deviation among three biological replicates.

Expression Profiling of MtGT Genes in Response to ABA Treatments
The results above demonstrated that some MtGT genes could be dramatically induced by drought and salt treatments. The phytohormone abscisic acid (ABA), which is considered the core stress signal in plants, plays a critical role in the response to abiotic stresses such as drought, salinity, and chilling [47-49]. Therefore, we further examined the expression profiles of 15 differentially expressed MtGT genes under exogenous ABA treatment by qRT-PCR (Figure 7; Table S8). Most of the genes, particularly MtGT19, MtGT33, and MtGT35, responded strongly to ABA. Their expression levels increased by more than ten-fold, or even hundreds-fold, within 48 h of ABA treatment, indicating that these TFs play vital roles in the response to the ABA stress signal.

Discussion
Recently, several more trihelix TFs have been characterized and shown to play important roles in multiple processes during plant growth and development, such as trichome development [14], shattering of mature seeds during crop domestication [17], morphogenetic control of floral organs [13], and responses to abiotic and biotic stresses [15,18]. In this study, we identified 38 MtGT genes in M. truncatula, a number similar to that of trihelix genes in Arabidopsis (30) [12], rice (41) [23], and tomato (36) [24]. It is also close to the average number of trihelix genes on each subgenome of wheat (31) [29]. The 38 MtGT genes were divided into five clades (GT-1, GT-2, SH4, GTγ, and SIP1) by constructing an unrooted phylogenetic tree together with the trihelix family members of Arabidopsis. At present, the functions of several trihelix family genes, such as PETAL LOSS (PTL) [13] and ARABIDOPSIS 6B-INTERACTING PROTEIN1-LIKE1 (ASIL1) and ASIL2 [50], have been studied in depth in Arabidopsis. Phylogenetic analysis helps to identify MtGT genes that may have functions similar to those reported in Arabidopsis.
We further analyzed the gene structures and conserved motifs of the trihelix family members in M. truncatula, and the results were consistent with the family classification. We found that most MtGT genes in the GT-1, GT-2, and SH4 clades contained the trihelix domain (GT domain), whereas all members of the SIP1 and GTγ subfamilies contained a MYB DNA-binding domain. This is consistent with the hypothesis that the trihelix domain originated from a MYB-like gene carrying only one repeat [5]. Most members classified into the same clade shared similar motif compositions and exon/intron structures, which indicates that specific conserved motifs may play an important role in the function of a particular cluster. Among the five clades, most members of the GT-1 and GT-2 clades shared motifs 1, 4, 5, and 7, and these two clades had higher homology with each other than with the other subfamilies of M. truncatula. This is similar to the results in Arabidopsis, where several AtGT genes in the GT-1 and GT-2 clusters have similar functions: GT-1 and GT-3a of the GT-1 clade and GT-2 and DF1-like of the GT-2 clade are involved in light-induced responses [5,9-11,51], and EMB2746 (GT-1) and EDA31 (GT-2) were identified as essential for Arabidopsis embryo development [52,53]. The motif composition of SIP1, the largest subfamily (most of its members shared motifs 1, 2, 3, and 4), was quite different from that of the other subfamilies and was similar to that reported in cabbage [25], chrysanthemum [28], and wheat [29]. The functions of SIP1 members may be more complex and diverse within the trihelix family of M. truncatula.
Gene duplications, including tandem, segmental, and genomic duplications, have significant impacts on the generation of novel genes and functional diversity, facilitating the evolution and expansion of gene families in plant genomes [25,45]. There were three groups of MtGT genes with tandem duplication events and two pairs of MtGT genes with segmental duplication in the trihelix gene family of M. truncatula. Compared with soybean (67) [27], P. trichocarpa (56) [30], and B. rapa (52) [25], fewer trihelix genes were involved in duplication events in M. truncatula. We speculate that most trihelix family genes in M. truncatula originated from different ancestors and are less conserved. Additionally, these results indicate that the gene functions of the M. truncatula trihelix family may have a high degree of divergence and diversity.
The tissue expression pattern is an important factor in the study of gene function. Based on the RNA-seq data combined with qRT-PCR verification, we hypothesize that MtGT TFs play specific and significant roles in the growth and development of M. truncatula. Most of the MtGT genes presented a tissue-specific expression pattern. In particular, MtGT12 was specifically expressed in flowers and seedpods, and MtGT6 from the GT-2 clade exhibited relatively high expression in flowers; these genes may affect the development of M. truncatula floral organs and embryoid. In Arabidopsis, the PETAL LOSS (PTL) gene, which encodes a GT-2 TF, represses growth in the sepal whorl, and ptl mutants exhibit missing petals and partial fusion of the sepal whorl [13]. EDA31, a close relative of PTL, was found to be involved in embryo sac development, as polar fusion is defective in its mutant [53]. MtGT4 and MtGT35 were expressed at high levels in roots and nodules, and MtGT31 and MtGT32 were highly expressed only in nodules. This is consistent with the nature of M. truncatula as a typical legume with a well-developed root system and strong nitrogen fixation ability. Tissue-specific genes may play vital roles in the growth and differentiation of the corresponding organs or tissues, but further experiments are needed to verify the biological functions of these MtGT genes.
Recent studies have reported that trihelix genes are involved in the ABA signalling pathway in the response of plants to abiotic stresses. The accumulation of ABA causes stomatal closure in guard cells to prevent water loss and regulates the expression of numerous genes to induce various cellular and molecular events, such as the second messenger Ca²⁺ signalling system and the antioxidant enzyme system, to improve stress tolerance [54-56]. Yu et al. reported that ShCIGT, which belongs to the GT-1 subfamily, was involved in the regulation of abiotic stress resistance in tomato by interacting with a mediator of the ABA signal, SNF1-related protein kinase 1 (SnRK1) [57]. Additionally, previous studies have shown that trihelix TFs are involved in Ca²⁺ signal regulation in response to abiotic stress. AtGT-3b was dramatically induced by salt stress and could interact with the GT-1 cis-element of the promoter of SCaM-4 (a CaM isoform), which responds to various environmental stresses in soybean [18]. Overexpression of AtGT2L, which acts as a Ca²⁺-dependent CaM-binding protein involved in the plant stress response, enhanced tolerance to cold and salt stress in Arabidopsis [58]. Furthermore, overexpression of Arabidopsis SIP1 clade trihelix 1 (AST1) could regulate the expression of multiple physiological response genes, including proline biosynthesis genes, LEA family genes, and POD and SOD genes, thereby improving drought and salt stress tolerance [59]. Among the MtGT family genes in this study, MtGT10, MtGT19, MtGT20, MtGT22, and MtGT33 were significantly induced by drought, salt, and ABA treatments, indicating their important roles in abiotic stress response and resistance in M. truncatula. This study provides a valuable gene resource for the molecular genetic improvement of stress resistance in legumes, particularly in alfalfa.
How these genes participate in the ABA signalling response or interact with the Ca²⁺ signalling pathway, which regulatory pathways they are involved in, whether they act as positive or negative regulators, and which functional genes they interact with remain to be investigated.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/11/11/1389/s1, Table S1: Nucleotide and amino acid sequences of 38 MtGTs. Table S2: Conserved amino acid sequences of motifs in MtGTs. Table S3: The collinearity relationships of the MtGT genes with five species. Table S4: The relative FPKM value of 36 MtGT genes in different tissues. Table S5: Expression levels of nine MtGT genes in different tissues by qRT-PCR. Table S6: Expression data of 33 MtGT gene chips in the blades under drought and salt treatments. Table S7: Expression levels of nine MtGT genes under drought and salt treatments by qRT-PCR. Table S8: Expression data of fifteen MtGT genes under exogenous ABA treatment. Table S9: Sequences of 21 gene-specific primer pairs used for qRT-PCR validation. Figure S1: The multiple sequence alignment of the MtGT and AtGT proteins.
Author Contributions: X.L. and K.W. conceived and supervised the project; X.L. and H.Z. performed the experiments and collected the data; X.L. and L.M. analyzed the data and prepared the figures and tables; X.L. wrote the manuscript; L.M. and Z.W. revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.