Genome-Wide Analysis of the Trihelix Gene Family and Their Response to Cold Stress in Dendrobium ofﬁcinale

: Trihelix transcription factors play important roles in plant growth, development and various stress responses. In this study, we identiﬁed 32 trihelix family genes ( DoGT ) in the important Chinese medicinal plant Dendrobium ofﬁcinale . These trihelix genes could be classiﬁed into ﬁve different subgroups. The gene structure and conserved functional domain of these trihelix genes were similar in the same subfamily but diverged between different subfamilies. Various stresses responsive cis -elements presented in the promoters of DoGT genes, suggesting that the trihelix genes might respond to the environmental stresses. Expressional changes of DoGT genes in three tissues and under cold treatment suggested that trihelix genes were involved in diverse functions during D. ofﬁcinale development and cold tolerance. This study provides novel insights into the phylogenetic relationships and functions of the D. ofﬁcinale trihelix genes, which will aid future functional studies investigating the divergent roles of trihelix genes belonging to other species. with “macromolecular complex”. Under the molecular function term, 12 genes were annotated to have “nucleic acid binding transcription factor activity”, while only one gene was annotated to have “catalytic activity” (Figure 4 and Table S2). All these results indicate the multi-functions of DoGT genes.


Introduction
Transcription factors are necessary regulatory factors in growth, development and abiotic stress responses of plant [1]. TFs activate or inhibit transcription by specifically combined with gene promoter regions and cis-acting elements [2,3]. Trihelix TF family feature a classic trihelix (helix-loop-helix-loop-helix) domain which binds to GT elements required for the light response, similar to the Myb DNA-bingding domains in sequence, also termed GT factors [4][5][6]. According to the structure of the trihelical domain, trihelix TFs were grouped into five clades named GT-1, GT-2, SH4, GTγ and SIP1 [7]. There are differences between clades, the internal hydrophobic region of each α helix of each trihelix domain in GT-1 and SH4 contains a tryptophan residue [8]. In GT-2 and GTγ, the third conserved tryptophan is replaced by phenylalanine and by isoleucine in SIP1 [9,10]. The most notable feature of GT-2 is that there is another trihelix domain at the C-terminal [11]. Moreover, GT-1, GT-2, GTγ and SIP1 also contain an amphoteric α helix structure (the fourth helix) [12].
Past research has made clear that trihelix genes play important roles in flower development, embryo maturation and seed growth. PETAL LOSS (PTL) gene of Arabidopsis belongs to GT-2 subfamily, can regulate the development of petals, sepals and morphogenesis of floral organ [13][14][15]. Some members of the SIP1 subgroup have been reported to be associated with the process of plant embryo development and cell proliferation [16]. ASIL1 isolated from Arabidopsis thaliana acts as a temporal regulator of seed filling by repressing the expression of master regulatory genes LEC2, FUS3, ABI3 and other genes [17]. In Brassica napus, overexpression of the BnSIP1-1 gene which fell in the SIP1 clade can improve seed germination under osmotic pressure, salt and ABA treatments [18]. The seed shattering of rice is controlled by a single dominant gene SHATTERING1 (SHA1) belonging to the SH4 clade [19]. Except for plant development, trihelix genes also play key roles in plant biotic and abiotic stress responses including pathogen-induced defense programs and response to drought, salt, cold stress et al. [7]. One of the most important functions of trihelix genes seems to be regulation of the cold stress response. Overexpressing of the GT-1 trihelix gene, ShCIGT, could enhance cold and drought tolerance in tomato [20]. The transgenic Arabidopsis plants expressing GmGT-2A and GMGT-2B from soybean displayed strong resistance to freezing stress [10].
Dendrobium officinale Kimura et Migo is a perennial epiphytic herb of Dendrobium in Orchidaceae [21]. As an important traditional herbal plant, D. officinale has over hundreds years of history of medicine in many Asian countries. Under nature condition, D. officinale grows compatibly on damp rock of mountain climates at 500-1600 m or tree trunks in primeval forests in warm and humid environments, so it is quite easily disturbed by abiotic stress, such as high temperature, low temperature, drought and salinity [22,23]. The species is extremely hypersensitive to low temperature above the freezing point, resulting in major yield losses [24]. As a result of its habitat shrinking, human overexploitation and its low natural reproduction rate, slow growth, wild D. officinale were listed in the IUCN red list of threatened species (http://www.iucnredlist.org/details/46665/0 (accessed on 30 April 2004)). Nowadays, D. officinale is commonly produced in green house for saving from cold environment. Although important role of the trihelix genes in plant development and stress resistance and has been investigated in Arabidopsis, rice and soybean, but not yet well studied in D. officinal. Therefore, it is important to reveal molecular mechanism in response to cold stress in D. officinale, not only for breeding cold-tolerance cultivars but also for lifting productivity. The generation of draft genome of D. officinale provides a first-time opportunity to perform a genome-wide identification of trihelix gene family. We comprehensively characterized the number, structure and phylogenetic relationships of the trihelix members throughout the D. officinale genome. We also examined the expression differentiation of the trihelix genes among distinct tissues and under cold stress.

Identification of the Trihelix Gene Family in D. officinale
In order to identify the trihelix gene family, firstly, we downloaded the genome sequences of D. officinale from NCBI (http://www.ncbi.nlm.nih.gov/ (accession codes: PR-JNA262478, accessed on 26 Feb 2016)). Then, the HMM profile (PF13837) were acquired from Pfam database and used to search trihelix domains through HMMER 3.0 software with an E-value < 0.00001 [25]. Finally, after removing the incorrect and redundant members, the candidate trihelix protein sequences were further verified by Pfam and SMART online software [26]. ProtParam (http://web.expasy.org/protparam/ (accessed on 4 June 2020)) was used to compute the pI (isoelectric point), MW (molecular weight) and GRAVY (grand average of hydropathy) of trihelix proteins. The subcellular locations of trihelix members were analyzed using Plant-mPLoc online software [27].
As a control, the trihelix genes of Oryza sativa and Brachypodium distachyon were downloaded from the Rice Information Resource and Phytozome, respectively [28]. In addition, the trihelix genes of B. distachyon (BdGT) were named according to a previous study [29].

Phylogenetic Analysis
A multiple alignment analysis of DoGT genes was run with ClustalW 2.0 and manually corrected using BioEdit 7.1. The phylogenetic tree was builded using MEGA 5.0 based on the NJ and ML methods [30]. The bootstrap values were calculated for 1000 iterations.

Conserved Structures and Motifs of Trihelix Genes
The exon/intron structure of trihelix genes was generated using the GSDS program (version 2.0) with coding and genomic sequences [31]. The motifs of each deduced trihelix protein were analyzed by MEME suit software (version 4.12.0) with parameters as follows: maximum number of motifs, 10 [32].

Promoter Analysis and Gene Ontology (GO) Annotation
For promoter analysis, the 1500-bp upstream sequences of genes were scanned for the cis-elements using PlantCARE web site [33]. Gene ontology (GO) annotation of trihelix proteins was performed by InterproScan and used to predict the functions of DoGT proteins [34]. The GO annotation were then plotted using the OmicShare tool (http://www.omicshare.com/tools (accessed on 9 June 2020)).

Expression Profile Analysis
We downloaded the RNA-seq data titled SRR2014396, SRR2014297 and SRR2014236 from the NCBI to analyze tissue-specific expression patterns of D. officinale trihelix genes. The Illumina data were mapped to the D. officinale genome with HISAT2 (V2.1.0) and then the transcripts were next determined by StringTie (V1.3.5) [35]. The gene expression levels were calculated as the number of fragments per kilobase of gene length per million mapped reads (FPKM). Using the DESeq2 [36] package of R, we analyzed the differential expression patterns of DoGT genes and the Gplots package of R program (https://www.R-project.org/ (accessed on 16 June 2020)) was used to draw the heatmap.

RNA Isolation, Expression and Statistical Analysis
Total RNA of the sample was extracted using a modified cetyltrimethylammonium bromide (CTAB) method [37]. The total RNA was eluted in 30 µL of RNase-free water and stored at −80 • C. The RNA concentration was calculated using a Nanodrop ND2000 spectrophotometer (Nano-Drop Technologies, Wilmington, DE, USA) with RNase-free water as a blank. The cDNA was synthetized through the FastKing gDNA Dispelling RT SuperMix (Tiangen, China).

qRT-PCR Analysis of DoGT Genes
Quantitative real-time PCR primers were designed according to the CDS sequence of DoGT by Primer 6 software (Table S3). The primer mass was tested using PCR amplifcation, agarose gel electrophoresis and melting curve analysis. The qRT-PCR experiment was run on the Bio-Rad CFX96 Real Time PCR System and SuperReal PreMix Plus (SYBR Green, Tiangen, China). The 10 µL reaction volume contained 5 µL 2×SuperReal PreMix Plus, 1 µL of diluted cDNA, 0.5 µL of each primer (10 µM), 1 µL 50×ROX Reference Dye and the addition of ddH 2 O to bring the total volume to 10 µL. The PCR parameters were as follows: 95 • C for 30 s; 40 cycles of 95 • C for 5 s, 60 • C for 30 s, 95 • C for 10 s. According to previous results, GAPDH gene as the internal reference. Relative expression level of the DoGT gene was analyzed by the 2 −∆∆CT method. Three technical replicates were done for each sample. Normalizing all of the data based on setting the expression level at 0 h as a value of 1 for cold stress (values above 1 and below 1 were considered as up-and down-regulated, respectively). For tissue expression analysis we used leaves as control (expression = 1.0) to calculate the fold change in the expression level of the relevant genes.

Identification of the DoGT Genes in D. officinale
Using the consensus sequence of the trihelix domain, a total of 32 non-redundant D. officinale trihelix genes (DoGT) were identified and the detail information are listed in Table 1. The 32 trihelix genes were subsequently renamed from DoGT1 to DoGT32 according to their gene IDs. All of the 32 trihelix proteins contained a typical feature of the trihelix domain confirmed by Pfam and Smart. The detail information of DoGT genes including length, pI, Mw, GRAVY and location are also presented in Table 1. Lengths of the open reading frames (ORFs) of DoGT proteins ranged from 203 to 789 aa with the pI varying from 4.58 to 10.19 and the Mw spanning from 22.01 kD to 85.07 kD. The GRAVY values of these deduced trihelix proteins ranged from −1.414 to −0.244, indicating that they were hydrophilic. Furthermore, the Plant-mPLoc analysis showed that nearly all the DoGT proteins were located in the nucleus except for several members, including DoGT1, DoGT2, DoGT6, DoGT9, DoGT10, DoGT23 and DoGT30 which located in the chloroplast, chloroplast/nucleus/peroxisome, cell membrane/chloroplast, chloroplast/nucleus, chloroplast/nucleus, chloroplast and chloroplast, respectively.

Sequence Alignment and Phylogenetic Analyses of Trihelix TFs
To uncover the classifications of the DoGT proteins, un-rooted Neighbor-Joining (NJ) and Maximum Likelihood (ML) phylogenetic trees were constructed using D. officinale and other species (rice and B. distachyon). The results of these two trees were consistent (Figure 1 and Figure S2). As shown in Figure 1 and Figure S2, the DoGT genes were divide into five subgroups, namely GT-1, GTγ, GT-2, SH4 and SIP1, respectively. This classification was highly similar to previous study in Arabidopsis, rice and wheat [7,29,38]. The 32 DoGT genes were distributed over all of these subfamilies. The distribution trends were similar to those in Arabidopsis and Citrus sinensis [39]: the SIP1 clade was the largest subfamily, containing 15 trihelix genes, whereas the SH4 subgroup was the smallest, only containing two members. This result indicates that DoGT genes are unevenly distributed in the five subgroups. In addition, the phylogenetic tree also revealed paralogous and orthologous relationships among these three species (Figure 1). Seven pairs of paralogous proteins were identified in D. officinale, including DoGT2 and DoGT18, DoGT5 and DoGT21, DoGT7 and DoGT13, DoGT9 and DoGT11, DoGT17 and DoGT29, DoGT23 and DoGT30 and DoGT25 and DoGT27 with strong bootstrap support 60, 100, 99, 100, 97, 100 and 99, respectively. Furthermore, 28 pairs of orthologous genes were found between rice and B. distachyon. Moreover, group GT-1 and GT-2 were located at the bottom of the phylogenetic tree, which is consistent with the hypothesis that GT1 and GT2 diverged early in the existence of land plants [40].

Protein Structure of the DoGT Gene Family
To reveal structural diversification of DoGT genes in D. officinale, we used MEME web site to predict the conserved motifs and a total of ten distinct motifs were analyzed [32,41]. We presented the schematic distribution of these motifs among different gene groups (Figure 2) to show their relative locations within proteins. The multi-level consensus sequences were produced among these motifs (Table 2). In total, the ten motifs were annotated by InterProScan. Almost all DoGT proteins had motif 1 and motif 4, except for DoGT18 (without motif 4, GT-1 subgroup), DoGT22 and DoGT32 (without motif 4, SH4 subgroup). Motif 3 and 7 were only found in clade members of GT-1 and GT-2 except for DoGT1, DoGT2 and DoGT18. Motif 9 was identified in all GT-2 members and some of GT-1 and SH4 subfamilies. Similarly, motif 6 and 8 were identified in all GTγ proteins, while motif 2 and 5 only appeared in SIP1 subgroup, respectively. These results showed that the gene structure and motifs of DoGTs were conserved. As a result, the majority of closely related members in the phylogenetic tree, as expected, had common motif compositions, suggesting their functional conservation. Our phylogenetic analysis results and previous studies clearly showed the reliability of this classification [41,42].     Table 2.

Exon-Intron Organization of DoGT Genes
To further investigate the evolution of DoGT genes, exon and intron structures of the trihelix gene family from D. officinale were visualized by CDS sequences with corresponding genomic sequences. As shown in Figure 3, the exon/intron structures were divergent among the DoGT genes. The number of exons spanned from 1 to 12. 14 DoGT genes (43.75%) contained 1 exons and 9 DoGT genes (28.13%) had 2 exons accounting for the largest proportion, whereas only one DoGT gene harbored 12 exons, respectively. The results showed that the most closely related trihelix members in the same clade showed similar gene structures in intron numbers or exon lengths. The similarity in gene structures was consistent with the phylogenetic analysis. For example, all members of GTγ and SH4 subgroups contained one and two exons. In SIP1 subgroup, almost all DoGTs harbored one to two exons, with the exception of DoGT11 (three exons), DoGT23 (seven exons) and DoGT30 (seven exons).

Cis-acting Elements Analysis and Gene Ontology (GO) Annotation
The cis-elements in promoter regions are associated with gene functions and expression patterns [43]. In order to investigate the evolution and functional divergence of DoGT

Cis-Acting Elements Analysis and Gene Ontology (GO) Annotation
The cis-elements in promoter regions are associated with gene functions and expression patterns [43]. In order to investigate the evolution and functional divergence of DoGT genes, the upstream 1.5 Kb promoter regions of all DoGT genes in the D. officinale genome were analyzed using the PlantCARE online software. In this study, we examined the two types of cis-acting regulatory elements. One was related to plant development, including light responsive (box4, G-box, sp1 and MRE), endosperm expression (GCN4_motif), circadian (circadian control), meristem expression (CAT-box), meristem specific activation (CCGTCC-box) and zein metabolism regulation (O2-site); another one was associated with stress responses, including MeJA (methyl jasmonate) response elements (CGTCA-motif and TGACG-motif), ABA response (ABRE), heat and SA (salicylic acid) response (TCA element), anaerobic induction (ARE), drought stress (MBS), fungal elicitor response (Bow-W1), low-temperature stress (LTR) and so on. Our results showed that most of the DoGT genes contained the box4, MeJA response elements and MYB motifs, while only one DoGT gene (DoGT5) had circadian elements and two DoGT genes comprised MRE motif (Table S1).
In order to further predict the functions of DoGT proteins, gene ontology (GO) annotation analyses were performed. A total of 25 distinct functional groups were identified: 16 involved in biological processes, six involved in cellular components and three involved in molecular functions ( Figure 4). In biological processes, GO classifications of "biological regulation", "cellular process", "metabolic process" and "regulation of biological process" were dominantly attributed. As for genes in the cellular component part, most DoGT genes were annotated with "cell", "cell part" and "organelle", while only three genes were associated with "macromolecular complex". Under the molecular function term, 12 genes were annotated to have "nucleic acid binding transcription factor activity", while only one gene was annotated to have "catalytic activity" (Figure 4 and Table S2). All these results indicate the multi-functions of DoGT genes. ustainability 2021, 13, x FOR PEER REVIEW DoGT genes were annotated with "cell", "cell part" and "organelle", while only genes were associated with "macromolecular complex". Under the molecular fun term, 12 genes were annotated to have "nucleic acid binding transcription factor activ while only one gene was annotated to have "catalytic activity" (Figure 4 and Tabl All these results indicate the multi-functions of DoGT genes.

DoGT Gene Expression Profiling
Previous studies reported that tissue-specific expression pattern of genes coul veal their potential biological functions [44,45]. Therefore, we downloaded the pub

DoGT Gene Expression Profiling
Previous studies reported that tissue-specific expression pattern of genes could reveal their potential biological functions [44,45]. Therefore, we downloaded the public Illumina RNA-seq data from three tissues (leaves, stems and flowers) to investigate the temporal and spatial expression levels in D. officinale. The gene expression profiles were calculated by FPKM value and the default empirical abundance threshold of 1 FPKM was used to evaluate whether a gene was positively expressed or not [46,47]. Of the 32 identified DoGT genes, only five members were actively expressed in all of the three tissues, while the remaining 27 (84.38%) genes were considered not expressed in at least one of the three tissues ( Figure 5). All the top four highly expressed DoGT genes, including DoGT5, DoGT27, DoGT28 and DoGT22, were observed in leaves, indicating their putative functions in the development and other physiological processes in leaves. Moreover, five genes (DoGT4, DoGT19, DoGT24, DoGT29 and DoGT32) were found only up-regulated in flowers and two genes (DoGT22 and DoGT31) shared similar high levels of gene expression in all of the three organs. Genes that are specific highly expressed in one tissue are often found to be able to regulate target genes involved in the processes of plant growth and development [47,48]. Therefore, tissue-specific DoGT genes reported herein might be valuable sources for further investigating their biological functions in the growth and development of D. officinale.

Analysis of DoGT Genes Expression in Tissues
Ten DoGT genes were randomly selected from five subgoups to exhibit their ex sion profiling in D. officinale. The expression patterns of these 10 DoGT genes were e ined in stems, leaves and flowers through qRT-PCR.

Analysis of DoGT Genes Expression in Tissues
Ten DoGT genes were randomly selected from five subgoups to exhibit their expression profiling in D. officinale. The expression patterns of these 10 DoGT genes were examined in stems, leaves and flowers through qRT-PCR.
As illustrated in Figure 6, the expression patterns of those ten DoGT genes had a greater difference in these three organs. Meanwhile, the qRT-PCR analysis and transcriptome data of most genes were consistent. The expression levels of eight genes (DoGT4, DoGT15, DoGT19, DoGT20, DoGT22, DoGT24, DoGT29 and DoGT32) were relatively higher in flowers. For example, DoGT15 showed the greatest high expression in flower more than 57-fold in leaf. In contrast, DoGT14 and DoGT31 were relatively highly expressed in stems but lowly expressed in leaves.

Response of DoGT Genes to Cold Stress
Evidence has shown that trihelix genes important in response to cold stress [20,49,50]. To check the expression pattern of DoGT genes under cold stress, qRT-PCR was used to

Response of DoGT Genes to Cold Stress
Evidence has shown that trihelix genes important in response to cold stress [20,49,50]. To check the expression pattern of DoGT genes under cold stress, qRT-PCR was used to detect the expression levels of DoGTs at 0 • C. As showed in Figure 7, these selected ten DoGT genes showed different expression profiles under cold stress, suggesting that the DoGT genes were sensitive to cold stress. Some DoGT genes in leaves seemed to be more sensitive to cold stress. For example, DoGT24 was significantly up-regulated by about 15-fold after 8 h. Our results also showed that the DoGT genes (DoGT19, DoGT20 and DoGT22) belonging to different subgroups exhibited the same responses to cold stress: initially decreased, then increased at 8 h, and decreased again. Our results suggested that the 10 DoGT members were significantly affected under cold stress and these results indicated the involvement of DoGT gene family in their response against cold stress.

Discussion
Early studies suggested that the trihelix family genes contained three distinctive subfamilies (GTα, GTβ and GTγ) [51]. Then, in 2012, Kaplan-Levy et al. proposed a new clas- Figure 7. The real-time PCR analysis of ten GoGT genes in response to cold stress treatment in D. officinale. Statistically significant differences (Student's t-test) are indicated as followed: ** p < 0.01.

Discussion
Early studies suggested that the trihelix family genes contained three distinctive subfamilies (GTα, GTβ and GTγ) [51]. Then, in 2012, Kaplan-Levy et al. proposed a new classification and divided the trihelix genes into five subgroups [7]. We compared the rice and B. distachyon sequences and constructed phylogenetic trees to classify the genes into five subgroups. The results supported previous findings. Members within same group had a similar gene structure, length and amino acid motif composition, indicating their close evolutionary relationship.
It was reported that genes within the same group might have similar functions because of sequence similarity. For example, OsGTγ-1, OsGTγ-2 and OsGTγ-3 in rice, both belonging to class GTγ, were involved in the stress responses, especially in salt stress [51].
In this current study, GO analyses showed that DoGT genes are grouped in 25 functional groups, including 16 involved in biological processes, six in cellular components and three in molecular functions, indicating the extensive functions of trihelix genes. Consistent with these results, many cis-acting elements were detected in promoters of DoGT genes and most of them were associated with plant growth/development and abiotic stress response.
We examined the expression patterns of DoGT genes in the flowers, stems and leaves of D. officinale. The expression profiles indicated spatial variations of DoGT expression in different tissues. In Figure 5, some DoGTs showed high expression levels in flowers, the same as function of previously report of trihelix genes in flower [51]. However, further research is necessary to make clear the functions of these DoGT genes.
Cold is a major abiotic stress that adversely influence plant growth and development. However, as described previously, D. officinale is often grown in the green house for saving from cold environment [52]. Therefore, understanding how respond and develops tolerance to cold stress is the first step towards improving the adaptation to cold environments. DoGT genes were significantly up-regulated/down-regulated after cold treatments, suggesting that these genes were associated with cold tolerance in leaves. It has been reported that analysis of cis-element distribution could provide insights into the functions of genes. Except for DoGT20, all of the 10 DoGT genes had MYB element which is related to stress response. In addition, DoGT15 also contained a low-temperature response element (LTR). The obtained results may provide a number of the DoGT candidate genes for cold stress, which will facilitate the genetic improvement of cold tolerance in D. officinale.

Conclusions
Here, we first report a genome-wide analysis of trihelix genes in D. officinale. The classification and conserved domain of DoGT proteins, as well as stress-responsive elements in the promoters of DoGT genes were analyzed. Ten of the 32 DoGT genes were significant inducible, the results show that they might participate in response to cold stress. Our works will be helpful to increase knowledge about the molecular mechanism of these gene members in growth development and under cold stress in D. officinale. This research will expand our knowledge of the function of the trihelix family in cold stress regulation in plants.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data used to support the findings of this study are available in this manuscript and supporting materials.