Genome-Wide Identification and Analysis of the GRAS Transcription Factor Gene Family in Theobroma cacao

Abstract: GRAS genes exist widely and play vital roles in various physiological processes in plants. In this study, to identify Theobroma cacao (T. cacao) GRAS genes involved in responses to environmental stress and phytohormones, we conducted a genome-wide analysis of the GRAS gene family in T. cacao. A total of 46 GRAS genes of T. cacao were identified. Chromosomal distribution analysis showed that the TcGRAS genes were unevenly distributed across ten chromosomes. Phylogenetic relationships revealed that GRAS proteins could be divided into twelve subfamilies (HAM: 6, LISCL: 10, LAS: 1, SCL4/7: 1, SCR: 4, DLT: 1, SCL3: 3, DELLA: 4, SHR: 5, PAT1: 6, UN1: 1, UN2: 4). All T. cacao GRAS genes contained the GRAS domain or GRAS superfamily domain. Subcellular localization analysis predicted that TcGRAS proteins were located in the nucleus, chloroplast, and endomembrane system. Gene duplication analysis showed that there were two pairs of tandem duplications and six pairs of segmental duplications, which may account for the expansion of this family in T. cacao. In addition, we predicted the physicochemical properties and cis-acting elements. GO annotation predicted that the TcGRAS genes were involved in many biological processes. This study highlights the evolution, diversity, and characterization of the GRAS genes in T. cacao and provides the first comprehensive analysis of this gene family in the cacao genome.


Introduction
Abiotic stresses including high temperature, drought, cold, and salt have important effects on plant development and growth. Some transcription factors regulate the transcript levels of their target genes under stress by binding to specific DNA sequences in their target promoters [1][2][3]. Therefore, the regulatory networks of various biological processes can be understood by studying plant transcription factors.

Gene | Subfamily | Function | Species | GRAS subfamilies reported in the species | Reference
 | PAT1 | Involved in signaling in Arabidopsis phytochromes | A. thaliana | HAM, LISCL, LAS, SCL4/7, SCR, DLT, SCL3, DELLA, SHR, and PAT1 | [26]
GmGRAS37 | PAT1 | Improved resistance to drought and salt stresses | soybean | AtSCL4/7, Os19, Os4, HAM, DELLA, DLT, AtPAT1, LISCL, AtSCR, AtSCL3, and AtSHR | [10]
StGRAS9 | PAT1 | Responded to plant hormones IAA, ABA, and GA3 treatment | potato | DELLA, LAS, HAM, PAT1, SCR, LISCL, SHR, and SCL3 | [13]

Although the GRAS gene family has been studied for many years, the mechanisms and evolutionary dynamics of this gene family in woody plants are still not fully understood. Differences in the loss and retention of duplicated gene family members between woody and herbaceous species may help in identifying genes with specialized roles in the adaptive evolution of different lineages.
Cacao is called "soft gold" because of its high value. The flowers of cacao trees have ornamental value, and cacao is the main ingredient in chocolate and cacao powder [27]. In addition, cacao beans have important uses in the pharmaceutical and cosmetic industries. Cacao is receiving increasing attention for its potential health benefits because it is rich in polyphenols, particularly flavonoids [28]. However, cacao production is hampered for a number of reasons. Therefore, it is of great value to study the cacao tree. The phylogenetic relationship, conserved domain, and collinearity analyses of this family can provide new ideas for further functional analysis. In 2010, Corti et al. [29] carried out the sequencing and assembly of the T. cacao genome, and since then, researchers have successively identified and analyzed its important gene families, such as the NAC gene family [30], WRKY gene family [31], and the GPX transcription factor family [32]. In 2013, researchers completed an analysis of the metabolome and transcriptome of the cacao tree [33]. Our knowledge about the expansion and diversification of the GRAS gene family in plants is presently limited to the herbaceous species Arabidopsis.
To date, the GRAS gene family has not been identified and classified in T. cacao.
In this study, we identified 46 GRAS gene family members and conducted a comprehensive genome-wide analysis of the GRAS gene family of the cacao tree, including gene structure, conserved domains, intron/exon organization, chromosomal location, subcellular localization, and cis-acting elements of GRAS genes. In addition, we analyzed the phylogenetic relationship of GRAS proteins between T. cacao and A. thaliana. Furthermore, we examined the gene duplication pattern of T. cacao GRAS genes and performed a syntenic analysis of GRAS genes among T. cacao, A. thaliana, Populus trichocarpa (P. trichocarpa), and Sesamum indicum (S. indicum). The results of this study lay the foundation for further studies of the biological function of GRAS genes in T. cacao and provide a reference for subsequent studies of their molecular mechanisms.

Identification of GRAS Gene Family in T. cacao
The genome file, protein file, coding sequences (CDS), and annotation files of T. cacao were obtained from Ensembl Plants (http://plants.ensembl.org/index.html, accessed on 1 June 2022). The Hidden Markov Model (HMM) profile of the GRAS protein domain was downloaded from the Pfam protein family database (release 35.0; http://pfam.xfam.org/, accessed on 1 June 2022) under the accession number 'PF03514' [34,35].
The HMM model of HMMER (version 3.1b2) was used to screen GRAS protein candidate members of T. cacao twice to determine the final target members. Firstly, the downloaded HMM profile was employed using the HMMER v3.3.2 program to search for proteins containing target GRAS domains as the initial filtering results, and ClustalW (version 2.1) was used to perform multiple sequence alignment for the initial target proteins [36]. Secondly, to expand the filtering scope, we constructed a new HMM model with e-value < 1 × 10^−20. The new model was used to filter second target proteins using HMMER (version 3.3.2), with e-value < 0.05. The two results were combined and used as the final candidate proteins. Finally, the NCBI Conserved Domain Search (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi, accessed on 6 June 2022), Pfam Batch Sequence Search (http://pfam.xfam.org/search#tabview=tab1, accessed on 6 June 2022), and the SMART program (http://smart.embl.de/smart/batch.pl, accessed on 6 June 2022) [37] were used to verify the existence of the GRAS domain in each candidate protein sequence. After combining all results, 46 GRAS genes were obtained from the T. cacao genome.
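The combination of the two screening passes can be sketched in Python. This is a minimal sketch rather than the pipeline actually used: it assumes each search was saved in the HMMER tabular (`--tblout`) format, where the first whitespace-delimited column is the target name and the fifth is the full-sequence E-value. The function names, the protein identifiers, and the choice to keep every first-pass hit are illustrative assumptions.

```python
def parse_tblout(lines, max_evalue):
    """Return target names whose full-sequence E-value is below max_evalue.

    Assumes HMMER --tblout layout: column 1 = target name,
    column 5 = full-sequence E-value; lines starting with '#' are comments.
    """
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if float(fields[4]) < max_evalue:
            hits.add(fields[0])
    return hits


def combine_screens(first_pass, second_pass, second_cutoff=0.05):
    """Union of the first-pass hits and the second-pass hits below the cutoff.

    Assumption: all first-pass hits are kept (no cutoff is stated for them).
    """
    first_hits = parse_tblout(first_pass, float("inf"))
    second_hits = parse_tblout(second_pass, second_cutoff)
    return first_hits | second_hits


# Toy records with hypothetical protein identifiers (not real T. cacao data).
FIRST = [
    "# target accession query accession E-value ...",
    "protA - GRAS PF03514 1.2e-80 270.1 0.0",
    "protB - GRAS PF03514 3.0e-25 88.0 0.1",
]
SECOND = [
    "protB - custom - 1.0e-30 101.0 0.0",
    "protC - custom - 0.01 12.0 0.0",
    "protD - custom - 0.2 5.0 0.0",
]

candidates = combine_screens(FIRST, SECOND)
print(sorted(candidates))
```

In this toy input, protD is dropped because its E-value is not below 0.05, while protB appears once in the union although it was found in both passes. The merged candidates would still need the domain verification step described above.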

Phylogenetic Analysis and Classification of TcGRAS Genes
To provide family classification of GRAS genes and understand their phylogenetic relationships, a rooted neighbor-joining (NJ) phylogenetic tree between T. cacao (TcGRAS) and Arabidopsis GRAS proteins was built using the MEGA 11 software (version 11.0.11) [39,40]. The TcGRAS genes were classified according to their phylogenetic relationship with A. thaliana GRAS members. We obtained Arabidopsis GRAS protein sequences from TAIR (https://www.Arabidopsis.org, accessed on 10 June 2022) [9,41]. The protein sequences of both species were aligned by Muscle [42] in MEGA 11 software (version 11.0.11) under the default parameters. The maximum likelihood (ML) method was used with the following parameters: 1000 bootstrap replicates, the Poisson model, and use of all sites. In addition, an individual phylogenetic tree of TcGRAS genes was constructed in the same way and visualized using the online software iTOL (http://itol.embl.de/, accessed on 10 June 2022) [43].

Gene Structure and Conserved Motif Analyses of TcGRAS Genes
The conserved motifs of the TcGRAS proteins were predicted by using the online program MEME (https://meme-suite.org/meme/tools/meme, accessed on 22 June 2022) with the following settings: maximum number of motifs 15, minimum motif width 6, maximum motif width 50, and any number of repetitions [44]. The domain analyses of TcGRAS proteins were performed under the Gene Structure Display Server 2.0 program. The gene structure view function of TBtools (version 0.665) was used to obtain conserved motifs and gene structures.

Chromosomal Mapping and Cis-Acting Regulatory Analyses of TcGRAS Genes
The online program MG2C (http://mg2c.iask.in/mg2c_v2.1, accessed on 15 June 2022) was used to predict the chromosomal position of TcGRAS genes. All the identified genes were mapped to 10 chromosomes according to their chromosomal location information by TBtools. The 2000 bp sequences upstream of the CDS of each TcGRAS gene were extracted by TBtools software (version 1.098696) and then submitted to the online software PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html, accessed on 15 June 2022) [45] to predict cis-acting elements. After filtering and screening, these included light-responsive elements, abscisic acid-responsive elements, MeJA-responsive elements, low-temperature-responsive elements, defense- and stress-responsive elements, gibberellin-responsive elements, and auxin-responsive elements. The Simple BioSequence Gene Viewer function of TBtools software (version 1.098696) was used to visualize the cis-acting elements.
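The upstream-sequence extraction step can be illustrated with a short Python sketch. It is not the TBtools implementation: it assumes 1-based inclusive feature coordinates (as in GFF3 annotation files) and a chromosome held as a plain string, and the function names are hypothetical. The main point is that for a gene on the minus strand the upstream region lies after the feature end on the reference and must be reverse-complemented.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq, start, end, strand, length=2000):
    """Return up to `length` bases upstream of a feature, read 5'->3'.

    start, end: 1-based inclusive coordinates of the feature on the
    reference (assumed GFF3 convention). The region is truncated at
    chromosome ends.
    """
    if strand == "+":
        lo = max(0, start - 1 - length)
        return chrom_seq[lo:start - 1]
    if strand == "-":
        hi = min(len(chrom_seq), end + length)
        return reverse_complement(chrom_seq[end:hi])
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome; a real call would use length=2000.
chrom = "ACGTACGGTTAA"
print(upstream_region(chrom, 5, 7, "+", length=3))  # bases 2-4 on the reference
print(upstream_region(chrom, 5, 7, "-", length=3))  # bases 8-10, reverse-complemented
```

The extracted sequences would then be written to a FASTA file for submission to a promoter-element predictor such as PlantCARE.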

Gene Duplication and Synteny Analyses of TcGRAS Genes
The 'MCScanX' function of the TBtools software with default parameters was used to predict gene duplications of TcGRAS genes. MCScanX Diamond output was used to calculate the replication events of the T. cacao genome. The Duplicate_gene_classifier program in MCScanX (https://github.com/wyp1125/MCScanX, accessed on 22 June 2022) was used to analyze the duplication type of each TcGRAS gene. KaKs_calculator software (version 2.0) [46] was used to calculate the Ka/Ks ratio of tandem repeat gene pairs in the TcGRAS gene, with the following parameters: method of calculation: YN, and genetic code Table 1 (Standard code). The Advanced Circos function of TBtools software (version 1.098696) was used to visualize WGD or segment duplications. The synteny of TcGRAS genes with the GRAS genes of A. thaliana, P. trichocarpa, and S. indicum was visualized by the One-Step MCScanX function of TBtools software. The Dual Systeny Plot for the MCScanX function of TBtools software (version 1.098696) was used to visualize the synteny.
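The conventional reading of a Ka/Ks ratio can be written as a small helper. This is a sketch of the interpretation only, not of the YN estimation performed by KaKs_calculator; the function name, the tolerance used for "approximately 1", and the input values are illustrative assumptions.

```python
import math


def classify_selection(ka, ks, tol=1e-9):
    """Interpret a Ka/Ks ratio.

    Ka/Ks > 1 suggests positive selection, Ka/Ks < 1 suggests purifying
    selection, and Ka/Ks close to 1 suggests neutral evolution.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return ratio, "neutral"
    return ratio, "positive" if ratio > 1 else "purifying"


# Hypothetical Ka and Ks values, for illustration only.
print(classify_selection(0.30, 0.20))
print(classify_selection(0.10, 0.50))
```

Ratios computed from very small Ks values are unstable, so gene pairs with near-zero Ks are usually inspected individually rather than classified automatically.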

GO Annotation Analyses of T. cacao TcGRAS Genes
The DAVID online program was used to annotate TcGRAS genes. The official gene sample lists of TcGRAS genes were uploaded to the program. The analysis included three parts: molecular function, cell components, and biological processes. The R programming language (version 4.1.3) was used to visualize the GO annotation analysis [47].

Identification of GRAS Members in T. cacao
A total of 70 GRAS protein candidates were obtained from the initial filtering. After the second filtering, 53 candidate proteins were obtained. Finally, 46 GRAS genes were identified by rechecking conserved domains and deleting repeats (Supplementary File S1). The identified genes were named from TcGRAS1 to TcGRAS46 according to their chromosomal position. The number of amino acids (aa), average molecular weight (MW), theoretical pI, instability index, and aliphatic index of the identified TcGRAS proteins were statistically analyzed (Table 2). The length of the TcGRAS proteins ranged from 347 (TcGRAS21) to 1659 (TcGRAS22) aa, and the molecular weight ranged from 39,709.90 to 191,183.51 Da. The results showed that 44 GRAS proteins were acidic, with pI values less than 6.5. Two (TcGRAS2 and TcGRAS19) were neutral, with pI between 6.5 and 7.5. The results of the instability index analysis showed that most TcGRAS proteins were unstable, except for TcGRAS8, TcGRAS12, TcGRAS24, and TcGRAS33. Prediction of the subcellular localization of TcGRAS proteins by the online tool BUSCA revealed that 41 TcGRAS proteins were mainly located in the nucleus, 4 in the chloroplasts, and only 1 in the endomembrane system.
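The pI and stability categories used above can be expressed as simple threshold rules. The sketch below is illustrative only: the acidic and neutral ranges follow the text, whereas the "basic" label for pI above 7.5 and the instability-index cutoff of 40 (the usual ProtParam convention) are assumptions not stated in this section, and the example values are hypothetical.

```python
def classify_pi(pi):
    """Classify a theoretical isoelectric point.

    < 6.5 acidic and 6.5-7.5 neutral follow the text; 'basic' for
    values above 7.5 is an assumed label.
    """
    if pi < 6.5:
        return "acidic"
    if pi <= 7.5:
        return "neutral"
    return "basic"


def is_predicted_stable(instability_index, cutoff=40.0):
    """True when the instability index is below the cutoff.

    The cutoff of 40 is the commonly used ProtParam convention and is an
    assumption here.
    """
    return instability_index < cutoff


# Hypothetical values, not taken from Table 2.
print(classify_pi(5.4), classify_pi(6.9), classify_pi(8.2))
print(is_predicted_stable(37.5), is_predicted_stable(52.1))
```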

Phylogenetic Analysis of TcGRAS and AtGRAS
To explore the evolutionary relationship of GRAS proteins between T. cacao and A. thaliana, we performed a multiple sequence alignment of 46 TcGRAS proteins and 34 AtGRAS proteins, and then constructed an unrooted phylogenetic tree using the MEGA 11 software (Figure 1). According to the homology of GRAS proteins in A. thaliana, 46 TcGRAS proteins were divided into 10 clades, which were HAM, LISCL, LAS, SCL4/7, SCR, DLT, SCL3, DELLA, SHR, and PAT1. It is notable that 5 of the 46 TcGRAS proteins were not classified as any of these subfamilies; therefore, we grouped TcGRAS22 as UN1 and TcGRAS11, TcGRAS13, TcGRAS17, and TcGRAS33 as UN2. The largest clade was subgroup LISCL, which contained ten TcGRAS members (TcGRAS8, TcGRAS24, TcGRAS25, TcGRAS27, TcGRAS39, TcGRAS40, TcGRAS41, TcGRAS42, TcGRAS43, and TcGRAS46), whereas subgroups UN1, DLT, SCL4/7, and LAS only had one member. Subgroups UN1 and UN2 only contained T. cacao members, meaning that these genes may have been specialized during the evolutionary process.

Gene Structure, Conserved Motifs, and Domain Analyses of TcGRAS Genes
To understand the structural diversity and similarity of GRAS gene family members in the cacao tree, we used the Gene Structure View function of the TBtools software to construct a combined map of the evolutionary tree, gene structure, and motifs of GRAS gene family members, as shown in Figure 2. We first constructed an individual phylogenetic tree using an NJ method consistent with the phylogenetic analysis between TcGRAS and AtGRAS (Figure 2A). To further understand the characteristics of the GRAS gene family in T. cacao and the conserved motifs shared among different subfamilies, we used the Multiple Expectation Maximization for Motif Elicitation (MEME) program to find the conserved motifs. A total of 10 conserved motifs were predicted and named Motif 1-10 (Figure 2B and Supplementary File S2). Early sequence analysis indicated that the GRAS proteins typically share a variable N terminus and a highly conserved C terminus. In this study, we found that the C-terminal regions contained a highly conserved motif (Motif 6). Three proteins, TcGRAS2, TcGRAS20, and TcGRAS21, did not contain this conserved motif, and we hypothesized that the C-terminal region of these GRAS proteins was truncated, lacking part of the GRAS domain.
We used the Gene Structure Display Server 2.0 program to perform a domain analysis of TcGRAS proteins (Figure 2C). We found a total of eight types of conserved domains. All the GRAS genes contain the GRAS domain or GRAS superfamily domain. In addition, the GRAS members of the DELLA subfamily also contain the DELLA superfamily domain. The TcGRAS2 gene has a GRAS superfamily domain and a TB2_DP1_HVA22 superfamily domain. The TcGRAS22 gene has GRAS superfamily, ZnF_BED, DUF4413, Dimer_Tnp_hAT, and Peptidase_c48 superfamily domains.

Chromosomal Mapping and Cis-Acting Regulatory Analyses of TcGRAS Genes
The location of TcGRAS genes was obtained from genome annotation files. A total of 46 TcGRAS genes were randomly distributed on 10 chromosomes and were named from TcGRAS1 to TcGRAS46 according to their positions on the chromosomes (Figure 3A). Chr01 and Chr04 had the largest number of TcGRAS genes (nine each, 19.57%), followed by Chr09 with eight members (17.39%). In contrast, Chr05, Chr06, and Chr10 contained only two TcGRAS genes each (4.35%). Chr04 contained seven subgroups of TcGRAS genes, as shown in Figure 3B, while Chr05, Chr06, and Chr10 contained only two subgroups each. Subgroup DLT was only observed on Chr03, subgroup SCL4/7 was only observed on Chr01, and subgroup LAS was only observed on Chr07.
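The percentages quoted above follow directly from the per-chromosome counts and the family size of 46. The short Python check below uses only the counts stated in the text; counts for Chr02, Chr03, Chr07, and Chr08 are not given individually, so only their combined total is derived. The variable and function names are illustrative.

```python
TOTAL_TCGRAS = 46

# Per-chromosome counts stated in the text.
stated_counts = {
    "Chr01": 9, "Chr04": 9, "Chr09": 8,
    "Chr05": 2, "Chr06": 2, "Chr10": 2,
}


def percent_of_family(count, total=TOTAL_TCGRAS):
    """Share of the gene family on one chromosome, in percent (2 decimals)."""
    return round(100 * count / total, 2)


for chrom, count in stated_counts.items():
    print(chrom, count, percent_of_family(count))

# Genes left for the four chromosomes whose counts are not stated.
remaining = TOTAL_TCGRAS - sum(stated_counts.values())
print("Chr02 + Chr03 + Chr07 + Chr08:", remaining)
```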

Cis-acting elements regulate transcription initiation and transcription activity by binding to transcription factors. To explore the promoter function of TcGRAS genes, we extracted 2000 bp sequences upstream of the transcription start site. Then, we submitted these to the online program PlantCARE. Seven types of important cis-acting elements were obtained after sorting and screening, including light-responsive elements, abscisic acid-responsive elements, MeJA-responsive elements, defense- and stress-responsive elements, low-temperature-responsive elements, auxin-responsive elements, and gibberellin-responsive elements. The light-responsive element was found in all promoter regions of TcGRAS genes. In addition, more than half of the 46 GRAS genes had the abscisic acid-responsive element, MeJA-responsive element, and gibberellin-responsive element. Compared with the MADS-Box transcription factor family in the cacao tree, the GRAS gene family contains significantly more light-responsive elements, abscisic acid-responsive elements, and MeJA-responsive elements. The distribution of these cis-acting elements is shown in Supplementary File S3.

Gene Duplication and Syntenic Analysis of TcGRAS Genes
Genome-wide analysis of cacao tree gene duplication by MCScanX software revealed 2148 tandem duplicated genes in the cacao tree genome, while only 2 pairs of tandem duplicated genes were present among the 46 TcGRAS genes (Figure 4). The analysis showed that one pair of tandem duplicated genes (TcGRAS24 and TcGRAS25) was located on Chr04, and another pair (TcGRAS42 and TcGRAS43) on Chr09. In addition, the ratio of non-synonymous (Ka) to synonymous (Ks) substitutions (Ka/Ks) of the two pairs was calculated (Table 3). The Ka/Ks values of both pairs were more than 1, indicating that these genes were positively selected over the course of evolution, and the novel protein functions may be beneficial to the survival and reproduction of T. cacao.
MCScanX showed that there were 2767 segmental duplications in the genome of T. cacao, and only 6 pairs of segmentally duplicated genes were predicted among the 46 identified TcGRAS genes. The Advanced Circos function of the TBtools software was used to visualize the segmental duplication of GRAS genes on 10 chromosomes, as shown in Figure 4. Chr01 contained three duplicated genes; Chr02, Chr03, and Chr04 each contained two duplicated genes; and Chr05, Chr06, and Chr09 each contained only one duplicated gene. However, Chr07, Chr08, and Chr10 did not contain any segmentally duplicated genes.
The syntenic relationships of TcGRAS genes with the GRAS genes of A. thaliana, P. trichocarpa, and S. indicum were analyzed separately to find homologous gene pairs (Figure 5). A total of 36 GRAS genes of T. cacao had a syntenic relationship with the GRAS genes of A. thaliana (16), P. trichocarpa (32), and S. indicum (30). Some genes had multiple syntenic relationships with other closely related species. Therefore, a total of 25 (Supplementary File S4), 77 (Supplementary File S5), and 48 (Supplementary File S6) GRAS genes of A. thaliana, P. trichocarpa, and S. indicum, respectively, had synteny with the 36 TcGRAS genes. Furthermore, 13 TcGRAS genes had homologs in all 4 plants at the same time (Figure 6). Two TcGRAS genes had homologs in S. indicum but not in P. trichocarpa or A. thaliana. Similarly, 13 TcGRAS genes had homologs in P. trichocarpa and S. indicum but not in A. thaliana; 1 TcGRAS gene had homologs in P. trichocarpa and A. thaliana but not in S. indicum; and 2 TcGRAS genes had homologs in S. indicum and A. thaliana but not in P. trichocarpa. Five TcGRAS genes had homologs in P. trichocarpa but not in A. thaliana or S. indicum.
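The category counts reported above can be checked against the per-species totals with a few lines of Python. The sketch encodes each category by the set of partner species in which a TcGRAS gene has a syntenic homolog; the species abbreviations (At, Pt, Si) and the function name are shorthand introduced here for illustration.

```python
# Number of TcGRAS genes per combination of partner species with a
# syntenic homolog, as stated in the text.
partition = {
    frozenset({"At", "Pt", "Si"}): 13,
    frozenset({"Si"}): 2,
    frozenset({"Pt", "Si"}): 13,
    frozenset({"At", "Pt"}): 1,
    frozenset({"At", "Si"}): 2,
    frozenset({"Pt"}): 5,
}


def genes_with_partner_in(species, groups):
    """Count TcGRAS genes that have a syntenic homolog in `species`."""
    return sum(n for members, n in groups.items() if species in members)


total_syntenic = sum(partition.values())
print("TcGRAS genes with any syntenic partner:", total_syntenic)
for sp in ("At", "Pt", "Si"):
    print(sp, genes_with_partner_in(sp, partition))
```

The six categories sum to the 36 syntenic TcGRAS genes, and the per-species sums reproduce the 16, 32, and 30 genes reported for A. thaliana, P. trichocarpa, and S. indicum.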

GO Annotation of T. cacao TcGRAS Proteins
To understand TcGRAS protein function in different biological processes, we performed a GO annotation analysis of the TcGRAS genes (Figure 7), and the GO numbers are shown in Supplementary File S7. The analysis of cellular components showed that most of the TcGRAS proteins were concentrated in the nucleus. The analysis of biological processes showed that the TcGRAS genes were involved in many biological processes. A large portion of the GRAS proteins were involved in transcriptional regulation. In addition, some TcGRAS proteins were involved in the negative regulation of biological processes, for example, negative regulation of seed germination and of the gibberellic acid-mediated signaling pathway. Some TcGRAS genes also respond to abiotic stresses and regulate plant organ development. The analysis of the molecular functions of TcGRAS genes revealed that they had functions in transcription factor activity and sequence-specific DNA binding.

Discussion
In this study, we provided the first comprehensive analysis of the GRAS gene family in T. cacao. Based on the latest genome sequences and annotation files, we identified 46 GRAS genes distributed across 10 chromosomes in the T. cacao genome. These 46 TcGRAS genes were classified into 12 subgroups (HAM, LISCL, LAS, SCL4/7, SCR, DLT, SCL3, DELLA, SHR, PAT1, UN1, and UN2) according to their phylogenetic relationship with A. thaliana. We found that the GRAS gene family members were unevenly distributed among subgroups; for instance, subgroups UN1 and UN2 contained only T. cacao members, and subgroups SCR and SCL3 had more members in T. cacao than in A. thaliana. During the evolution of gene families, gene structure can change in response to environmental changes, allowing new functions to be acquired. The structural analysis of TcGRAS genes according to phylogenetic relationships showed that different subgroups had different gene structures and conserved motifs, while members of the same subgroup had similar motifs and gene structures, which suggests that members of the same subgroup may have similar functions. Since T. cacao and A. thaliana were exposed to different environments during their evolution, the number of GRAS genes in their subgroups diverged as GRAS genes differentiated. By analyzing the intron/exon structure of the TcGRAS genes, we found that the majority of these genes were free of introns, similar to the lack of introns observed in Arabidopsis and rice GRAS genes [8]. Previous studies suggested that ancestral eukaryotic genes were intron-rich and that extensive loss and insertion of introns in most genes may have occurred due to selective pressure, with gene duplication accelerating this process [48,49]. Nevertheless, some GRAS genes have evolved different intron/exon structures, indicating that they likely evolved new specialized functions to adapt to their environment.
Tandem and segmental duplications are thought to be the main mechanisms contributing to the expansions of gene families in plants [50]. Both tandemly and segmentally duplicated genes that have been retained in plant genomes play important roles in adaptive responses to environmental stimuli [51,52]. The collinearity analysis in our study showed that there were two pairs of tandem duplication and six pairs of segmental duplication events in the T. cacao GRAS gene family, and this might play an important role in the GRAS family expansion in T. cacao.
Cis-acting elements play a vital role in regulating gene expression during plant growth and development [53]. The promoter analysis showed that the light-responsive element was found in all promoter regions of TcGRAS genes. In addition, more than half of the 46 GRAS genes had the abscisic acid-responsive element, MeJA-responsive element, and gibberellin-responsive element, which provides a starting point for future functional studies of these genes.
To further analyze the function of GRAS transcription factors in T. cacao, we performed a GO annotation analysis, and the results showed that the majority of cacao GRAS proteins play an important role in many different biological processes, including responses to abiotic stresses and plant organ development.

Conclusions
In this study, we identified and systematically analyzed the GRAS gene family in T. cacao. Based on the genomic data of the cacao tree, we finally identified 46 GRAS genes using double HMM profiles. These 46 GRAS genes were distributed on 10 chromosomes and phylogenetically divided into 12 subfamilies, with highly similar gene structures and conserved motifs within the same subfamily. Cis-acting element analysis indicated that GRAS genes may be involved in various abiotic stress responses. In addition, we found that tandem and segmental duplications contribute to the expansions of the GRAS gene family. A further syntenic analysis showed that the functions of TcGRAS genes might be speculated from the function of GRAS genes in other plants. Through GO analysis, we found that most of the TcGRAS genes were involved in transcriptional regulation. In summary, the results provide information for further research of the TcGRAS genes' function and lay the foundation for further investigation.