Genome-Wide Characterization and Expression Analysis of GATA Transcription Factors in Response to Methyl Jasmonate in Salvia miltiorrhiza

Salvia miltiorrhiza is an important medicinal plant that is mainly used for the treatment of cardiovascular and cerebrovascular diseases. GATA transcription factors are evolutionarily conserved proteins that play essential roles in plant biological processes. In this study, we systematically characterized the GATA transcription factors in S. miltiorrhiza. A total of 28 SmGATA genes were identified and divided into four subfamilies based on phylogenetic analysis and domain composition. SmGATA genes clustered into the same subfamily have similar conserved motifs and exon-intron patterns, and the genes are unevenly distributed on the eight chromosomes of S. miltiorrhiza. Tissue-specific expression analysis based on transcriptome datasets showed that the majority of SmGATA genes were preferentially expressed in roots. Quantitative real-time PCR (qRT-PCR) analysis indicated that several SmGATA genes in roots were distinctly upregulated after methyl jasmonate (MeJA) treatment. SmGATA08 in particular was highly responsive to MeJA and might be involved in jasmonate signaling, thereby affecting root growth, development, tolerance to various stresses, or secondary metabolite biosynthesis. The strong MeJA response of several SmGATAs, such as SmGATA08, indicates that these genes might be important in the biosynthesis of tanshinones and phenolic acids by regulating the response to MeJA in S. miltiorrhiza. Our results lay the foundation for understanding the biological roles of SmGATAs and for quality improvement in S. miltiorrhiza.


Introduction
Plant transcription factors are usually regarded as molecular switches that regulate gene expression and play vital roles in plant growth, development, and responses to diverse stresses. GATA transcription factors exist widely in eukaryotes; all contain a conserved type IV zinc finger motif (C-X2-C-X17-20-C-X2-C) and a basic region that can bind to WGATAR (W = T or A; R = G or A) to regulate downstream genes at the transcriptional level [1]. The first plant GATA gene to be isolated, Ntl1, encodes a GATA-1 zinc finger protein that is homologous to nit-2 in Neurospora crassa, and the expression characteristics of Ntl1 are compatible with those of a regulator of nitrogen metabolism [2]. In Arabidopsis, 30 AtGATAs were identified and clustered into four subfamilies. Subfamily I has 14 members with two exons, and these AtGATAs contain a zinc finger loop of 18 amino acids. Subfamily II comprises 11 AtGATAs, which also contain a single zinc finger. Subfamily III GATA proteins have 20 amino acids in the zinc finger loop, and the CCT and TIFY motifs occur exclusively in this subfamily alongside the GATA zinc finger. ASXH motifs exist specifically in GATA subfamily IV in Arabidopsis thaliana [3][4][5]. The studies on GATA genes in this model plant laid a foundation for further studies of GATA genes in other plants.
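The two sequence patterns described above can be written as regular expressions. The following is a minimal sketch, assuming a plain string scan is acceptable for illustration; the function name and the toy sequences in the usage note are hypothetical and do not come from the S. miltiorrhiza data. Real domain annotation relies on profile-based tools rather than a regular expression.

```python
import re
from typing import List, Tuple

# Type IV zinc finger as described in the text: C-X2-C-X17-20-C-X2-C.
ZINC_FINGER = re.compile(r"C.{2}C.{17,20}C.{2}C")

# WGATAR DNA binding site: W = A or T, R = A or G.
WGATAR = re.compile(r"[AT]GATA[AG]")


def find_zinc_fingers(protein: str) -> List[Tuple[int, int]]:
    """Return (0-based start, loop length) for each zinc finger match."""
    hits = []
    for match in ZINC_FINGER.finditer(protein.upper()):
        # The match holds 4 cysteines and 4 flanking X residues;
        # the remainder is the loop between the second and third cysteine.
        loop_length = len(match.group(0)) - 8
        hits.append((match.start(), loop_length))
    return hits
```

For a toy protein such as "MK" + "CAAC" + 18 glycines + "CAAC" + "K", the function reports one finger starting at position 2 with an 18-residue loop, the loop length the text attributes to subfamily I in Arabidopsis.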

Identification and Sequence Analysis of S. miltiorrhiza GATA Genes
The Arabidopsis GATA protein sequences were first retrieved and used as queries to search for GATA family members in the S. miltiorrhiza whole-genome database. In total, 28 SmGATA members were identified and confirmed in the genome database, and they were renamed SmGATA1 to SmGATA28 for further analysis. Apart from the conserved GATA domain, the SmGATA proteins differ greatly in protein sequence, sequence length, and physicochemical properties (Supplementary Materials 2: Table S2). SmGATAs consist of 113-590 amino acids (aa), and the gene length ranges from 342 bp to 1773 bp. The relative molecular mass of the SmGATA proteins varies from 12.25 kDa to 64.98 kDa, with isoelectric points ranging from 5.31 to 10.63. Prediction of N- or O-glycosylation sites and phosphorylation sites showed that potential post-translational modifications exist in all SmGATA proteins. Furthermore, subcellular localization prediction showed that most SmGATA proteins are localized in the nucleus, whereas SmGATA14 and SmGATA21 are predicted to be in the chloroplast (Supplementary Materials 2: Table S2).

Phylogenetic and Conserved Domains Analysis
To understand the evolutionary relationships among 82 GATAs from A. thaliana, S. bowleyana and S. miltiorrhiza, an unrooted phylogenetic tree was constructed, in which the proteins clustered into four phylogenetic groups (Figure 1). Subfamily I contained 15 SmGATAs, while 7, 5, and 1 SmGATAs belonged to subfamilies II, III and IV, respectively. The GATA domains are located at positions 170-310 aa in subfamily I members, 10-70 or 100-200 aa in subfamily II, 150-280 aa in subfamily III, and 6-60 aa in subfamily IV (Supplementary Materials 2: Table S2). In addition to the GATA motif found in all SmGATA proteins, other conserved domains, such as the ASXH, CCT, and TIFY domains, were found in some SmGATA proteins. The CCT and TIFY domains exist only in subfamily III: three SmGATAs (SmGATA02, SmGATA05 and SmGATA06) have both the CCT and TIFY domains, whereas SmGATA26 and SmGATA28 have only the CCT domain.
Proteins with highly similar sequences are generally considered to be functionally similar. To date, the functions of SmGATAs have not been reported; however, phylogenetic analysis can provide evidence for filtering candidate genes, which will lay the foundation for future research on gene function. As shown in Figure 1, SmGATA03 has 99% similarity to AtGATA18, which indicates that SmGATA03 may have the same functions as AtGATA18. In addition, five pairs (SmGATA18 and AtGATA22, SmGATA14 and AtGATA12, SmGATA26 and AtGATA24, SmGATA28 and AtGATA24, SmGATA05 and AtGATA19) showed high similarity between S. miltiorrhiza and A. thaliana, which suggests that the members of each pair have similar functions.

Chromosome Localizations and Genomic Synteny of SmGATA Genes
As shown in Figure 2 and Supplementary Materials 2: Table S2, 23 of the 28 SmGATAs are unevenly distributed on the chromosomes, while the other 5 SmGATA genes are located on random fragments. Chromosome 1 contained the largest number of SmGATAs, with seven genes, followed by chromosome 3 with four. Chromosomes 5 and 7 each had three SmGATAs, chromosomes 2 and 6 each had two, and chromosomes 4 and 8 each carried a single SmGATA gene. This uneven distribution may be due to differences in the size and structure of the chromosomes.

To further confirm the phylogenetic relationships of the SmGATA genes, the synteny relationships among S. miltiorrhiza, S. bowleyana and A. thaliana were analyzed (Figure 3). The results showed that 22 SmGATAs exhibited a syntenic relationship with SbGATAs, and three SmGATAs share homology with AtGATAs. Moreover, some SmGATAs have multiple orthologous copies in S. bowleyana; for example, SmGATA10 had a syntenic relationship with both SbGATA18 and SbGATA19. Detailed synteny results are shown in Supplementary Table S3.

Gene Structure and Conserved Motif Analysis of SmGATA Genes
The gene structures of SmGATAs were analyzed with the Gene Structure Display Server (Figure 4), which revealed that gene structure differed considerably among the SmGATA genes. The number of exons in the 28 SmGATAs varied from 2 to 8, which indicates various intron-exon patterns among SmGATA genes. However, the members within each subfamily displayed similar exon/intron structures. The SmGATAs in subfamily I have two or three exons. Members of subfamily II possess three exons, while the SmGATAs in subfamily III contain seven or eight exons, except SmGATA26 and SmGATA28, which have only one exon. The secondary structures of SmGATAs were predicted using SOPMA, as shown in Supplementary Materials 3: Table S3; random coil is the main element in SmGATAs, followed by α-helix and extended strand. The proportion of random coil in SmGATAs ranges from 43.36% (SmGATA14) to 72.0% (SmGATA10), the proportion of α-helix from 15.19% (SmGATA22) to 41.73% (SmGATA16), and the proportion of extended strand from 5.33% (SmGATA23) to 15.44% (SmGATA12).
To better understand the structural divergence and predict the function of SmGATA proteins, ten conserved motifs were identified in the SmGATAs (Figure 4), and all the SmGATAs contained the typical type IV zinc finger (motif 1). On the whole, SmGATAs belonging to the same subfamily had similar motifs. Motif 1 and motif 4 were detected in all SmGATA proteins. Some motifs appeared only in specific subgroups: motifs 2, 3, 5, 6 and 10 were detected in members of subfamily I, motif 9 was present only in members of subfamily II, and motifs 7 and 8, representing the TIFY and CCT domains, existed only in members of subfamily III. Apart from motifs 1 and 4, no other motifs were found in subfamily IV (Figure 4). In brief, the similar exon-intron organization and conserved motifs within a subfamily further support the subfamily classification from the phylogenetic analysis, and so GATA proteins belonging to the same subfamily may have similar functions.


Cis-Acting Regulatory Element Analysis of SmGATA Genes
The cis-acting regulatory elements were searched to further study the molecular mechanisms underlying the responses of SmGATAs to various stresses. The types and numbers of cis-acting regulatory elements varied among SmGATA genes (Supplementary Materials 4: Table S4), and the core CAAT box and TATA box account for a relatively high proportion. There are several types of representative regulatory elements (Figure 5). The first type is the light-responsive elements, such as the G-box, I-box, GATA-motif, MRE, Sp1, LAMP-element, and GA-motif cis-elements; they exist in 26 SmGATA promoters, with between 4 and 30 elements per promoter, suggesting an important role of SmGATAs in plant growth and development. The second type is the hormone-responsive elements, which respond to plant hormones such as auxin (TGA-element, TGA-box, AuxRR-core), MeJA (CGTCA-motif), salicylic acid (TCA-element), abscisic acid (ABRE), and gibberellins (GARE-motif). The third type is the stress-responsive elements that respond to diverse abiotic stresses, including ARE, DRE, MBS, and the WUN-motif, which are related to defense and to stresses such as dehydration, cold, drought, and salt. In addition, 14 SmGATA genes contain MBS, and 2 SmGATA genes contain DRE cis-acting regulatory elements.
There are also several other types of representative regulatory elements, for example, plant growth and development-related elements involved in seed-, endosperm-, and root-specific regulation, circadian control, and meristem expression (Figure 5), such as the RY-element, GCN4_motif, motif I, and CAT-box. Moreover, metabolism-related elements, the MYB binding site elements (MBSI), were found in the promoters of three SmGATA genes (SmGATA7/11/14); these elements are related to the regulation of the synthesis of some secondary metabolites. The O2-site, HD-Zip 3 and Box III elements also exist in some SmGATA promoters.


Expression Profiles of SmGATA Genes Based on Transcriptome Datasets
Based on the RNA-seq database [22], the expression patterns of SmGATAs were studied for clues to their possible functions. First, the expression levels of the 28 SmGATAs in stem, leaf, and root tissues were investigated and clustered hierarchically in a heat map (Figure 6A and Supplementary Materials 5: Table S5). In total, 20 SmGATAs were expressed in all organs, 14 of them with FPKM > 1 and 6 with FPKM < 1, and 2 SmGATAs showed no expression. Moreover, most of the SmGATAs showed higher expression in root than in stem and leaf. In the roots, 14 SmGATAs had FPKM > 1; among them, SmGATA07, SmGATA14 and SmGATA17 had the highest expression. In the stem, 6 SmGATAs had FPKM > 1, while no SmGATA gene had FPKM > 1 in the leaf. The high expression levels of SmGATA07, SmGATA14 and SmGATA17 in the three tissues suggest that these SmGATAs might have an important function in plant growth and development.
Under MeJA treatment (Figure 6B and Supplementary Materials 6: Table S6), nine SmGATAs had FPKM < 1 or no expression in both mock-treated and MeJA-treated S. miltiorrhiza leaves. Only one gene showed significantly increased expression at 12 h compared with mock-treated leaves, which indicated a rapid (emergency) response to MeJA, and nine showed no obvious change. In contrast, ten SmGATAs were significantly downregulated under MeJA treatment, which suggests negative regulation in the leaf.
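The FPKM thresholds used above can be made concrete with the standard definition of FPKM (fragments per kilobase of transcript per million mapped fragments). The following is a minimal sketch of that definition; the function names and the example counts in the usage note are hypothetical and are not taken from the RNA-seq dataset [22].

```python
def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def expression_class(value: float, threshold: float = 1.0) -> str:
    """Bin a gene the way the text does: no expression, below, or above FPKM 1."""
    if value == 0:
        return "no expression"
    return "FPKM > 1" if value > threshold else "FPKM <= 1"
```

For instance, 500 fragments on a 1000 bp gene in a library of 20 million mapped fragments gives an FPKM of 25, which falls in the "FPKM > 1" class.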

Expression Analysis of SmGATAs under MeJA Treatment
Previous studies showed that MeJA can induce genes in the secondary metabolic pathways of S. miltiorrhiza and thereby increase the content of secondary metabolites [22][23][24][25][26]. To obtain information on SmGATA expression after MeJA treatment and to probe the function of SmGATA genes, the expression levels of 25 SmGATAs in S. miltiorrhiza roots were analysed at 0, 12, 24, 48, and 72 h after MeJA treatment. As shown in Figure 7, the expression of five SmGATAs increased significantly, by more than 3-fold, at 12 h post-MeJA treatment; in particular, SmGATA08 responded strongly to MeJA, with expression 70 times higher than the control. At 24 h post-MeJA treatment, five SmGATAs (SmGATA08, SmGATA09, SmGATA11, SmGATA13, and SmGATA18) showed higher expression than at 12 h; the expression level of SmGATA08 continued to increase by nearly 60-fold and remained high at 72 h post-MeJA treatment. SmGATA09 and SmGATA13 also maintained high expression levels. SmGATA08 reached its highest level at 24 h, while the highest expression of SmGATA09 and SmGATA13 occurred at 72 h, at more than 30 times the control level. Together, these results suggest that SmGATA08, SmGATA09, and SmGATA13 can be induced by MeJA.
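Fold changes such as "70 times higher than the control" are typically derived from qRT-PCR threshold cycle (Ct) values. This section does not state which quantification method was used, so the sketch below assumes the widely used 2^(-ΔΔCt) method with a single reference gene; the function name and the Ct values in the usage note are hypothetical.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^(-ddCt) method (assumed, not from the source)."""
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control (for example, 0 h).
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)
```

With a reference gene at Ct 18 in both samples, a target gene that moves from Ct 26 in the control to Ct 20 after treatment corresponds to a 64-fold increase, because each cycle represents a doubling under the method's assumption of perfect amplification efficiency.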
Genes 2022, 13, x FOR PEER REVIEW

Discussion
GATA transcription factors are evolutionarily conserved proteins that participate in regulating various biological processes, including responses to environmental stimuli and hormones, as well as nitrogen metabolism [6,7,16,27-30]. Although GATA genes have been studied in some plants, such as A. thaliana [3,27], Oryza sativa [3], Malus × domestica Borkh [31], grape [32], Glycine max [8], and Brassica napus [33], the GATA gene family in S. miltiorrhiza had not been reported until now. We identified 28 GATAs in S. miltiorrhiza, a number similar to that in A. thaliana (30 GATA genes) [27]. The 28 SmGATAs were classified into four subfamilies, named subfamily I to IV; subfamily I had the most SmGATA genes, which is consistent with A. thaliana and indicates that the classification of GATAs appears to be conserved across different plants, such as A. thaliana and O. sativa [3,27]. In addition to the GATA domain found in all SmGATA proteins, it is worth noting that the SmGATAs in subfamily III possess two additional well-known domains, namely the CCT and TIFY domains. These conserved domains might therefore be involved in different functions of SmGATAs. CCT domains are responsible for protein-protein interactions and photoperiodic signaling, which are essential for plant photosynthesis, nutrient utilization, and environmental stress responses [34]. For example, ZML1 (AtGATA24) and ZML2 (AtGATA28), two subfamily III proteins, can specifically bind to the photoreceptor cryptochrome 1 (CryR1) cis-element and have been confirmed as vital components of the cry1-mediated photoprotective response in A. thaliana [35]. In this study, SmGATA26 and SmGATA28 also have the CCT domain and share high similarity with ZML1; the two genes were not expressed in the S. miltiorrhiza tissues examined and were not induced by MeJA. TIFY domain-containing proteins have been reported to be involved in jasmonic acid-related stress responses and developmental processes [36]. SmGATA02, SmGATA05, and SmGATA06 all contain TIFY and CCT domains and thus may be related to plant growth and development and to tolerance or sensitivity to stresses. Subfamily IV has only one GATA member, SmGATA03, which also contains the ASXH domain. To date, research on the ASXH domain has mainly focused on animals [37][38][39].
Different conserved domains in different subfamilies may lead to different functions of SmGATAs. Exon gain and loss occur widely in many gene families during evolution. Moreover, the number of exons or introns was not uniform: in SmGATA subfamily I, SmGATA21 and SmGATA11 contain three exons while the other members possess two, which differs from the group A GATA genes in A. thaliana [3]. These observations indicate that the structures and functions of SmGATAs have undergone modest differentiation during evolution. In brief, the exon-intron organization and the distribution of conserved motifs were similar within the same subfamily, which is consistent with the SmGATA family classification based on the phylogenetic analysis, and so GATA proteins belonging to the same subfamily likely have similar functions.
Studies have explored the functions of GATAs in hormonal signaling, especially gibberellin, IAA and brassinosteroid signaling, but publications on this subject remain limited [6,10,12,13,40,41]. Many hormone-responsive cis-elements, such as MeJA-responsive cis-elements, were identified in the promoters of SmGATA genes, which points to roles in regulating biological processes in S. miltiorrhiza. Previous studies showed that MeJA can induce the biosynthesis genes of tanshinones and phenolic acids in S. miltiorrhiza and thereby increase the content of these components [22,24,26]. Therefore, we analyzed the expression profiles of SmGATAs in different tissues and under MeJA treatment. The results revealed that most of the SmGATA genes, such as SmGATA14 and SmGATA17, have their highest expression level in roots and lower expression in leaves or stems, which suggests that they probably take part in root development or tolerance to various stresses. Under MeJA treatment, only one gene in leaves showed significantly increased expression, eight showed no expression, nine showed no obvious change, and ten SmGATAs were significantly downregulated; this suggests negative regulation in the leaf and indicates that the biological processes involving SmGATAs occur mainly in roots. The expression patterns of SmGATAs in response to MeJA in S. miltiorrhiza roots were then analyzed by qRT-PCR. SmGATA08, SmGATA09, SmGATA12, SmGATA13, SmGATA14, and SmGATA18 had higher expression and were significantly upregulated at 12 h post-MeJA treatment, especially SmGATA08, whose expression was 70 times higher than the control. These results strongly suggest that these genes can be regulated by the jasmonate signal and might be closely related to plant growth and development. SmGATA08 and GNC (AtGATA21) have a close phylogenetic relationship. GNC has been reported to play important roles in regulating plant biological processes, such as modulation of chlorophyll biosynthesis (greening) and glutamate synthase (GLU1/Fd-GOGAT) expression, regulation of photosynthetic activities, control of the convergence of auxin and gibberellin signaling, cytokinin-regulated development, and carbon and nitrogen metabolism [6,7,27-30]. SmGATA08 contains a larger number of MeJA-responsive cis-elements, strongly suggesting that SmGATA08 may be involved in the jasmonate signal, thereby affecting plant growth or secondary metabolites. Many studies have reported that exogenous application of MeJA can regulate the accumulation of two classes of active components in S. miltiorrhiza by simultaneously inducing the expression of related genes in the biosynthetic pathways of salvianolic acid and tanshinone [23,25]. Despite these insights, how the SmGATAs regulate tanshinone and phenolic acid biosynthesis, root growth and development in S. miltiorrhiza is still unknown. The functions of these GATA family members in S. miltiorrhiza need to be confirmed through further experiments.

Conclusions
The GATA gene family plays a significant role in the regulation of biological processes in plants. Here, 28 SmGATA genes were identified in the genome of S. miltiorrhiza. Based on the evolutionary analysis, the 28 SmGATAs, which are unevenly distributed on eight chromosomes, were classified into four subfamilies. SmGATAs clustered into the same subfamily have similar conserved motifs and exon-intron patterns. Light-responsive and hormone-responsive elements account for the highest proportion of cis-elements in SmGATA promoters. Tissue-specific expression analysis based on RNA-seq showed that most SmGATAs were preferentially expressed in roots. Gene expression analysis revealed that several SmGATA genes in roots were distinctly upregulated after MeJA treatment. In particular, SmGATA08 was highly responsive to MeJA and might be involved in the jasmonate signal, thereby affecting root growth, development, tolerance to various stresses, or secondary metabolite biosynthesis. Our results provide more information about SmGATA genes and lay the foundation for understanding their biological roles and for quality improvement in S. miltiorrhiza.

Identification and Sequence Analysis of SmGATA Genes
The Arabidopsis GATA protein sequences were download from The Arabidopsis Information Resource (TARI). The genome sequences of S. miltiorrhiza and S. bowleyana were downloaded from the Genome Warehouse (GWH) [42].
For identifying all members of the GATA family in S. miltiorrhiza, the hidden Markov model profile of the GATA domain (PF00320) was downloaded from the Pfam database (http://pfam.xfam.org/family/PF00320/hmm) (accessed on 2 March 2021). The model was used for searching for S. miltiorrhiza GATA genes with HMMER 3.0 [43], then the basic local alignment search tool (BLAST) was used to further validate the GATA family members, the online NCBI tool CD-Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (accessed on 2 March 2021) was used for confirming the conserved domain of candidate SmGATAs, and the proteins containing the GATA domain were regarded as GATA family members of S. miltiorrhiza [44].
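The HMM search step can be followed by a simple filter on the search output. The sketch below assumes the search was run with the `--tblout` option of `hmmsearch` (for example, `hmmsearch --tblout hits.tbl PF00320.hmm proteins.fasta`, where the file names are hypothetical) and that candidates are kept below an E-value cutoff. The cutoff of 1e-5 is an illustrative assumption; the source does not state the threshold that was used.

```python
from typing import Dict, Iterable


def parse_tblout(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, float]:
    """Collect target names and full-sequence E-values from hmmsearch --tblout lines."""
    hits: Dict[str, float] = {}
    for line in lines:
        # Comment lines start with '#'; blank lines carry no hit.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Column 1 is the target name, column 5 the full-sequence E-value.
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            # Keep the best (lowest) E-value if a target appears more than once.
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits
```

The surviving identifiers would then go on to the BLAST and CD-Search confirmation steps described above; this filter only narrows the candidate list and does not replace those checks.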
The physicochemical properties of the SmGATA TFs were obtained from the ExPASy proteomics server database [45]. Potential glycosylation and phosphorylation sites were predicted with the online NetNGlyc 1.0, YinOYang 1.2, and NetPhos 3.1 servers [46]. The subcellular localization of the SmGATA proteins was predicted with the WoLF PSORT and Euk-mPLoc 2.0 servers [47].

Phylogenetic Analysis of GATA Genes
The molecular dendrogram of GATA proteins from three plants (A. thaliana, Salvia bowleyana, and S. miltiorrhiza) was constructed with MEGA 7.0.25 using the neighbor-joining method with the following parameters: Poisson model, pairwise deletion, and 1000 bootstrap replications. Based on their clustering with the AtGATAs, the SmGATAs were divided into different subfamilies [48].
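To make the "Poisson model" and "pairwise deletion" settings concrete, the sketch below computes the Poisson-corrected distance d = -ln(1 - p) between two aligned protein sequences, where p is the proportion of differing sites and sites with a gap in either sequence are skipped for that pair only. It illustrates the distance definition under stated assumptions (gap symbols "-" and "?", a hypothetical function name) and is not MEGA's implementation.

```python
import math

# Symbols treated as gap or missing data (assumption for this sketch).
GAP_CHARS = {"-", "?"}


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance between two aligned protein sequences.

    Pairwise deletion: a column is ignored only if one of these two
    sequences has a gap there, regardless of other sequences in the alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip this site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = differences / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy aligned pair: 4 comparable sites, 1 difference, so p = 0.25.
d = poisson_distance("ACDE-", "ACDFG")
```

A matrix of such pairwise distances is the input that a neighbor-joining algorithm then turns into a tree.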

Gene Structure and Conserved Motifs Analysis
The gene structure of the SmGATAs was analyzed using the Gene Structure Display Server (GSDS v.2.0) [49]. The conserved motifs of the SmGATAs were analyzed with Multiple Em for Motif Elicitation (MEME). Finally, the Secondary Structure Prediction Method (SOPMA11) was used to predict the secondary structure of the SmGATA proteins [50].

Analysis of Cis-Acting Elements in the Promoters of SmGATA Genes
The 2 kb DNA sequences upstream of the start codon of each SmGATA family member were retrieved from the S. miltiorrhiza genome database, and the cis-acting elements present in the promoters of SmGATAs were predicted with Plant cis-Acting Regulatory Elements (PlantCARE). The predicted elements were grouped into several types, including light-, hormone-, plant growth and development-, stress-, and metabolism-responsive elements.
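Extracting the region upstream of a start codon requires attention to strand: for a gene on the minus strand, "upstream" lies to the right of the coding sequence in forward coordinates and must be reverse-complemented. The following is a minimal sketch of that step, assuming 1-based inclusive CDS coordinates as in GFF annotations; the function names and toy sequences are hypothetical, and the study does not state which tool was used for this extraction.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return dna.translate(table)[::-1]


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=2000):
    """Return up to `length` bp upstream of the start codon, 5'->3' on the gene strand.

    cds_start and cds_end are 1-based inclusive forward-strand coordinates
    of the coding sequence (assumption: GFF-style annotation).
    The region is truncated if the gene lies near a chromosome end.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[cds_end:end])
    raise ValueError("strand must be '+' or '-'")


# Toy examples with a 4 bp and a 3 bp window instead of 2 kb.
plus_promoter = upstream_region("AAAACCCCATGGGG", 9, 14, "+", length=4)
minus_promoter = upstream_region("GGGCATTGAC", 1, 6, "-", length=3)
```

The returned sequences would then be submitted to a cis-element prediction service such as PlantCARE.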

Chromosomal Localization and Collinearity Analysis
The chromosomal locations of the SmGATA genes were determined from the genomic sequences of S. miltiorrhiza and mapped with MapChart 2.32 [51]. The homology between GATA genes in S. miltiorrhiza and other species was analyzed with the Multiple Collinearity Scan toolkit X (MCScanX) software [52].

Expression Profiles of SmGATAs in S. miltiorrhiza
Expression profiles of SmGATA genes in leaves under MeJA treatment and in different organs (roots, stems, and leaves) were analyzed based on the transcriptome datasets of S. miltiorrhiza obtained from the NCBI (ID: PRJNA214019) [22]. The fragments per kilobase of transcript per million fragments mapped (FPKM) values were standardized to quantify the expression level of each SmGATA gene, and the changes in gene expression are shown in heat maps drawn with the Multiple Experiment Viewer (4.9.0).
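For reference, FPKM normalizes a fragment count by transcript length (in kilobases) and by sequencing depth (in millions of mapped fragments). The sketch below shows that calculation together with a per-gene z-score, which is one common way to standardize rows before drawing a heat map. The text does not specify which standardization was applied, so the z-score step is an assumption, and all function names and numbers are illustrative.

```python
import math


def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def row_zscores(values):
    """Standardize one gene's expression values across samples (z-score).

    Assumption: row-wise z-scoring; the standardization used in the
    study is not specified in the text.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to standardize")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = math.sqrt(variance)
    if sd == 0:
        return [0.0] * n  # a flat profile carries no relative signal
    return [(v - mean) / sd for v in values]


# Illustrative numbers: 500 fragments on a 1000 bp transcript, 20 million mapped fragments.
example_fpkm = fpkm(500, 1000, 20_000_000)
example_row = row_zscores([1.0, 2.0, 3.0])  # e.g. leaf, stem, root
```

Row-wise standardization emphasizes in which organ a gene is relatively highest, at the cost of hiding absolute differences between genes.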

Plant Materials and MeJA Treatments
Two-year-old S. miltiorrhiza seedlings were grown in a greenhouse at 22-25 °C at the Medicinal Herb Garden of Shenyang Agricultural University (42°1′ N, 124°41′ E), Shenyang, China. The leaves were sprayed with 50 µM MeJA, and the control group was sprayed with the same amount of water. Root samples were collected at 12, 24, 48, and 72 h after MeJA or water treatment. Each treatment consisted of three biological replicates for each time point.

Quantitative Real-Time PCR (qRT-PCR) Analysis
The expression levels of SmGATA genes in roots after MeJA treatment were determined by qRT-PCR. Every sample had three biological replicates. Total RNA was extracted with the EasyPure Plant RNA Kit (TransGen Biotech Co., Beijing, China), and cDNA was synthesized from 2 µg of total RNA with the FastKing RT Kit (with gDNase), according to the manufacturers' instructions. The primers are shown in Supplementary Materials 1: Table S1. β-Actin served as an internal control. The relative expression levels of SmGATAs were computed with the 2^−ΔΔCt method [53]. The reaction mixture consisted of 2.6 µL ddH2O, 0.2 µL primers (10 µmol·L⁻¹), 2 µL cDNA, and 5 µL 2× SYBR Green qPCR Master Mix