Comprehensive Comparative Analysis of the GATA Transcription Factors in Four Rosaceae Species and Phytohormonal Response in Chinese Pear (Pyrus bretschneideri) Fruit

The GATA gene family is one of the most important families of transcription factors (TFs). It is widespread in plants, contributes to diverse biological processes such as development, and responds to environmental stress. Although the GATA gene family has been comprehensively and systematically studied in many species, less is known about GATA genes in Chinese pear (Pyrus bretschneideri). In the current study, the GATA gene family was identified in four Rosaceae genomes, its structural characteristics were determined, and a comparative analysis of its properties was carried out. Ninety-two GATA proteins were identified in the four Rosaceae genomes (Pyrus bretschneideri, Prunus avium, Prunus mume, and Prunus persica) and categorized into four subfamilies (Ⅰ–Ⅳ) according to phylogeny. The majority of GATA genes contained one to two introns, and conserved motif composition analysis revealed their functional divergence. Whole-genome duplication (WGD) and dispersed duplication (DSD) played a key role in the expansion of the GATA gene family. Collinearity analysis indicated that, among P. bretschneideri, P. avium, P. mume, and P. persica, GATA duplicated regions were most conserved between Pyrus bretschneideri and Prunus persica, with 32 orthologous gene pairs. The physicochemical parameters, duplication patterns, non-synonymous (ka) and synonymous (ks) substitution rates, and GO annotation were analyzed using different bioinformatics tools. cis-Elements responsive to various phytohormones, abiotic/biotic stress, and light were found in the promoter regions of GATA genes, which may be induced by these stimuli. Furthermore, subcellular localization of the PbGATA22 gene product was investigated, showing that it was present in the nucleus of tobacco (Nicotiana tabacum) epidermal cells. Finally, in silico expression analysis was performed on various organs (bud, leaf, stem, ovary, petal, and sepal) and different developmental stages of fruit. Subsequently, PbGATA genes were found to be extensively expressed under exogenous hormonal treatments with SA (salicylic acid), MeJA (methyl jasmonate), and ABA (abscisic acid), indicating that they play an important role in hormone signaling pathways. A comprehensive analysis of GATA transcription factors was performed through systematic biological approaches and comparative genomics to establish a theoretical basis for further structural and functional investigations in Rosaceae species.


Introduction
Transcription factors (TFs) are major regulators of gene expression that bind to gene promoters and thereby control various biological processes. TFs can be classified into categories according to their capacity to bind cis-acting elements in promoter regions [1]. Based on their DNA-binding motifs, these transcription factors have been named zinc-finger, MYB, WRKY, PHD (plant homeodomain), AP2/EREBP (Apetala2/ethylene-responsive element-binding protein), MADS, bZIP (basic leucine zipper), and NAC (NAM, ATAF1/2, and CUC1/2) [2][3][4]. The zinc-finger transcription factors are divided into different families according to their conserved domain structure [5]. GATA transcription factors are distinguished by their ability to bind WGATAR (W = T/A, R = G/A) sequences in promoter regions [6]. Their type IV zinc-finger motif, with the consensus sequence CX2CX17–20CX2C, is accompanied by a specific region that facilitates DNA binding. GATA zinc fingers with 17-18 residues in the binding loop are present in both animal and fungal GATA TFs, while 17-20 residues are present in plant GATA TFs [7]. The function of GATA transcription factors in animals and fungi has been extensively investigated [8][9][10], while in plants, GATA transcription factors are directly implicated in the regulation of stress signaling and metabolic pathways in addition to their active role in cell differentiation [11,12]. In terms of the amino acid residues in the zinc-finger loop, fungal GATA transcription factors combine features of plant and animal GATA transcription factors. GATA transcription factors execute several functions in fungal cells, including siderophore formation, circadian regulation, and nitrogen metabolism [13,14]. By comparison, GATA transcription factors have still not been thoroughly investigated in plants.
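The zinc-finger consensus and the WGATAR binding site described above lend themselves to simple pattern matching. The following is a minimal sketch in Python, assuming the consensus is read as C-X2-C-X(17-20)-C-X2-C, in line with the 17-20 loop residues mentioned above. The function names and toy sequences are hypothetical illustrations and are not part of the identification pipeline used in this study.

```python
import re

# Type IV zinc finger read as C-X2-C-X(17-20)-C-X2-C (an assumption about
# how the consensus in the text should be interpreted).
ZINC_FINGER = re.compile(r"C.{2}C.{17,20}C.{2}C")
# WGATAR with W = A/T and R = A/G; the lookahead also reports overlapping sites.
WGATAR = re.compile(r"(?=([AT]GATA[AG]))")


def find_zinc_fingers(protein):
    """Return (start, loop_length) for each non-overlapping consensus match."""
    hits = []
    for match in ZINC_FINGER.finditer(protein.upper()):
        # Subtract the 4 cysteines and the two 2-residue spacers.
        loop_length = len(match.group()) - 8
        hits.append((match.start(), loop_length))
    return hits


def find_wgatar_sites(dna):
    """Return 0-based start positions of WGATAR sites on the given strand."""
    return [match.start() for match in WGATAR.finditer(dna.upper())]


# Toy inputs (hypothetical, for illustration only).
toy_protein = "MK" + "CAAC" + "G" * 18 + "CAAC" + "KR"
toy_promoter = "CCTGATAGCCAGATAAGG"
print(find_zinc_fingers(toy_protein))
print(find_wgatar_sites(toy_promoter))
```

A pattern match of this kind is only a coarse filter; domain databases such as Pfam and SMART use profile-based models, so the sketch should not be taken as a substitute for them.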
NTL1, a homolog of NIT-2, was the first plant GATA transcription factor to be discovered; it was identified in tobacco and plays a key role in nitrogen metabolism [15]. In plants, GATA TFs are associated with the regulation of various stress-sensitive genes, light response, floral development-related genes, and hormonal signaling, including cytokinin, auxin, and gibberellin signaling [7,16,17]. In addition, the active function of GATA TFs in preventing photooxidative damage through tetrapyrrole biosynthesis (TPB) has been well documented in Arabidopsis [18]. The class B GATA TF GNL (GNC-LIKE) was found to function downstream of type-B ARRs in Arabidopsis and thus acts at the junction of cytokinin and auxin signaling [19]. GATA2 regulates photomorphogenesis and plays a key role in light signaling, while CGA1 and GNC are the main transcriptional regulators involved in chloroplast biogenesis in Arabidopsis [17,20]. In addition, hormonal signals such as auxin and gibberellin act through GATA factors and control the downregulation of the target genes GNL and GNC during plant growth and development. These stimuli also alter brassinosteroid levels and facilitate the regulation of GATA2, a GATA transcription factor in Arabidopsis [20,21]. Studies have investigated the role of GATAs in stress response and hormonal signaling, especially gibberellin, auxin, and jasmonic acid signaling, but publications on this subject remain limited.
Under changing environmental conditions, abiotic stress is a major factor that affects the final yield of fruit crops by altering fruit growth and development [22,23]. In temperate climates, the pear (Pyrus bretschneideri) is cultivated to produce fruit with high commercial value [24,25]. For sustainable agriculture, investigating the mechanisms of pear fruit growth and development, as well as abiotic stress tolerance, is critical. Although the importance of GATA TFs in model plants such as Arabidopsis and rice has been studied, the pear GATA family and its response to abiotic stress and hormonal signaling remain uncharacterized. This study was designed to characterize the GATA family with respect to abiotic stress and hormonal signaling.

Identification and Classification of GATA Gene Family in Four Rosaceae Species
To identify the GATA gene family, we searched the genome data of the four species using BLASTP and Pfam HMM searches [26,27]. All potential GATA genes were confirmed using the plant transcription factor database, Pfam (http://pfam.xfam.org/, accessed on 5 February 2021) [28,29], and SMART (http://smart.embl-heidelberg.de/, accessed on 5 February 2021) to ensure that they included the GATA domain [30]. The protein and coding sequences were obtained from the Chinese pear genome project database and, for the other three species (peach, sweet cherry, and Japanese apricot), from the Genome Database for Rosaceae (GDR). A total of 92 GATA members were identified in the four Rosaceae species: 32 from Pyrus bretschneideri (PbGATA1-PbGATA32), 18 from Prunus avium (PaGATA1-PaGATA18), 20 from Prunus mume (PmGATA1-PmGATA20), and 22 from Prunus persica (PpGATA1-PpGATA22).
Moreover, the evolutionary relationships among the four Rosaceae species (Pyrus bretschneideri, Prunus avium, Prunus mume, and Prunus persica) were investigated. We retrieved 30 GATA protein sequences from the Arabidopsis genome [31] to further explore the potential evolutionary history of GATA genes. All GATA sequences were aligned with ClustalX for the construction of a phylogenetic tree (Figure 1). The maximum likelihood (ML) method was used with 1000 bootstrap replicates and otherwise default parameters. The 30 Arabidopsis thaliana GATA genes, along with 32 PbGATA, 20 PmGATA, 18 PaGATA, and 22 PpGATA genes, were classified into four subfamilies (I to IV). According to the phylogeny, subfamily I contained 9 genes, while subfamilies II, III, and IV comprised 24, 41, and 48 genes, respectively. These findings indicate evolutionary conservation and stronger homology among GATA genes within the same subfamily. The length of GATA proteins varied from 119 (PmGATA13) to 548 (PaGATA3) amino acids, with an average of 300.2 amino acids in the four Rosaceae species, as shown in Table S1. The predicted molecular weights of the GATA proteins ranged from 12,985.2 to 60,234.7 Da, with an average of 32,502.5 Da, and the predicted isoelectric points (pI) ranged from 4.7 to 10.0 (with an average of 10.01) (Table S1). At least one GATA domain was identified in all members from the four Rosaceae species (Figure 1). These findings show that the 92 predicted proteins have varying pI and molecular weights owing to their different protein lengths, which suggests that the GATA proteins might have undergone functional divergence.
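The relation between protein length and molecular weight reported above can be illustrated with a short calculation. The sketch below is a minimal example in Python, assuming unmodified peptides and standard average residue masses; the function names are hypothetical, and the tools actually used to produce Table S1 are not specified here.

```python
# Average residue masses in daltons (standard values; a residue mass is the
# free amino acid mass minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight_da(protein):
    """Approximate average molecular weight of an unmodified peptide, in Da."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER


def length_summary(proteins):
    """Return (min, max, mean) sequence length for a dict of name -> sequence."""
    lengths = [len(seq) for seq in proteins.values()]
    return min(lengths), max(lengths), sum(lengths) / len(lengths)


# Toy sequences (hypothetical); real input would be the predicted GATA proteins.
toy = {"toyA": "MGSCQHCGTT", "toyB": "MKRSLCVNCGA"}
for name, seq in toy.items():
    print(name, len(seq), round(molecular_weight_da(seq), 1))
print(length_summary(toy))
```

With an average residue mass of roughly 110 Da, a protein of 119 residues comes out on the order of 13,000 Da (about 13 kDa), which is a quick sanity check on the units of any reported molecular weight.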

Gene Structure and Conserved Motifs of GATA Gene Family
The conserved motifs of the PbGATA family were analyzed to validate the evolutionary relationships and classification. The MEME program [32] was used to predict the conserved motifs in the PbGATA proteins and to learn more about their sequence features (Figure 2). Motifs 1 and 9 were identified across all GATA proteins. The majority of PbGATA members in the same group shared the same motif pattern. The largest number of conserved motifs (9) was observed in subfamily IV, while subfamily I contained the smallest number (3). Some proteins, on the other hand, contained distinct conserved motifs that differed among subfamilies. For example, motifs 1 and 6 were found in all four subfamilies (I, II, III, and IV). The presence of the same motif in all subfamilies suggests that it might be necessary for some basic functions.
Genetic variation is a vital component in the evolution of gene families. To further describe and understand the structural diversity of the GATA genes, we conducted an exon-intron analysis. The number of introns/exons varied from 1-9/1-10 in subfamily II, where PbGATA32 had the most (9/10), while in subfamily III, PbGATA15 and PbGATA27 had the fewest introns and exons (1/2) (Figure 2 and Table S1). The number of introns ranged from 0 to 9, indicating a wide range of variability. Furthermore, the PbGATA genes classified in the same subfamily had highly similar exon-intron structures, supporting their close evolutionary relationship and group classification. Gene length also differed among subfamilies. Together, these structural differences suggest functional variation among GATA genes in Chinese pear.

GO Annotation Analysis
The subcellular location and functions of the putative GATA proteins in pear were predicted using gene ontology (GO) annotation analysis. The 32 GATA proteins were classified into 27 functional groups based on amino acid similarity and divided into four categories, namely molecular function, cellular component, biological process, and subcellular localization (Table S3). For subcellular localization, 97% of the annotated GATA proteins were predicted to function in the nucleus, followed by 3% in the extracellular space. In the molecular function category, GATA proteins were most frequently annotated with ion binding, DNA binding, and nucleic acid binding transcription factor activity (25.42% each), followed by signal transducer activity (21.25%), protein binding transcription factor activity (1.25%), and oxidoreductase activity (1.25%). In the cellular component category, PbGATA proteins were annotated with intracellular, cell, nucleus, and organelle (23.13% each), followed by cytoplasm (7.50%). Moreover, the predicted GATA proteins were annotated with signal transduction, cellular nitrogen compound metabolic process, and biosynthetic process at the same percentage (16.18%), while response to stress, reproduction, and cell differentiation contributed 11.36%, 12.53%, and 3.02%, respectively (Figure 3).

Figure 2.
Phylogenetic tree, conserved motif, and gene structure analysis of GATA genes. Phylogenetic relationship (left); analysis of 20 conserved motifs was performed using MEME (motif-based sequence analysis tool), and each motif is represented by a colored box (middle). The intron-exon organization of GATA genes is shown (right); yellow boxes represent exons, thin black lines represent introns, and green boxes indicate the untranslated regions (UTRs).

Chromosomal Distribution and Gene Duplication Events and Ka/Ks Analysis
We constructed a map of chromosomal locations based on the genomic data of Chinese pear, sweet cherry, peach, and Japanese apricot. In Chinese pear, 78.1% of the genes were distributed on chromosomes, while 21.9% were located on unassembled scaffolds. The largest number of PbGATA genes (7) was identified on chr15, while chr6, chr7, chr11, chr13, and chr17 each contained at least one gene. In sweet cherry and Japanese apricot, all genes were located on chromosomes and arranged in clusters. Meanwhile, 90% of PmGATA genes were located on chromosomes, and the remaining 10% were scattered on unassembled scaffolds. The largest numbers of genes were found on chr4 and chr6 for PmGATA (4 genes), on chr7 for PaGATA (4 genes), and on chr1 for PpGATA (7 genes) (Table S5).
To further understand the modes of gene duplication in Chinese pear, sweet cherry, Japanese apricot, and peach, we investigated five modes of gene duplication in all GATA gene families, namely proximal duplication (PD), tandem duplication (TD), dispersed duplication (DSD), transposed duplication (TRD), and whole-genome duplication (WGD), in order to obtain more information on the duplication patterns and evolutionary relationships among these genes (Figure 4).
WGD and TRD events were found in all four species, while DSD events were detected only in pear, sweet cherry, and Japanese apricot. TD was observed only in peach, Japanese apricot, and sweet cherry. In total, 81 duplicated gene pairs were detected in the four Rosaceae genomes, with the largest number of duplicated gene pairs derived from DSD (26 of 81 pairs), followed by WGD (22 of 81 pairs), indicating that the expansion of the GATA genes was linked to DSD and WGD. In P. bretschneideri, 32% and 27% of GATA genes were attributed to DSD and WGD, respectively (Figure 5).
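The shares of duplicated gene pairs quoted above follow directly from the pair counts. As a minimal sketch in Python, using only the two counts stated in the text (the counts for PD, TD, and TRD are not given here and are therefore omitted; the function name is hypothetical):

```python
def share_by_mode(pair_counts, total_pairs):
    """Percentage of duplicated pairs per mode, rounded to whole percent."""
    return {
        mode: round(100 * count / total_pairs)
        for mode, count in pair_counts.items()
    }


# Counts stated in the text: 26 DSD pairs and 22 WGD pairs out of 81 in total.
reported = {"DSD": 26, "WGD": 22}
print(share_by_mode(reported, 81))
```

Rounded to whole percent, 26 of 81 pairs corresponds to 32% and 22 of 81 pairs to 27%.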

These data validate the close evolutionary history of these four species. Moreover, these duplication events were also responsible for the expansion of the GATA family in the four Rosaceae species. The nonsynonymous (ka) and synonymous (ks) substitution values are used to estimate evolutionary history and the selection pressure on genes [33]. For these GATA genes, we computed the ratio of nonsynonymous to synonymous substitutions (ka/ks) in the four Rosaceae species. Positive selection is indicated by ka/ks > 1, whereas negative (purifying) selection with functional constraints is indicated by ka/ks < 1. The mean ka/ks values of whole-genome duplication (WGD) events in P. persica, P. bretschneideri, P. avium, and P. mume were 0.40, 0.37, 0.38, and 0.37, respectively (Table S5). The ka/ks ratios of duplicated gene pairs in P. avium, P. mume, P. bretschneideri, and P. persica were <1, suggesting that GATA genes have undergone strong purifying selection. However, some duplicated gene pairs in strawberry, Japanese apricot, Chinese pear, peach, and sweet cherry had higher ka/ks ratios, demonstrating that the GATA family expansion has a complex evolutionary history (Figure 6 and Table S5).
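The interpretation of ka/ks ratios described above can be expressed as a small decision rule. The following Python sketch is illustrative only: the function names are hypothetical, treating a ratio of exactly 1 as neutral is a standard convention that is not stated in the text, and the estimation of ka and ks themselves (which requires codon alignments and a substitution model) is outside its scope.

```python
def classify_ratio(ratio):
    """Label a ka/ks ratio: > 1 positive, < 1 purifying, == 1 neutral."""
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"  # convention assumed here, not stated in the text


def classify_pair(ka, ks):
    """Compute ka/ks for one duplicated gene pair and label it."""
    if ks <= 0:
        raise ValueError("ks must be positive to form a ka/ks ratio")
    return classify_ratio(ka / ks)


# Mean WGD ka/ks values as reported in the text.
mean_wgd = {
    "P. persica": 0.40,
    "P. bretschneideri": 0.37,
    "P. avium": 0.38,
    "P. mume": 0.37,
}
for species, ratio in mean_wgd.items():
    print(species, ratio, classify_ratio(ratio))
```

All four reported means fall below 1 and are therefore labeled as purifying selection, in line with the conclusion drawn in the text.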

Collinearity Relationships
The collinearity relationships of GATA genes were studied among Fragaria vesca, P. avium, P. mume, P. persica, and P. bretschneideri, as these five species belong to the Rosaceae family and share a common ancestor (Table S4). A total of 114 collinear gene pairs were detected among the five Rosaceae species, comprising 32 orthologous gene pairs between Chinese pear and peach, 29 between Chinese pear and strawberry, 25 between Chinese pear and sweet cherry, and 28 between Chinese pear and Japanese apricot, indicating a very close association among these Rosaceae species (Figure 7).

In the growth/development group, cis-acting elements were distributed widely throughout the promoter regions, including the O2-site (involved in the regulation of zein metabolism) and G-Box-4 (responsible for plant growth in response to light). G-Box-4 accounted for the largest portion (57%), followed by the O2-site (43%) (Figure 8A,B). These findings indicate that PbGATAs have the potential to respond to phytohormones (ABA, SA, MeJA, and auxin) and to improve abiotic/biotic stress responses.
Notably, the expression of PbGATA32 increased gradually, suggesting that this gene has a potential role at later stages and that PbGATA genes may play an important role in fruit development and ripening (Figure 9A,B).

Subcellular Localization Analysis
To analyze its subcellular localization, we transiently overexpressed PbGATA22 fused with eGFP in Nicotiana benthamiana leaves through agroinfiltration. In this construct, the PbGATA22 protein was fused to the N-terminus of GFP under the control of the CaMV 35S promoter, and it produced a strong green fluorescent signal in the nucleus (Figure 10), which is consistent with previous results [36]. These results suggest that PbGATA22 is indeed localized in the nucleus.

Figure 9.
Heat map showing the relative expression patterns of GATA genes. (A) Transcription patterns of the GATA gene family in organs (leaf, ovary, sepals, petals, bud, and stem). (B) Transcription patterns in the fruit of Chinese pear at different developmental stages (S1 to S7). Colors correspond to log2-transformed values: blue or violet indicates no or low relative transcript abundance in a sample, and red indicates higher expression.
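The log2 transformation mentioned in the caption can be sketched as follows. This is a minimal Python example under stated assumptions: the pseudocount of 1 (used so that zero expression maps to 0 rather than being undefined) is an assumption and is not specified in the text, and the gene names and values are hypothetical.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each expression value."""
    return [math.log2(value + pseudocount) for value in values]


# Hypothetical expression values for two genes across three samples.
toy_expression = {
    "toyGeneA": [0.0, 1.0, 3.0],
    "toyGeneB": [7.0, 15.0, 31.0],
}
for gene, values in toy_expression.items():
    print(gene, log2_transform(values))
```

Compressing the dynamic range in this way keeps a few highly expressed genes from dominating the color scale of a heat map.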


Discussion
GATA transcription factors (TFs) are involved in various important biochemical and physiological processes in plants [37][38][39][40]. A genome-wide investigation of the GATA gene family was conducted to explore expression diversity and potential functions in four Rosaceae species. In the present study, we identified 92 GATA transcription factor genes in four Rosaceae species, named PbGATA1-32 (Pyrus bretschneideri), PmGATA1-20 (Prunus mume), PpGATA1-22 (Prunus persica), and PaGATA1-18 (Prunus avium) based on chromosomal location. Bioinformatics analyses of the GATA gene family covered phylogeny, conserved motifs, domains, orthologous genes, gene structure, chromosomal location, physicochemical properties, cis-elements, and gene duplication events. Moreover, RNA-seq, subcellular localization, and qRT-PCR analyses under hormonal treatments were performed. The results suggest that GATA genes can be classified into four subfamilies (I-IV) according to phylogeny and gene structure. Subfamily IV contained the largest number of PbGATA genes, which was similar to A. thaliana [40].
Additionally, the orthologous syntenic relationships among Pyrus bretschneideri, Prunus avium, Prunus persica, Prunus mume, and Fragaria vesca were analyzed. Pyrus bretschneideri and Prunus persica shared the largest number of gene pairs (32), followed by Pyrus bretschneideri and Fragaria vesca (29 gene pairs), Pyrus bretschneideri and Prunus mume (28 gene pairs), and Pyrus bretschneideri and Prunus avium (25 gene pairs). These results also revealed that the GATA homologous gene pairs were tightly clustered together, suggesting that they are closely related to one another. Syntenic patterns may provide details about the evolution of a genome. Owing to chromosomal rearrangements, fusions, and selective gene loss, certain homologous GATA genes may not have been mappable to any syntenic region, making chromosomal synteny difficult to identify (Figure 7 and Table S4) [41]. Taken together, our analysis showed that PbGATA genes have physicochemical features that are highly conserved across species. Gene duplication is a vital mechanism for generating genetic novelty in plants, which could improve an organism's ability to adapt to environmental stress. The evolution of plant genomes is facilitated by gene duplication events [42,43]. Gene duplication events (segmental, tandem, and whole-genome duplication) are crucial for evolution. Consequently, identifying the duplication mode helps in understanding the function and structure of the GATA gene family [40,44]. Our results indicate that whole-genome duplication (WGD) is more common in Rosaceae species than tandem duplication (Figure 4 and Table S5). Many GATA genes in pear have two or more counterparts in A. thaliana, suggesting that genome duplication events may have led to the amplification of the PbGATA gene family in pear.
Furthermore, four and six gene pairs were discovered after further analysis, which were thought to have resulted from segmental and whole-genome duplication events, respectively. Moreover, in addition to the motifs seen in all GATA proteins, various PbGATA groups included additional conserved motifs. The existence of these diverse, highly conserved domains might therefore be linked to diverse PbGATA protein activities. Exon gain/loss has occurred often during the history of numerous gene families. Most GATA genes in group A in A. thaliana have just two exons. By contrast, some PbGATA genes have only one intron, while PbGATA32 contained the largest number of introns and exons (Figure 2B). Taken together, our findings show that GATA genes have experienced moderate structural and functional divergence during evolution.
In eukaryotes, transcription factors are produced in the cytoplasm and perform their function in the nucleus, where they regulate the transcription of downstream genes. PbGATA22 was predicted by bioinformatics analysis to be nuclear, and it was also found to localize in the nucleus experimentally (Figure 10) [36]. The coordination of numerous cis-acting elements regulates gene expression patterns [45]. Numerous hormone-responsive elements were discovered in the promoter regions of the PbGATA gene family in this work (Figure 12 and Table S6). As predicted, the transcript levels of most putative PbGATA genes were stimulated to various degrees after treatment of fruit with exogenous hormones such as MeJA, SA, and ABA. These findings are relevant to pear fruit production in the field. By spraying various exogenous hormones onto pear fruits, it may be possible to regulate the metabolism of stone cells in the fruits and increase their quality [46]. Interestingly, four putative PbGATA promoters lacked hormone-responsive elements such as SA response elements (Figure 12). This might be due to interactions between various types of plant hormones, which may act synergistically and increase each other's content [47]. As a result, spraying hormones may raise the levels of other hormones and increase peak gene expression in pear fruit. Gene expression patterns can provide important information about the underlying activities of genes [48,49]. Exogenous hormones (ABA) influence the expression levels of GATA genes in Gossypium plants [46]. Previous research has shown that hormones applied to Chinese pear fruits can control the growth of stone cells and are involved in fruit ripening and senescence [50,51]. GATA TFs are involved in chloroplast biogenesis, stress-related hormone responses, light response, and floral development [17,36].
In Arabidopsis thaliana, ATGATA2, a positive regulator of phytohormones, is significantly expressed in petioles and hypocotyls, while GATA24 [20,52] and ATGATA28 were identified as two essential components of the cryptochrome1-mediated photoprotective response [53]. In grapes, VvGATA5, VvGATA2, and VvGATA7 were highly expressed in flowers, and VvGATA5 was abundantly expressed in the berry ripening stage [16]. SIGATA11 and SIGATA12 were expressed in the root, while SIGATA7 was expressed in root, flower, and fruits, and the SIGATA25 gene was involved in abscisic acid, flavonoids, and carotenoids [54]. GATA7 (Cs5g26470), identified in citrus fruit, is involved in abscisic acid (ABA), glucose, and fructose [55]. BnGATAs showed a unique expression profile in various tissues of Brassica napus under abiotic stress [56].
In our study, qRT-PCR analysis was used to detect the expression profile of the GATA gene family in Chinese pear under three hormonal stresses (ABA, SA, and MeJA) (Figures 11 and 13). Several plants improve their abiotic and biotic stress responses after exogenous hormonal treatments such as SA, MeJA, and ABA [57,58]. Fluctuations in various essential hormones such as SA, ABA, and MeJA occur as a response to stress [46]. We randomly selected 10 PbGATA genes from each subfamily to analyze their expression profile after hormonal (MeJA, SA, and ABA) stress on pear fruit. In our study, all GATA genes showed strong upregulation in response to hormones, indicating that PbGATA members play a crucial role in the abiotic stress-related response.

Chromosomal Localization and Introns/Exons Analysis
The start and end positions of each GATA member were obtained from the Plant Transcription Factor Database website and confirmed using the GFF3 file. Chromosomal localization was also validated using CLC Sequence Viewer v7 [61]. The GATA genes were then mapped to scale on the chromosomes of Pyrus bretschneideri using MapChart v2.32. Subsequently, the intron-exon organization of PbGATA genes was analyzed by comparing coding and genomic sequences using GSDS v2.0 (Gene Structure Display Server) (http://gsds.cbi.pku.edu.cn/, accessed on 4 April 2021) [62].
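As a rough illustration of how exon and intron counts can be cross-checked against a GFF3 annotation, the following minimal Python sketch counts `exon` features per parent transcript. It is not the GSDS procedure used in this study; the feature IDs and coordinates in `EXAMPLE_GFF3` are hypothetical, and the sketch assumes that exons carry a `Parent` attribute, as is typical of GFF3 files.

```python
from collections import defaultdict


def count_exons(gff3_lines):
    """Return {parent_transcript_id: exon_count} from GFF3 text lines."""
    exons = defaultdict(int)
    for line in gff3_lines:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF3 record has nine tab-separated columns; column 3 is the type.
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # An exon may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent] += 1
    return dict(exons)


def intron_count(exon_count):
    """A transcript with n exons has n - 1 introns (never negative)."""
    return max(exon_count - 1, 0)


# Hypothetical records for illustration only (not real pear annotations).
EXAMPLE_GFF3 = [
    "##gff-version 3",
    "Chr1\tsketch\tmRNA\t100\t900\t.\t+\t.\tID=geneA.t1",
    "Chr1\tsketch\texon\t100\t300\t.\t+\t.\tID=e1;Parent=geneA.t1",
    "Chr1\tsketch\texon\t400\t550\t.\t+\t.\tID=e2;Parent=geneA.t1",
    "Chr1\tsketch\texon\t700\t900\t.\t+\t.\tID=e3;Parent=geneA.t1",
    "Chr2\tsketch\texon\t50\t450\t.\t-\t.\tID=e4;Parent=geneB.t1",
]

EXON_COUNTS = count_exons(EXAMPLE_GFF3)
```

Here `EXON_COUNTS` maps each hypothetical transcript to its exon count, and `intron_count` converts that into the number of introns.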

Conserved Motifs and Promoter Sequence Analysis
Moreover, the full-length amino acid sequences were submitted to the MEME suite (Multiple EM for Motif Elicitation) (https://meme-suite.org/, accessed on 26 April 2021) online tools to classify conserved motifs and associated duplication events during the evolution of PbGATA members [32]. The default parameter configurations were used, with the following exceptions: the maximal number of motifs to be identified was 20, and the number of motif occurrences was allowed to vary per gene sequence. A graphical representation of the evolutionary relationship based on the described domains was developed. The promoter sequences (1500 bp upstream of the ATG start codon) of the PbGATA genes were retrieved from the pear genome project and analyzed using the online PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare, accessed on 2 May 2021).
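The promoter retrieval step can be sketched as follows. This is a minimal Python illustration, not the script used in this study: the function names, the toy sequences, and the convention that `atg_pos` is the 1-based forward-strand coordinate of the A in the ATG are all assumptions made for the example. For genes on the minus strand, the upstream region lies at higher forward-strand coordinates and must be reverse-complemented.

```python
# Translation table for complementing DNA bases (N is kept as N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def promoter(chrom_seq, atg_pos, strand, length=1500):
    """Extract up to `length` bp upstream of the start codon.

    chrom_seq : forward-strand chromosome sequence
    atg_pos   : 1-based forward-strand coordinate of the A in ATG
                (assumed convention for this sketch)
    strand    : "+" or "-"
    The region is truncated if it runs past a chromosome end.
    """
    if strand == "+":
        start = max(atg_pos - 1 - length, 0)
        return chrom_seq[start:atg_pos - 1]
    if strand == "-":
        end = min(atg_pos + length, len(chrom_seq))
        return reverse_complement(chrom_seq[atg_pos:end])
    raise ValueError("strand must be '+' or '-'")


# Toy sequences for illustration only.
PLUS_EXAMPLE = promoter("TTTTCCATGAAA", atg_pos=7, strand="+", length=4)
MINUS_EXAMPLE = promoter("AAACATGGTT", atg_pos=6, strand="-", length=3)
```

With the default `length=1500`, the returned sequence corresponds to the 1500 bp window described above, which can then be submitted to a cis-element database.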

In Silico Expression Analysis of Annotated PbGATA Genes
Transcriptomic data of P. bretschneideri at different fruit developmental stages were downloaded from the NCBI website (https://www.ncbi.nlm.nih.gov/sra, accessed on 10 May 2021) with accession numbers SRX1595645, SRX1595648, SRX1595646, SRX1595647, SRX1595651, SRX1595650, and SRX1595652. Meanwhile, RNA-seq reads of different organs (bud, leaves, stem, sepal, ovary, and petal) were also downloaded with the following accession numbers: SRR8119889, SRR8119895, SRR8119898, SRR8119903, SRR8119906, and SRR8119907. Finally, FPKM (fragments per kilobase of transcript per million mapped reads) values were used to evaluate the expression patterns of PbGATA members, and a heat map was generated using R software.
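For reference, FPKM normalizes a fragment count by transcript length (in kilobases) and by sequencing depth (in millions of mapped fragments). The short Python sketch below shows the arithmetic only; the input numbers are hypothetical, and the log2(FPKM + 1) transform is a common choice for heat maps that is assumed here rather than taken from this study.

```python
import math


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = count * 1e9 / (length_bp * total_mapped_fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def log2_fpkm(value, pseudocount=1.0):
    """Log-transform an FPKM value; the pseudocount keeps zero finite."""
    return math.log2(value + pseudocount)


# Hypothetical example: 500 fragments on a 2000 bp transcript
# in a library of 10 million mapped fragments gives an FPKM of 25.
EXAMPLE_FPKM = fpkm(500, 2000, 10_000_000)
```

Applying `log2_fpkm` to each gene and sample yields a matrix that can be passed to any heat map function.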

Comparative Phylogenetic Analysis
For the comparative study, multiple sequence alignment of 32 GATA proteins from Pyrus bretschneideri, 18 from Prunus avium, 22 from Prunus persica, 20 from Prunus mume, and 30 from Arabidopsis thaliana was performed using the ClustalX tool (https://www.genome.jp/tools-bin/clustalx, accessed on 10 February 2021). The evolutionary relationship was inferred with the maximum likelihood (ML) method using the online IQ-TREE software. Finally, the phylogenetic tree was visualized with iTOL software [66].

Plant Materials and Stress Treatment
Pear fruit samples were taken 39 days after flowering (DAF) from a 45-year-old plant grown in a research horticulture garden in Dangshan, Anhui, China. According to a previously described procedure, 500 µM methyl jasmonate (MeJA), 500 µM abscisic acid (ABA), and 200 µM salicylic acid (SA) were sprayed on the entire surface of the fruits at 39 DAF [46,57]. Fruit samples were collected at 0 h, 1 h, 2 h, and 3 h. Finally, the fruit samples were immediately frozen in liquid nitrogen and stored at −80 °C for further in vitro experiments.

Subcellular Localization of PbGATA Protein
To analyze subcellular localization, the PbGATA22 gene containing stop codon was inserted into the pHB-35Spro eGFP vector with the primers listed in Table S2 [67]. The p35S::PbGATA22::eGFP recombinant vector was transferred into Agrobacterium tumefaciens GV3101 competent cells (Weidi Biotechnology Company, Shanghai, China), and the recombinant vectors were then transiently infiltrated into tobacco (Nicotiana benthamiana) epidermal cells. After cultivation in the greenhouse for 48 h without light, the agroinfiltrated leaf area was observed using a LAS-AF confocal microscope (Leica, Wetzlar, Germany).

Isolation of Total RNA and Quantitative Real-Time PCR
For qRT-PCR analysis, total RNA was extracted from frozen fruit tissue using the RNAiso-mate Tissue Kit (Tiangen, Beijing, China). The purity and quantity of RNA were assessed with a NanoDrop 1000 spectrophotometer (Thermo Scientific, Beijing, China). The extracted RNA was then reverse transcribed into first-strand cDNA using a one-step RT-qPCR kit (Takara, Shanghai, China). Following the manufacturer's instructions, quantitative RT-PCR (qRT-PCR) was performed using a SYBR Green Premix Ex Taq™ kit (Takara) on an ABI 7500 real-time PCR detection system. The expression data were normalized with the tubulin gene as an internal control [68]. All primers used for qRT-PCR analysis were designed with GenScript online software (https://www.genscript.com/tools/, accessed on 4 June 2021) and are listed in Supplementary Table S2. The experiments were conducted with three biological and three technical replicates, and relative expression levels for each gene were evaluated via the 2^(−ΔΔCT) method [57,69].
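The relative quantification described above is commonly computed as 2^(−ΔΔCT), where ΔCT is the target CT minus the reference (here tubulin) CT, and ΔΔCT is the ΔCT of the treated sample minus the ΔCT of the control sample. The following Python sketch illustrates the calculation; the CT values are hypothetical, and averaging replicate CT values before subtraction is one common convention assumed for this example.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target CT minus mean reference-gene CT for one sample."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(treated_target, treated_ref, control_target, control_ref):
    """Fold change of a target gene by the 2^(-ddCT) approach.

    Each argument is a list of replicate CT values:
      treated_*  : hormone-treated sample
      control_*  : untreated sample (for example, the 0 h time point)
      *_ref      : internal control gene (tubulin in this study)
    """
    ddct = delta_ct(treated_target, treated_ref) - delta_ct(control_target, control_ref)
    return 2.0 ** (-ddct)


# Hypothetical CT values for illustration only:
# treated dCT = 22 - 18 = 4, control dCT = 25 - 18 = 7, ddCT = -3, fold = 8.
EXAMPLE_FOLD = relative_expression(
    treated_target=[22.0, 22.0, 22.0],
    treated_ref=[18.0, 18.0, 18.0],
    control_target=[25.0, 25.0, 25.0],
    control_ref=[18.0, 18.0, 18.0],
)
```

A fold change above 1 indicates upregulation relative to the control, and a value below 1 indicates downregulation.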

Conclusions
In this study, a total of 92 GATA genes were identified in four Rosaceae species (Pyrus bretschneideri, Prunus avium, Prunus mume, and Prunus persica) and classified into four subfamilies based on phylogeny. A systematic analysis of the GATA genes was carried out, including physicochemical characterization, conserved motifs, chromosomal location, gene structure (introns/exons), evolutionary relationships, conserved domains, synonymous and non-synonymous ratios, transcriptomic profiles, collinearity relationships, and cis-acting elements. Dispersed duplication (DSD) and whole-genome duplication (WGD) might have contributed substantially to the expansion of GATA genes. In addition, qRT-PCR results revealed that PbGATAs had a significant role related to abiotic stress. Subcellular localization of PbGATA22 by transient expression of a GFP fusion protein in tobacco cells suggested that the majority of GATA family proteins are localized in the nucleus of leaf epidermal cells. These results provide basic information that may facilitate further study of the evolutionary relationships, molecular mechanisms, and functions of PbGATA genes to understand their roles in pear fruits.