Genome-Wide Identiﬁcation and Expression Analysis of MAPK and MAPKK Gene Family in Pomegranate (

: Mitogen-activated protein kinase (MAPK) cascade is involved in the regulation of a series of biological processes in organisms, which are composed of MAPKKKs , MAPKKs , and MAPKs . Although genome-wide analyses of it has been well described in some species, little is known about MAPK and MAPKK genes in pomegranates. In this study, we identiﬁed 18 PgMAPKs , 9 PgMAPKKs through a genome-wide search. Chromosome localization showed that 27 genes are distributed on 7 chromosomes with di ﬀ erent densities. Multiple sequence alignment and phylogenetic analysis revealed that PgMAPKs and PgMAPKKs could be divided into 4 subfamilies (groups A, B, C, and D), respectively. In addition, exon-introns structural analysis of each candidate gene has indicated high levels of conservation within and between phylogenetic groups. Cis-acting element analysis predicted that PgMAPKs and PgMAPKKs were widely involved in the growth, development, stress and hormone response of pomegranate. Expression proﬁle analyses of PgMAPKs and PgMAPKKs were performed in di ﬀ erent tissues (root, leaf, ﬂower and fruit), and PgMAPK13 was signiﬁcantly expressed in all tissues. To our knowledge, this is the ﬁrst genome-wide analysis of the MAPK and MAPKK gene family in pomegranate. This study provides valuable information for understanding the classiﬁcation and functions of pomegranate MAPK signal. this study, a total of 18 PgMAPK and 9 PgMAPKK genes were identiﬁed in pomegranate and their phylogenetic relationships were explored. The gene structure of each group members was similar. PgMAPKs and PgMAPKKs may participate in the apical meristems, ﬂower organ and fruit development, and the same group may have the similar expression pattern. These results lay the foundation for function studies on PgMAPKs and PgMAPKKs during their evolutionary process.


Introduction
In the long process of evolution, plants have evolved complex signal network regulation mechanism to adapt to the biotic and abiotic stress during growth and development [1][2][3][4]. The mitogen-activated protein kinase (MAPK) cascade is regarded as one of the typical mechanisms of signal transduction. As broad and relatively conservative evolutionary pathway, it is involved in multiple metabolic pathways such as cell division, development process and defense response in organism [5][6][7]. The typical MAPK cascade is composed of three specific kinases, MAP kinase (MAPK), MAPK kinase (MAPKK), and MAPKK kinase (MAPKKK) [8]. They are activated by phosphorylation and dephosphorylation at specific sites [9]. According to sequence alignment and phylogenetic analysis of model species Arabidopsis thaliana [9], the MAPK and MAPKK genes can be divided into four groups (A, B, C, and D). MAPK family members all have a unique "TxY" phosphorylation motif, while the MAPKK genes are activated when serine and serine/threonine residues in the "S/TXXXXXS/T" motif [10]. Despite the highly conserved structure of plant MAPKs and MAPKKs, they have been shown to be extensively involved in plant growth, and hormonal response regulation processes, as well as diverse biological and abiotic stress [11,12]. To date, members of the MAPK cascade gene family have been identified in sequence containing MAPK cascade genes sequence structure domain. Moreover, the protein domains were validated by online program SMART (http://smart.embl-heidelbergDE) [42] and NCBI CDD (https://www.ncbi.nlm.nih.gov/CDD) [43]. The isoelectric points and molecular weights of PgMAPKs and PgMAPKKs were obtained by the ExPASy Proteomics Server (http://expasy.org/) [44].

Phylogenetic Relationship and Conserved Motif Analysis
ClustalX V2.0 [45] was used for multi-sequence alignment of identified proteins of PgMAPKs and PgMAPKKs (Fasta file S2) with parameters set as default values. The results were then submitted to GeneDoc [46] for visualization. A neighbor-joining phylogenetic tree was constructed with MEGA V7.0 [47] (selecting position model with 1000 bootstrap replicates) to explore the phylogenetic relationships of PgMAPK and PgMAPKK family members based on the full-length protein sequence. The conservative motif was obtained by online program MEME (http://meme.nbcr.net/meme/intro. html) [48] based on the expectation maximum (EM) algorithm. The introns and exons of PgMAPKs and PgMAPKKs genes were analyzed and visualized by TBtools V0.6696 [49]. Finally, the 2.0kb upstream DNA sequence of each candidate gene was extracted, and then submitted to Plantcare database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [50] to predict the cis-acting elements. Cis-acting elements related to growth, stress and hormones were manually retained, and then submitted to TBtools v0.6696 for drawing.

Chromosome Localization and Expression Profiles
The chromosome distribution information of PgMAPKs and PgMAPKKs were displayed by TBtools. RNA-seq data of six pomegranate cultivars were downloaded from NCBI database, including "Dabenzi", "Tunisia", "Baiyushizi", "Wonderful", "Nana", and "Black127" (Table 1 and Table S2). All RNA-Seq data were quality-controlled by fastp [49] to obtain clean reads. The index file was construct by Kallisto V0.44.0 [51] and then the normalized expression units (Transcripts Per Kilobase of exon model per Million mapped reads (TPM) values) of each gene were extracted by kallisto quant command. The final expression levels was from the TPM values that converted by Log 2 (TPM+1). Then the Heatmap package in RStudio V3.6.1 [52] was used to draw the Heatmap of PgMAPKs and PgMAPKKs.

Identification and Sequence Analysis of PgMAPKs and PgMAPKKs
According to the described method, 18 PgMAPKs and 9 PgMAPKKs were obtained from pomegranate whole genome, respectively (Table 2). Furthermore, the number of amino acids, molecular weight, theoretical pI of PgMAPKs and PgMAPKKs were analyzed by ExPASy online tool. The results showed that the molecular weight was between 34910.05~70420.74 Da, the isoelectric point was between 4.94~9.38, the protein length was within the range of 314~619 amino acids, the number of amino acids varied significantly. These data were similar to those reported in economic trees such as grape [56], banana [57], and apple [58]. However, the variations in physicochemical properties among the PgMAPKs and PgMAPKKs indicated that these genes may have subfunctionalization and new functionalization [59]. Chromosomal location analyses showed that 18 MAPKs and 9 MAPKKs presented on 7 chromosomes (chr1, chr2, chr3, chr4, chr6, chr7, chr8) ( Table 2 and Figure 1).

Conserved Motif Analysis of PgMAPK and PgMAPKK Gene Family Members
According to specific conserved motifs analysis, PgMAPKs and PgMAPKKs were found to have unique conserved motifs (Figures 1 and 2). The "VGTxxYMSPER" conserved motifs were found in 9 PgMAPKK genes, and the "T (E/D) YVxTRWYRAPE (L/V)" conserved motifs were contained in 18 PgMAPK genes ( Figure 2). Through multiple sequence alignment of the protein sequences of PgMAPK family genes, "T-loop" motifs were detected in the domain ( Figure 3). There was an ATP

Conserved Motif Analysis of PgMAPK and PgMAPKK Gene Family Members
According to specific conserved motifs analysis, PgMAPKs and PgMAPKKs were found to have unique conserved motifs (Figures 1 and 2). The "VGTxxYMSPER" conserved motifs were found in 9 PgMAPKK genes, and the "T (E/D) YVxTRWYRAPE (L/V)" conserved motifs were contained in 18 PgMAPK genes ( Figure 2). Through multiple sequence alignment of the protein sequences of PgMAPK family genes, "T-loop" motifs were detected in the domain ( Figure 3). There was an ATP phosphorylation site catalyzed by "TxY", which plays an important role in the function of PgMAPKs. According to previous studies of model plants, the PgMAPK gene family can also be divided into four groups (A, B, C, and D) and can be divided into subtypes of "TDY" and "TEY" according to different "TxY" phosphorylation sites (Table 1). In addition, phosphorylation domains with "S/T-XXXXX-S/T" and an anchor site with "-D (L/I/V) K-" were found in PgMAPKKs ( Figure 3). PgMAPKKs were also divided into four groups (A, B, C and D) ( Table 1) based on results reported in the Arabidopsis thaliana [60].

Phylogenetic Relationship of PgMAPKs and PgMAPKKs
In order to explore the evolutionary relationship toward PgMAPK and PgMAPKK proteins, the predicted protein sequences of 18 PgMAPKs, 9PgMAPKKs, 20 AtMAPKs, and 10AtMAPKKs from Arabidopsis were subjected to a multiple sequence alignment using the MEGA7. The N-J (Neighbor-Joining) method was then used to construct an unrooted phylogenetic tree. As shown in Figure 4, the phylogenetic tree divided PgMAPKs and PgMAPKKs into four groups (A, B, C, D). The most abundant of MAPK gene family members were in group D (7), followed by group C (6), group B (3) and group A (2). This result was consistent with previous studies on Arabidopsis [14], rice [15], and tomato [61]. The number of MAPKK gene family was less than that of MAPKK, so there was no significant difference in the number of each group. Three (PgMAPKK1, PgMAPKK2, PgMAPKK3), three (PgMAPKK4, PgMAPKK5, PgMAPKK6), one (PgMAPKK7) and two (PgMAPKK8, PgMAPKK9) PgMAPKK genes belonged to groups A, B, C, and D respectively. The ratio of PgMAPKKS to PgMAPKs was about 1:2, which was in accordance with that in other plant (Table 3). In summary, compared with these model plants, the quantity and grouping status of MAPKs and MAPKKs were highly conserved in pomegranate ( Figure 4, Table 3).

Phylogenetic Relationship of PgMAPKs and PgMAPKKs
In order to explore the evolutionary relationship toward PgMAPK and PgMAPKK proteins, the predicted protein sequences of 18 PgMAPKs, 9PgMAPKKs, 20 AtMAPKs, and 10AtMAPKKs from Arabidopsis were subjected to a multiple sequence alignment using the MEGA7. The N-J (Neighbor-Joining) method was then used to construct an unrooted phylogenetic tree. As shown in Figure 4, the phylogenetic tree divided PgMAPKs and PgMAPKKs into four groups (A, B, C, D). The most abundant of MAPK gene family members were in group D (7), followed by group C (6), group B (3) and group A (2). This result was consistent with previous studies on Arabidopsis [14], rice [15], and tomato [61]. The number of MAPKK gene family was less than that of MAPKK, so there was no significant difference in the number of each group. Three (PgMAPKK1, PgMAPKK2, PgMAPKK3), three (PgMAPKK4, PgMAPKK5, PgMAPKK6), one (PgMAPKK7) and two (PgMAPKK8, PgMAPKK9) PgMAPKK genes belonged to groups A, B, C, and D respectively. The ratio of PgMAPKKS to PgMAPKs was about 1:2, which was in accordance with that in other plant (Table 3). In summary, compared with these model plants, the quantity and grouping status of MAPKs and MAPKKs were highly conserved in pomegranate ( Figure 4, Table 3).

Gene Structure Analysis of PgMAPKs and PgMAPKKs
Gene structure analysis was supposed to provide valuable information about evolutionary and duplication events among gene families. We found that most members of the same group share similar exon/intron structures and conserved motifs. In the PgMAPKs gene family, each group showed different structural characteristics. In group A, PgMPK2 and PgMAPK13 were composed of 6 exons with the longer introns. Six exons were also found in same group of MAPKs in Arabidopsis, poplar and tomato. PgMPK8, PgMAPK9, and PgMAPK12 in group B had 6 exons, which was also consistent with the results of studies on tomato and poplar studies. In group C, PgMAPK5 and PgMAPK17 were only composed of two exons, and the size of each exon was strictly conservative. Compared with the highly conserved structural pattern in group C, the exon and intron distribution of PgMPKs in group D was more complex, exhibiting a diverse pattern. For example, PgMPK1, PgMAPK18 had 11 exons, while PgMAPK11, PgMAPK14, PgMAPK15, and PgMAPK16 had 10 exons, and PgMAPK3 only had 9 ( Figure 5). Although there were slight differences in exon lengths, it was clear that the structural patterns of exons well conserved not only between close paralogs, but also between the PgMAPKs that apparently diverged following earlier duplication events.
The PgMAPKK gene showed two distinct structural patterns ( Figure 5) which were quite similar to the MAPKK gene in Arabidopsis and poplar. PgMAPKK7 in group C and PgMAPKK8, PgMAPKK9 in group D both had less introns and extrons, whereas the PgMAPKKs in group A and group B possessed more exon and intron junctions. In group A, PgMAPKK1 and PgMAPKK2 had 7 exons, while PgMAPKK3 had 8 exons. Consistent with other plants, PgMAPKK1 and PgMAPKK2 shared strong conservation of exonic length. As shown in Figure 5, there were highly consistency of gene structure among the three genes (PgMAPKK4, PgMAPKK5, and PgMAPKK6) in group B. The PgMAPKK gene showed two distinct structural patterns ( Figure 5) which were quite similar to the MAPKK gene in Arabidopsis and poplar. PgMAPKK7 in group C and PgMAPKK8, PgMAPKK9 in group D both had less introns and extrons, whereas the PgMAPKKs in group A and group B possessed more exon and intron junctions. In group A, PgMAPKK1 and PgMAPKK2 had 7 exons, while PgMAPKK3 had 8 exons. Consistent with other plants, PgMAPKK1 and PgMAPKK2 shared strong conservation of exonic length. As shown in Figure 5, there were highly consistency of gene structure among the three genes (PgMAPKK4, PgMAPKK5, and PgMAPKK6) in group B.

Cis-Acting Element Prediction of PgMAPKs and PgMAPKKs
We extracted upstream 2.0 KB of genome sequence for the cis-acting elements analysis. And hundreds of possible cis-acting elements downloaded from the Platcare database (Table S3). After screening, 25 cis-acting elements with clear functional information were retained (Table S4). Among them, we found that a total of 6 elements were related to stress, 9 elements were related to growth and development, and the remaining 10 elements were related to hormones ( Table 4). As shown in Table 3, elements named ARE, LTR, and MBS were present in 20, 18 and 17 genes, respectively. These results indicated that PgMAPK and PgMAPKK genes were widely involved in hypoxia, low temperature and drought stress of pomegranate. Furthermore, WUN-motif was found in six genes, indicating that these genes involved in the external damage mechanism of pomegranate. The elements related to plant growth and development mainly included circadian rhythm, meristem, palisade mesophytic differentiation and endosperm. Notably, MBSI elements were found in five

Cis-Acting Element Prediction of PgMAPKs and PgMAPKKs
We extracted upstream 2.0 KB of genome sequence for the cis-acting elements analysis. And hundreds of possible cis-acting elements downloaded from the Platcare database (Table S3). After screening, 25 cis-acting elements with clear functional information were retained (Table S4). Among them, we found that a total of 6 elements were related to stress, 9 elements were related to growth and development, and the remaining 10 elements were related to hormones ( Table 4). As shown in Table 3, elements named ARE, LTR, and MBS were present in 20, 18 and 17 genes, respectively. These results indicated that PgMAPK and PgMAPKK genes were widely involved in hypoxia, low temperature and drought stress of pomegranate. Furthermore, WUN-motif was found in six genes, indicating that these genes involved in the external damage mechanism of pomegranate. The elements related to plant growth and development mainly included circadian rhythm, meristem, palisade mesophytic differentiation and endosperm. Notably, MBSI elements were found in five genes (PgMAPK2, PgMAPK4, PgMAPK8, PgMAPK14, and PgMAPK15), suggesting that these genes are involved in the mechanism of flavonoid synthesis in pomegranate. In addition, the cis-acting elements related to various hormone reactions (such as auxin, ethylene, salicylic acid, MeJA, GA3), indicating that PgMAPKs and PgMAPKKs were deeply involved in signal transduction of endogenous and exogenous hormone networks in pomegranate. To sum up, the various cis-acting elements in the gene promoter region suggested that the PgMAPK and PgMAPKK gene family plays a crucial role in its growth and development and stress adaptation (Table 4 and Figure 6). TCA-element cis-acting element involved in salicylic acid responsiveness 15

Expression Pattern of PgMAPKs and PgMAPKKs
We used the online RNA-seq data to explore the expression patterns of MAPK and MAPKK genes in pomegranates at different developmental stages. The results showed that these 27 genes were expressed in at least one tissue organ or stage of development ( Figure 6). Compared with other genes, PgMAPK13 was highly expressed in root, flower and seed coat, while PgMAPK4, PgMAPK5, PgMAPK6, PgMAPK7, PgMAPK8, PgMAPK10 had a less expression level. PgMAPK9, PgMAPK12 and PgMAPKK7 was expressed in all except pericarp as shown in Figure 7A. A high PgMAPKK9 and PgMAPK13 expression level was found in in inner seed coat after pollination for 50 days. PgMAPKK3 and PgMAPKK9 expressed significantly variously during the development of functional male flower and female sterility flower as illustrated in Figure 7C. The expression patterns demonstrated that the expression level of each gene was highly variable, suggesting that these genes play an indispensable role in regulating plant growth and development.

Expression Pattern of PgMAPKs and PgMAPKKs
We used the online RNA-seq data to explore the expression patterns of MAPK and MAPKK genes in pomegranates at different developmental stages. The results showed that these 27 genes were expressed in at least one tissue organ or stage of development ( Figure 6). Compared with other genes, PgMAPK13 was highly expressed in root, flower and seed coat, while PgMAPK4, PgMAPK5, PgMAPK6, PgMAPK7, PgMAPK8, PgMAPK10 had a less expression level. PgMAPK9, PgMAPK12 and PgMAPKK7 was expressed in all except pericarp as shown in Figure 7A. A high PgMAPKK9 and PgMAPK13 expression level was found in in inner seed coat after pollination for 50 days. PgMAPKK3 and PgMAPKK9 expressed significantly variously during the development of functional male flower and female sterility flower as illustrated in Figure 7C. The expression patterns demonstrated that the expression level of each gene was highly variable, suggesting that these genes play an indispensable role in regulating plant growth and development. Outer seed coat (50 days after pollination); S9: (50 days after pollination); S10: Inner seed coat of "Tunisia" (50 days after pollination); S11: Inner seed coat of "Baiyushizi" (50 days after pollination); S12: Functional male flower (13.1-25.0 mm); S13:functional male flower (5.1-13.0 mm); S14: Functional male flower (3.0-5.0 mm); S15: Female sterility flower (13.1-25.0 mm); S16: Female sterility flower (5.1-13.0 mm); S17: Female sterility flower (3.0-5.0 mm); S1-S4,S7-S9 and S12-S17 are cultivar "Dabenzi".

Discussion
Generally, plants responded to various biological and abiotic stresses through physiological, biochemical and even molecular changes to resist and adapt them [59]. As an important protein kinases group, MAPKs and MAPKKs can transfer extracellular signals to intracellular signals in plant growth and development, hormone regulation and stress response [63,64]. To date, MAPK and MAPKK gene family have been identified and functionally characterized in several model plants and numerous economic trees. However, the MAPK and MAPKK gene family in pomegranate has not been reported previously.

Discussion
Generally, plants responded to various biological and abiotic stresses through physiological, biochemical and even molecular changes to resist and adapt them [59]. As an important protein kinases group, MAPKs and MAPKKs can transfer extracellular signals to intracellular signals in plant growth and development, hormone regulation and stress response [63,64]. To date, MAPK and MAPKK gene family have been identified and functionally characterized in several model plants and numerous economic trees. However, the MAPK and MAPKK gene family in pomegranate has not been reported previously.
In this study, we first identified 18 MAPKs and 9 MAPKKs in pomegranate at genome-wide level. The numbers of these family members were similar to those found in other plant species, such as Ziziphus jujuba (11/5) [10], Arabidopsis thaliana (20/10) [13,14], Oryza sativa (17/8) [15,16] [67], and Jatropha curcas (12/5) [68]. Figure 1 displayed a scattered distribution of PgMAPKs and PgMAPKKs on the 7 chromosomes of Punica granatum. The chromosome distribution pattern showed that the PgMAPK or PgMAPKK genes density was relatively high in certain chromosomes. Regions containing 4 or more genes in the range of 200kb or less can be regarded as gene clusters [69]. Only one gene cluster with four members in PgMAPK gene family and no other gene cluster was found in PgMAPKs ( Figure 1). However, given that our clustering criteria was relaxed to at least two genes, 7 clusters would be found (63.6% of MAPK and MAPKK family members appeared in the cluster). In addition, we also found that although the PgMAPKK3 and the gene cluster consisting of PgMAPKK4, PgMAPKK5, and PgMAPKK6 were all located on the same chromosome, they were physically far apart, which was consistent with the result of phylogenetic and multiple alignment analysis.
PgMAPKs were divided into three subtypes of "TDY", "TEY", and "TGY" according to different "TxY" phosphorylation sites by multiple sequence alignment (Table 2, Figure 3). Interestingly, all PgMAPKs in group C had a "TGY" motif except PgMAPK17. The "TGY" activation loop was similar to P38 MAPKs in mammals, rather than the T(E/D)Y motif commonly in plants [19]. As far as we know, the same results were also found in wheat, where TaMPK25 also carries a TGY activating loop [70]. Furthermore, there were confirmed results included MAPKs in group B mainly involved in cell division and environmental stress response [66]. Therefore, PgMAPK8, PgMAPK9, and PgMAPK12 may also response to various stress. MAPKKs in pomegranate were phosphorylated by "S/T-XXXXX-S/T" and anchored by "-D(L/I/V) K-". Due to its wide range of functions, many MAPKK genes have been cloned recently, such as AtMKK1 and AtMKK2-5 of Arabidopsis thaliana, SiMKK of Alfalfa, MEK1 of tomato, NTMEK1-2 of tobacco and ZmMEK1 of maize [7].
Since highly differentiated sequences may responsible for regulatory diversity of MAPK and MAPKK protein, we constructed a phylogenetic tree based on the predicted MAPK and MAPKK proteins of pomegranate and Arabidopsis thaliana. Four groups were separated in phylogenetic tree, which were named following previous studies [67]. The members in each group shared similar intron-exon structures. Consistent with previous studies, these data indicate similar origins and evolutionary patterns of MAPK and MAPKK genes in plants. It is possible that original family member had experienced replication events prior to the eudicot-monocot divergence [58,65].
Previous studies have confirmed that the response of plants to the development and biological process are greatly regulated at the transcription level [66,73]. RNA-seq data showed that the expression levels of PgMAPKs and PgMAPKKs varied greatly in different tissues, which was consistent with the research of pepper and other species [65]. Obviously, some PgMAPK and PgMAPKK genes were highly expressed while others expressed at a low level or not expressed, suggesting their diverse functions in pomegranate growth and development [78]. For example, MAPKs in rice can regulate auxin signaling and cell cycle-related gene expression of root under cadmium stress [79].
Our results will lay the foundation for the functional characterization of MAPK and MAPKK gene family, and further understanding of the structure-function relationship among gene members. In addition, our study also provides comprehensive information and new insights into the evolution and differentiation of PgMAPK and PgMAPKK genes. These studies may provide a molecular basis for key agronomic traits, developmental, and other physiological processes in pomegranates.

Conclusions
In this study, a total of 18 PgMAPK and 9 PgMAPKK genes were identified in pomegranate and their phylogenetic relationships were explored. The gene structure of each group members was similar. PgMAPKs and PgMAPKKs may participate in the apical meristems, flower organ and fruit development, and the same group may have the similar expression pattern. These results lay the foundation for function studies on PgMAPKs and PgMAPKKs during their evolutionary process.