Systematic Analysis and Expression Profiles of the 4-Coumarate: CoA Ligase (4CL) Gene Family in Pomegranate (Punica granatum L.)

4-Coumarate:CoA ligase (4CL, EC 6.2.1.12), located at the end of the phenylpropanoid metabolic pathway, regulates the metabolic direction of phenylpropanoid derivatives and plays a pivotal role in the biosynthesis of flavonoids, lignin, and other secondary metabolites. To understand the molecular characteristics and potential biological functions of the 4CL gene family in pomegranate, a bioinformatics analysis was carried out on the identified 4CLs. In this study, 12 Pg4CLs were identified in the pomegranate genome, which contained two conserved amino acid domains: the AMP-binding domain Box I (SSGTTGLPKGV) and Box II (GEICIRG). During the identification, Pg4CL2 was found to be missing Box II. Gene cloning and sequencing verified that this partial amino acid deletion was caused by genome sequencing and splicing errors, and the cloning results were used to correct the Pg4CL2 sequence information in the 'Taishanhong' genome. According to the phylogenetic tree, the Pg4CLs were divided into three subfamilies containing 1, 1, and 10 members, respectively. Analysis of cis-acting elements found that the upstream sequences of all Pg4CLs contained at least one phytohormone response element. RNA-seq and protein interaction network analyses suggested that Pg4CL5 was highly expressed in different tissues and may participate in lignin synthesis in pomegranate. The expression of Pg4CLs in developing pomegranate fruits was analyzed by quantitative real-time PCR (qRT-PCR). The expression level of Pg4CL2 showed a decreasing trend, similar to the trend of flavonoid content, indicating that Pg4CL2 may be involved in flavonoid synthesis and pigment accumulation. Pg4CL3, Pg4CL7, Pg4CL8, and Pg4CL10 showed little or no expression, whereas the expression level of Pg4CL4 was higher in the later stage of fruit development, suggesting that Pg4CL4 may play a crucial role in fruit ripening. The expression levels of 4CL genes differed significantly among fruit development stages. These results lay the foundation for an in-depth analysis of pomegranate 4CL gene functions.


Introduction
The emergence of lignin enabled terrestrial plants to stand upright and transport water over long distances. Flavonoids are a crucial class of plant secondary metabolites [1]. These two classes of metabolites participate in regulating various physiological functions and improve adaptability to environmental stress [2,3]. The 4CL gene family encodes multiple isozymes that exhibit distinct substrate affinities, which appear to coincide with different metabolic functions [4]. Different 4CL isozymes can selectively catalyze the conversion of cinnamic acid, p-coumaric acid, and other substrates into the corresponding CoA thioesters. Some CoA thioesters can be synthesized by cinnamoyl-CoA reductase (CCR), and cinnamyl
The Pg001327.1 gene was amplified from the cDNA of 'Taishanhong' by PCR (Figure 2). The DNA was collected and sequenced after cloning. The total length of the PCR product was 1738 bp. The sequencing results were compared with the pomegranate 'Taishanhong' genome sequence Pg001327.1 and the 'Tunisia' genome sequence XP_031382283.1 (Figure 3), reaching 90.14% and 92.05%, respectively, which indicates high sequence similarity. The Pg001327.1 gene was missing 36 nucleotides within positions 1404-1439 bp. Following the nucleotide positions of Pg001327.1 and XP_031382283.1, the deleted nucleotide sequence of Pg001327.1, 'GGTGAAATTTGTATTCGAGGCCAACAGATTATGAAAGG', was translated, and the resulting amino acid sequence was 'GEICIRGQQIMK', which contained Box II (GEICIRG). This indicated that the absence of Box II from the Pg001327.1 amino acid sequence was caused by genome sequencing and assembly errors. According to the sequencing results, the missing nucleotides were added to the genome sequence of Pg001327.1, and the corrected sequence was used for subsequent bioinformatics analysis.
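The translation step described above can be reproduced with a short script. This is a minimal sketch, not the authors' procedure: it assumes the standard genetic code and reading frame 0, and the names `translate`, `MISSING_FRAGMENT`, and `BOX_II` are introduced here only for illustration. The fragment as quoted in the text is 38 characters long, so the sketch translates complete codons only and ignores the two trailing bases.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons in frame 0; 1-2 trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Fragment quoted in the text as missing from the genome copy of Pg001327.1.
MISSING_FRAGMENT = "GGTGAAATTTGTATTCGAGGCCAACAGATTATGAAAGG"
BOX_II = "GEICIRG"  # conserved motif named in the text

peptide = translate(MISSING_FRAGMENT)
print(peptide)            # translated peptide
print(BOX_II in peptide)  # whether the restored fragment carries Box II
```

Under these assumptions the twelve complete codons give the peptide reported in the text, and the Box II motif sits at its start.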

Analysis of Physical and Chemical Properties
In total, 12 candidate Pg4CL genes were identified in the pomegranate genome and named Pg4CL1-Pg4CL12 in the order of their gene IDs (Table 1). Analysis of their physical and chemical properties showed that the coding sequences of the Pg4CLs ranged from 1629 to 2343 bp. The number of exons ranged from 4 (Pg4CL8) to 14 (Pg4CL1), and most members had 6 exons. The Pg4CL proteins contained 542 (Pg4CL6) to 780 (Pg4CL1) amino acids, and their isoelectric points (pI) ranged from 5.49 (Pg4CL8) to 8.91 (Pg4CL12). A total of 58.3% of the Pg4CL proteins had a pI greater than 7 and were therefore slightly alkaline. The instability index ranged from 26.44 (Pg4CL6) to 49.09 (Pg4CL4), and 42% of the Pg4CL proteins had good structural stability (an index below 40 indicates a stable protein). The grand average of hydropathicity (GRAVY) ranged from −0.057 (Pg4CL4) to 0.159 (Pg4CL3). A total of 66.7% of the Pg4CL proteins were hydrophobic (positive GRAVY values indicate hydrophobic proteins). None of the Pg4CL proteins contained signal peptides, and they were therefore classified as intracellular proteins. Subcellular localization prediction showed that most Pg4CL proteins (Pg4CL1, Pg4CL3, Pg4CL6, Pg4CL7, Pg4CL8, Pg4CL11, Pg4CL12) were localized to the plasma membrane, while the others were localized to the cytosol, extracellular space, or peroxisome.
Predicting the tertiary structure of proteins contributes to the study of their functions. AlphaFold produces a per-residue confidence score (pLDDT) between 0 and 100. The Pg4CL protein tertiary structure models had very high confidence and were highly similar to one another (Figure 4).
Figure 4. Tertiary structure of Pg4CL proteins. The tertiary structures were predicted with AlphaFold. Model confidence is indicated by color: blue, light blue, yellow, and red represent very high, confident, low, and very low confidence, respectively.
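To make the GRAVY and percentage figures above concrete, the following sketch shows how such values are typically computed. It assumes the Kyte-Doolittle hydropathy scale, which is the usual basis for GRAVY; the text does not state which tool the authors used, and the function names `gravy` and `share` are introduced here only for illustration. The example peptide is the Box II motif, not a full Pg4CL protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    protein = protein.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


def share(count: int, total: int) -> float:
    """Percentage of family members with a given property, to one decimal."""
    return round(100 * count / total, 1)


# Positive GRAVY means hydrophobic on average, negative means hydrophilic.
print(gravy("GEICIRG") > 0)

# With 12 family members, the percentages in the text correspond to
# whole-number counts: 7 of 12, 8 of 12, and 5 of 12.
print(share(7, 12), share(8, 12), share(5, 12))
```

Read this way, the reported 58.3%, 66.7%, and 42% correspond to 7, 8, and 5 of the 12 proteins, respectively.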

Phylogenetic Analysis and Classification of the Pg4CL Gene Family
To investigate the phylogenetic relationships of 4CL proteins, a phylogenetic tree was constructed using the neighbor-joining (NJ) method from a total of 38 4CL amino acid sequences from A. thaliana, E. grandis, and P. granatum (Figure 5). Pg4CL5 clustered with At4CL1, At4CL2, At4CL4, and Egr4CL1, indicating that Pg4CL5 belonged to Group 1; Pg4CL2 clustered with At4CL3 and Egr4CL2 and belonged to Group 2; the remaining 10 Pg4CLs clustered with the At4CL-like and Egr4CL-like sequences and belonged to Group 3. Functional studies of Group 1 and Group 2 genes in A. thaliana and E. grandis showed that they were involved in lignin synthesis and flavonoid synthesis, respectively. Therefore, it was speculated that Pg4CL5 and Pg4CL2 play pivotal roles in lignin and flavonoid biosynthesis, respectively, in pomegranate.
Figure 5. The phylogenetic tree of the 4CL gene family in A. thaliana, E. grandis, and P. granatum. The phylogenetic tree was constructed using the neighbor-joining method with 1000 bootstrap replications. The prefixes At, Egr, and Pg stand for A. thaliana, E. grandis, and P. granatum.
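For readers unfamiliar with neighbor-joining, the sketch below illustrates only its core selection step: choosing which pair of taxa to join first from a distance matrix, using the standard Q-criterion. The five-taxon matrix is a hypothetical toy example, not distances derived from the 4CL sequences, and the function name `nj_first_join` is an assumption made here. A full NJ run repeats this step on a shrinking matrix, and bootstrap support comes from rebuilding the tree on resampled alignment columns; neither is shown.

```python
from itertools import combinations


def nj_first_join(names, dist):
    """Return the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j),
    where r(i) is the sum of distances from taxon i to all other taxa."""
    n = len(names)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if q < best_q:
            best_pair, best_q = (names[i], names[j]), q
    return best_pair, best_q


# Hypothetical symmetric distance matrix for five toy taxa.
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, q_value = nj_first_join(names, dist)
print(pair, q_value)
```

Note that the pair with the smallest raw distance (d and e here) is not necessarily joined first, because the Q-criterion also corrects for how far each taxon is from all the others.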

Analysis of Conserved Motifs and Gene Structures of Pg4CL Gene Family
A total of 10 conserved motifs were identified and numbered motif 1 to motif 10 (Figure S1, Table 2). Pfam analysis showed that five of these motifs (motif 1, motif 2, motif 4, motif 5, and motif 6) encoded AMP-binding domains, of which motif 5 and motif 2 contained the conserved domains Box I (SSGTTGLPKGV) and Box II (GEICIRG), respectively. The conserved motif distribution of the Pg4CL proteins was drawn from the results of the motif analysis (Figure 6B). All Pg4CL amino acid sequences contained all 10 motifs, and members of the same subgroup had similar motif distributions. The gene structure was visualized with TBtools (Figure 6C). The Pg4CL genes were 2.1-12 kb long. Of the 12 Pg4CLs, 9 consisted of 6 exons and 5 introns. Pg4CL1 contained the most exons (14), resulting in a significantly longer gene than the other members. The Group 1 and Group 2 members (Pg4CL5 and Pg4CL2) had 6 exons; because of their longer introns, their genes were longer than those of the Group 3 members (except for Pg4CL1). The number of exons ranged from 4 (Pg4CL8) to 14 (Pg4CL1).
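A simple way to check whether a protein sequence carries the two conserved boxes named above is an exact substring search, sketched below. This is a deliberate simplification: the Pfam analysis mentioned in the text relies on profile-based matching rather than exact strings. Both toy sequences are hypothetical and were made up for this example; only the two box motifs come from the text.

```python
# Conserved motifs named in the text.
BOXES = {"Box I": "SSGTTGLPKGV", "Box II": "GEICIRG"}


def find_boxes(protein: str) -> dict:
    """Return the 0-based start of each box in the protein, or -1 if absent."""
    protein = protein.upper()
    return {name: protein.find(motif) for name, motif in BOXES.items()}


# Hypothetical toy sequences (not real Pg4CL proteins).
complete = "MKT" + "SSGTTGLPKGV" + "AVLTH" + "GEICIRGQQIMK"
lacking_box_ii = "MKT" + "SSGTTGLPKGV" + "AVLTH"

print(find_boxes(complete))
print(find_boxes(lacking_box_ii))
```

A value of -1 for Box II is the kind of result that would flag a sequence for follow-up, as happened with the uncorrected Pg4CL2 genome sequence.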

Cis-Acting Elements in the Promoter Region of Pg4CLs
Cis-acting elements are the binding sites of transcription factors and play a crucial role in regulating gene expression. Visualization of the identified cis-acting elements showed that all Pg4CLs contained at least one phytohormone response element (Figure 7A), including the abscisic acid response element (ABRE), auxin response elements (TGA-element, AuxRR-core, TGA-box), MeJA response elements (CGTCA-motif, TGACG-motif), gibberellin response elements (TATC-box, GARE-motif, P-box), and salicylic acid response elements (TCA-element). The TGA-element was an auxin-responsive element and the TGA-box was part of an auxin-responsive element. Among the 12 Pg4CLs, 8 contained MeJA response elements, 6 contained gibberellin response elements, 10 contained abscisic acid response elements, 9 contained salicylic acid response elements, and 7 contained auxin response elements. Low-temperature response elements (LTR) and drought response elements (MBS) were also present in the promoter regions of some Pg4CLs: 6 contained LTR and 6 contained MBS. These Pg4CLs may respond to biotic and abiotic stresses. In addition, the promoter region of Pg4CL6 contained an MYB binding site (MBSI) involved in the regulation of flavonoid biosynthesis genes. Some Pg4CL promoter regions also contained regulatory elements related to endosperm expression (GCN4_motif) and meristem expression (CAT-box) (Figure 7A). These Pg4CLs may be closely related to plant growth and development. In addition, G-box and G-Box were cis-acting regulatory elements involved in light responsiveness. Their transcription factor binding sites differed: the former was CACGTG and the latter was CACGTT. A total of 24 light-responsive elements appeared in this study (Figure 7B), and the promoter regions of all Pg4CLs contained a large number (5-17) of them, indicating that the Pg4CLs may be regulated by light.
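The distinction between the two light-responsive cores given above (CACGTG and CACGTT) can be illustrated with a small two-strand scan. This is a minimal sketch assuming exact matching of the core sequences quoted in the text; the text does not describe the prediction tool at this level, the promoter string is a hypothetical toy sequence, and the names `reverse_complement` and `find_sites` are introduced here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(promoter: str, core: str) -> list:
    """Return sorted (position, strand) hits for a core motif on both strands.
    Palindromic cores are searched once so that sites are not double-counted."""
    promoter = promoter.upper()
    patterns = {core: "+"}
    rc = reverse_complement(core)
    if rc != core:
        patterns[rc] = "-"
    hits = []
    for pattern, strand in patterns.items():
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlaps
    return sorted(hits)


# Hypothetical toy promoter fragment, not a real Pg4CL upstream sequence.
promoter = "TTCACGTGAAACGTGTTCACGTTGG"

print(find_sites(promoter, "CACGTG"))  # palindromic core
print(find_sites(promoter, "CACGTT"))  # non-palindromic core, both strands
```

Because CACGTG equals its own reverse complement, one scan covers both strands, whereas CACGTT must also be searched as AACGTG.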

Protein Interaction Networks of Pg4CL Gene Family
The protein interaction network prediction for the 12 Pg4CLs showed no direct interaction among the proteins, which indicated that they had no direct regulatory relationship and might function independently. The potential functions of the 4CL proteins Pg4CL5 and Pg4CL2 were predicted by protein interaction network analysis (Figure 8). As shown in Figure 8A,B, Pg4CL5 had high similarity to A. thaliana 4CL1 (E-value of 5.51 × 10^−234), and 4CL1 was co-expressed with lignin synthesis proteins (C4H, IRX4, CYP84A1). Pg4CL2 was similar to A. thaliana 4CL3 (E-value of 3.9 × 10^−212), and 4CL3 was co-expressed with flavonoid synthesis proteins (TT4, TT5, OMT1, F3H, and FLS1). In addition, 4CL1 and 4CL3 were co-expressed with proteins in other metabolite synthesis pathways (At1g80820, HCT, LysoPL2).

Expression Analysis of Pg4CL Genes with RNA-Seq
To investigate the expression patterns of Pg4CL genes in different pomegranate tissues, the expression of the Pg4CL family was analyzed using published transcriptome data from different tissues. As shown in Figure 9, Pg4CL genes were expressed in leaves, roots, flowers, seed coats, and pericarps, but expression levels differed significantly among subfamilies (Figure 9A). The class I 4CL gene (Pg4CL5) was highly expressed in different tissues, with the highest expression in roots and pericarps. The class II 4CL gene (Pg4CL2) was highly expressed in the leaves, flowers, and exocarps of 'Dabenzi' and in the pericarps of 'Tunisia' and 'Baiyushizi', but showed lower expression in the roots and pericarps of 'Dabenzi'. In addition, some of the 4CL-like genes (Pg4CL1, Pg4CL4, Pg4CL6, and Pg4CL11) were highly expressed in various tissues, whereas Pg4CL8, Pg4CL9, and Pg4CL10 showed little or no expression in any tissue, and the expression levels of the 4CL-like genes differed significantly among tissues. These results suggest that the 4CL-like genes may be functionally differentiated.
Based on the differences in expression patterns, the Pg4CL family was subjected to cluster analysis (Figure 9B) and classified into three groups: A, B, and C. The expression level of group A genes was significantly higher than that of groups B and C, and these genes were highly expressed in various tissues. The genes in group B showed little or no expression in all tissues, suggesting that they may have lost some of their functions during evolution. Genes in group C were expressed in leaves, flowers, and fruits, and the expression of individual genes showed some tissue specificity. The cluster analysis showed that genes from different subfamilies might have similar expression patterns; for example, the class I gene Pg4CL5 and the 4CL-like genes Pg4CL1, Pg4CL4, Pg4CL6, and Pg4CL11 were all highly expressed in pomegranate roots, leaves, flowers, seed coats, and pericarps.

qRT-PCR Analysis of Pg4CLs during Fruit Development in Pomegranate
To explore the putative functions of Pg4CLs in fruit development, the expression patterns of Pg4CLs in pomegranate pericarp were analyzed by qRT-PCR (Figure 10). The expression level of Pg4CL1 increased across the S1-S4 stages and peaked at the S4 stage. As the fruit ripened, the flavonoid content gradually decreased, and the expression level of Pg4CL2 followed a similar trend, indicating that Pg4CL2 may be involved in the synthesis of flavonoids. The expression level of Pg4CL4 was higher in the later stage of fruit development, suggesting that Pg4CL4 may play a crucial role in fruit ripening. The expression of Pg4CL5 was higher at the S5 stage. In addition, Pg4CL3, Pg4CL7, Pg4CL8, and Pg4CL10 showed little or no expression in S2-S7, and their expression levels at the S1 stage were significantly higher than those in S2-S7. The expression levels of the 4CL-like genes differed significantly among fruit development stages.
Figure 10. Relative expression levels of 12 Pg4CLs in the pericarp during fruit development. Note: the seven stages of fruit development were S1, S2, S3, S4, S5, S6, and S7, and the specific dates of the seven stages were 15 July, 28 July, 10 August, 23 August, 5 September, 18 September, and 1 October, respectively. Vertical bars represent the standard error. Bars with different letters (a-d) indicate significant differences at p < 0.05 according to Duncan's test.
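One way to put a number on the statement that Pg4CL2 expression and flavonoid content follow a similar trend is a Pearson correlation across the seven stages. The sketch below is illustrative only: the text reports no correlation coefficient, and both value lists are hypothetical placeholders rather than measurements from Figure 10. The function name `pearson` is introduced here for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values for stages S1-S7 (NOT data from the study).
relative_expression = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
flavonoid_content = [10.0, 8.5, 6.0, 5.5, 4.0, 3.5, 2.0]

r = pearson(relative_expression, flavonoid_content)
print(round(r, 3))
```

A coefficient close to 1 would support the "similar trend" reading, although with only seven time points it cannot by itself establish a functional link.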

Discussion
In higher plants, 4CLs are encoded by multigene families, and the number of members varies among species. In recent years, the identification and functional analysis of 4CL gene families have been reported for many species, including 13 members in A. thaliana [7], 13 in E. grandis [10], 12 in P. bretschneideri [11], and 14 in O. sativa [8]. In this study, identification of the 4CL gene family members in the 'Taishanhong' genome revealed that A. thaliana At4CL3 had the highest amino acid sequence similarity to Pg001327.1. However, alignment with the At4CL3 homolog in the pomegranate 'Tunisia' genome and with homologous protein sequences from other species showed that Pg001327.1 was missing Box II, a conserved domain of the 4CL gene family. It was therefore speculated that genome sequencing and splicing errors had caused the loss of Box II from the Pg001327.1 amino acid sequence. Cloning of the Pg001327.1 gene verified this conjecture, and the genome sequence was corrected. A total of 12 Pg4CLs were identified in the genome of 'Taishanhong'. Combined with the phylogenetic tree of the 4CL amino acid sequences of A. thaliana and E. grandis, the number of class I 4CL genes was found to differ from the 2-4 reported in dicot plants [13]: only one class I gene was present in pomegranate and in E. grandis [10]. Based on clustering with the model plant A. thaliana and the functions of At4CL1, At4CL2, and At4CL5 [3,4], the homologous gene Pg4CL5 may be related to lignin synthesis. Pomegranate, E. grandis, and A. thaliana each contained only one class II 4CL gene, and the pomegranate homolog Pg4CL2 was presumed to be involved in flavonoid synthesis based on the function of At4CL3 [3,4]. In addition, pomegranate contained 10 4CL-like genes. Although the functions of these 4CL-like genes are unknown, studies in other species suggest that both 4CL and 4CL-like genes can regulate various physiological functions of plants and improve their ability to adapt to the environment [17].
Analysis of the conserved motifs showed that all Pg4CL amino acid sequences contained 10 motifs and that members of the same subgroup had similar motif distributions. Gene structure analysis showed that Pg4CL genes contained 4-14 exons, with 6 exons in the class I and class II 4CL genes, which was consistent with the number of exons in A. thaliana (3-6) and Gossypium hirsutum (4-6) [9,27], indicating that the structures of class I and class II 4CL genes are relatively conserved. However, the number of exons in 4CL-like genes varied greatly among species, such as 1-5 in A. thaliana and 1-11 in Populus pruinosa [27]. This study showed that the Pg4CLs were conserved in both gene structure and protein motifs.
Cis-acting elements occur in gene promoters and bind specifically to transcription factors, playing a principal role in regulating the expression of target genes [28,29]. The promoter regions of the pomegranate Pg4CLs were enriched in a variety of cis-acting elements related to hormone responses and abiotic stresses. This result was broadly consistent with findings in pear and G. hirsutum [9,30], indicating that the regulatory elements of 4CL gene promoters are somewhat conserved among species. In addition, 66.7% of the Pg4CL promoter regions contained MYB transcription factor binding sites: 6 Pg4CLs contained MYB binding sites (MBS) involved in drought induction, 4 Pg4CLs contained MBS involved in light responsiveness, and the promoter region of Pg4CL6 contained an MBS involved in the regulation of flavonoid biosynthesis genes. These three types of MBS were also found in the 4CL gene family of longan [31]. All Pg4CLs contained at least one phytohormone response element, such as the TGA-element, TATC-box, TGACG-motif, or TCA-element, which indicated that Pg4CLs might respond to auxin, gibberellin, jasmonic acid, salicylic acid, and other hormones involved in the regulation of plant growth, development, and stress responses. In addition, the involvement of 4CLs in stress resistance has been verified in plants such as Fraxinus mandshurica, G. hirsutum, and poplar [9,17,27].
The protein interaction network prediction for the 12 Pg4CLs revealed no interactions among them, indicating that there is no direct regulatory relationship between these proteins [13]. The interaction network analysis of Pg4CL5 and Pg4CL2 showed that Pg4CL5 was co-expressed with components of lignin biosynthesis; together with the high expression of Pg4CL5 in different tissues, with the highest levels in roots and pericarps, this indicates that Pg4CL5 may be associated with lignin biosynthesis in various tissues. Pg4CL5 has been cloned and validated in pomegranate 'Baiyushizi', and that conclusion is consistent with the present study [32]. The class II 4CL gene Pg4CL2 was co-expressed with components of flavonoid synthesis, and its high expression in flowers, seed coats, and fruit may be associated with flavonoid synthesis and pigment accumulation [33]. In addition, Pg4CL2 was highly expressed in the pericarps of 'Tunisia' and 'Baiyushizi' but showed lower expression in 'Dabenzi', which may be due to genotype differences in gene expression patterns and may also be related to fruit characteristics [34]. We found that the expression of Pg4CL2 decreased gradually in developing fruits, similar to the trend in flavonoid content, indicating that Pg4CL2 may be involved in the synthesis of flavonoids. This is consistent with the results in Pyrus bretschneideri [30] and Morus alba [35]. The function of 4CL-like genes is still largely unknown, although both 4CL and 4CL-like genes have been reported to regulate various physiological functions of plants. Analysis of the expression of the Pg4CL family based on published transcriptome data from different pomegranate tissues showed that some 4CL-like genes (Pg4CL1, Pg4CL6, Pg4CL11) were highly expressed, whereas others (Pg4CL8, Pg4CL9, Pg4CL10) showed low or no expression in the various tissues.
In addition, the expression of Pg4CLs in developing pomegranate fruits was analyzed by quantitative real-time PCR (qRT-PCR). Pg4CL1 showed an increasing trend across the S1-S4 stages and reached its highest expression level at the S4 stage, which is similar to the results in the Chinese pear [30]. Pg4CL3, Pg4CL7, Pg4CL8, and Pg4CL10 were significantly increased, suggesting that these genes might play important roles in the early stage of fruit development [30]. The expression levels of the 4CL-like genes differed among the stages of fruit development, which indicates that the functions of the 4CL-like genes may have diverged.

Gene Cloning of Pg001327.1
Total RNA was extracted using the BioTeke plant total RNA extraction kit (centrifugal column type) (BioTeke Corporation Co., Ltd., Beijing, China), and cDNA was obtained with a reverse transcription kit (PrimeScript™ RT reagent Kit with gDNA Eraser, TaKaRa Biomedical Technology Co., Ltd., Tokyo, Japan). Oligo 7 software (Cascade, CA, USA) was used to design cloning primers. Using the cDNA as a template, the upstream and downstream primers (F: 5′-ATGATATCTGTTGCCCCTTCT-3′, R: 5′-TTATAAGGGAGTGGAGGAGG-3′) were used for PCR amplification. The PCR reaction system was 50 µL, including 25 µL 2× Rapid Taq Master Mix (Vazyme Biotech Co., Ltd., Nanjing, China), 2 µL upstream and downstream primers, 2 µL cDNA template, and 21 µL ddH2O. The PCRs were performed on the Applied Biosystems 7500 (Thermo Fisher Scientific Inc., Waltham, MA, USA). The amplification program was 95 °C for 3 min; 35 cycles of 95 °C for 15 s, 58 °C for 45 s, and 72 °C for 1 min; and a final extension at 72 °C for 5 min. The similarity between the sequencing results and the sequences Pg001327.1 and XP_031382283.1 was analyzed using BioXM 2.6 software, and sequence conservation was analyzed using Jalview software (Dundee, UK) after multiple sequence alignment with MAFFT [39].
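The cloned sequence can be screened for the two signature motifs reported in this study, Box I (SSGTTGLPKGV) and Box II (GEICIRG). The following is a minimal sketch of such a check, not the procedure used here (which relied on BioXM and Jalview). It assumes exact string matching, and the two toy sequences are hypothetical placeholders rather than real Pg4CL2 data.

```python
# Conserved 4CL signature motifs as reported in this study.
BOX_I = "SSGTTGLPKGV"
BOX_II = "GEICIRG"


def find_motif(protein: str, motif: str) -> int:
    """Return the 1-based start of the first exact motif match, or 0 if absent."""
    return protein.upper().find(motif) + 1


def box_report(protein: str) -> dict:
    """Report where Box I and Box II start in a protein sequence (0 = not found)."""
    return {"Box I": find_motif(protein, BOX_I), "Box II": find_motif(protein, BOX_II)}


# Hypothetical toy sequences (NOT real Pg4CL2 data):
# one carrying both boxes, one lacking Box II as in the uncorrected gene model.
complete = "MAAA" + BOX_I + "LLLL" + BOX_II + "KK"
truncated = "MAAA" + BOX_I + "LLLLKK"

print(box_report(complete))
print(box_report(truncated))
```

Exact matching is the simplest choice and is adequate for class I and II members, where both boxes are highly conserved. It would miss the variant boxes of 4CL-like members, for which an alignment-based comparison is more appropriate.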

Multiple Sequence Alignment and Phylogenetic Analysis
Multiple sequence alignment was performed with MAFFT [39] using a total of 38 full-length protein sequences from Arabidopsis thaliana (At, 13 sequences), Punica granatum (Pg, 12 sequences), and Eucalyptus grandis (Egr, 13 sequences). To investigate the phylogenetic relationships of the 4CL proteins, MEGA7.0 software [44] was used to construct a phylogenetic tree by the neighbor-joining (NJ) method with 1000 bootstrap replicates; all other parameters were left at their defaults. The online software EvolView (https://evolgenius.info/ (accessed on 11 February 2022)) was used to visualize and annotate the evolutionary tree [45]. Pg4CLs were then classified according to the 4CL classification of Arabidopsis and E. grandis.
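The idea behind the bootstrap replicates can be illustrated with a short sketch. It resamples alignment columns with replacement and counts how often two sequences remain each other's closest relative under a simple p-distance. This is a simplified stand-in for illustration only: it is not a neighbor-joining implementation, it does not reproduce MEGA's distance model, and the three toy sequences and their names are hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_columns(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the alignment length."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def closest_pair(alignment: dict) -> tuple:
    """Return the pair of sequence names with the smallest p-distance."""
    names = sorted(alignment)
    scored = [(p_distance(alignment[a], alignment[b]), a, b)
              for i, a in enumerate(names) for b in names[i + 1:]]
    _, a, b = min(scored)
    return (a, b)


# Hypothetical, already aligned toy sequences (not real 4CL proteins).
toy = {
    "seqA": "MKTLLVAGSS",
    "seqB": "MKTLLVAGST",
    "seqC": "MRSIIVCGAA",
}

rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = 1000
support = sum(
    closest_pair(bootstrap_columns(toy, rng)) == ("seqA", "seqB")
    for _ in range(replicates)
)
print(f"seqA/seqB grouped in {support} of {replicates} replicates")
```

In a real analysis, a full tree is rebuilt from each resampled alignment, and the support value of a clade is the percentage of replicate trees that contain it.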

Analysis of Gene Structure and Protein Conserved Motifs
According to the obtained pomegranate 4CL protein sequences and gene sequences (including introns, exons, and upstream and downstream sequences), the online software GSDS 2.0 (http://gsds.gao-lab.org/ (accessed on 15 February 2022)) was used to analyze their gene structures [46]. The conserved protein motifs of Pg4CLs were analyzed online with MEME (http://meme-suite.org/tools/meme (accessed on 15 February 2022)) with the maximum number of motifs set to 10 [47]. The phylogenetic tree of Pg4CLs, the obtained motifs, and the gene structures were visualized using TBtools (Guangdong, China) [48].

Analysis of Cis-Acting Elements and Protein-Protein Interaction Networks
The promoter region, which contains the conserved sequences required for specific binding of RNA polymerase and transcription initiation, is generally located within 1500-2000 bp upstream of the gene. The 2000 bp sequences upstream of the Pg4CL genes were extracted from the pomegranate genome database using TBtools [48]. The promoter characteristics and cis-acting elements of Pg4CLs were analyzed with the online tool PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/ (accessed on 15 February 2022)) [49], and the predicted results were sorted and simplified. Finally, TBtools was used to visualize the cis-acting elements. The co-expression networks of the Pg4CLs, Pg4CL5, and Pg4CL2 in pomegranate were analyzed using STRING (https://string-db.org (accessed on 15 February 2022)) [50], with A. thaliana selected as the model species.
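Extracting a fixed-length upstream region requires attention to strand: for a gene on the minus strand, "upstream" lies past the gene's end coordinate and must be reverse-complemented. The sketch below shows this logic under stated assumptions; it is not the TBtools implementation. The function names, the 1-based inclusive (GFF-style) coordinates, and the toy chromosome are all illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (N is kept as N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq: str, gene_start: int, gene_end: int,
                    strand: str, length: int = 2000) -> str:
    """Return up to `length` bp upstream of a gene, read in the gene's orientation.

    gene_start and gene_end are 1-based inclusive coordinates with
    gene_start <= gene_end (an assumed GFF-style convention). The region is
    truncated if the gene lies closer than `length` bp to the chromosome end.
    """
    if strand == "+":
        begin = max(0, gene_start - 1 - length)
        return chrom_seq[begin:gene_start - 1]
    if strand == "-":
        end = min(len(chrom_seq), gene_end + length)
        return reverse_complement(chrom_seq[gene_end:end])
    raise ValueError("strand must be '+' or '-'")


# Hypothetical 16 bp toy chromosome with a gene at positions 5-8.
chrom = "AACCGGTTACGTACGT"
print(upstream_region(chrom, 5, 8, "+", length=3))
print(upstream_region(chrom, 5, 8, "-", length=3))
```

With the default `length=2000`, each returned string corresponds to the kind of 2000 bp promoter sequence that is submitted to PlantCARE.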

Expression Patterns by Quantitative Real-Time PCR (qRT-PCR)
We collected samples at seven stages of fruit development (S1, S2, S3, S4, S5, S6, and S7) in the pomegranate orchard of Baimashi Village, Tai'an City, Shandong Province. The specific dates of the seven stages were 15 July, 28 July, 10 August, 23 August, 5 September, 18 September, and 1 October. RNA was extracted from the exocarp of the samples, and cDNA was then synthesized. Specific primers for Pg4CL2 and Pg4CL5 were designed (Table 4). The pomegranate PgActin gene was used as the internal reference gene. There were three biological replicates for each treatment. The qRT-PCR reaction system was 20 µL, including 10 µL Hieff® qPCR SYBR Green Master Mix (Low Rox Plus) (Yeasen Biotechnology Co., Ltd., Shanghai, China), 0.4 µL each of the upstream and downstream primers, 2 µL cDNA template, and 7.2 µL ddH2O. The amplification program was 95 °C for 5 min, followed by 40 cycles of 95 °C for 15 s and 58 °C for 45 s. The relative expression level was calculated by the 2^−ΔΔCT method. The data were analyzed and plotted using SPSS 23.0 (Santa Clara, CA, USA) and Origin 2018 software (Newton, MA, USA), respectively.
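The 2^−ΔΔCT calculation can be sketched as follows. The Ct values are invented for illustration, and the use of stage S1 as the calibrator is an assumption, since the calibrator sample is not stated here. Replicate Ct values are averaged before ΔCt is computed, which is one common convention.

```python
from statistics import mean


def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal) -> float:
    """Relative expression by the 2^-ddCt method.

    ct_target / ct_ref: replicate Ct values of the target and reference gene
    in the sample of interest.
    ct_target_cal / ct_ref_cal: the same for the calibrator sample.
    """
    delta_sample = mean(ct_target) - mean(ct_ref)             # dCt of the sample
    delta_calibrator = mean(ct_target_cal) - mean(ct_ref_cal)  # dCt of the calibrator
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values (three replicates each); S1 is the assumed calibrator.
s1_target, s1_actin = [24.0, 24.2, 23.8], [18.0, 18.1, 17.9]
s4_target, s4_actin = [22.0, 22.1, 21.9], [18.0, 17.9, 18.1]

fold = relative_expression(s4_target, s4_actin, s1_target, s1_actin)
print(f"Target expression at S4 relative to S1: {fold:.2f}-fold")
```

In this toy data the target Ct drops by 2 cycles while the reference stays constant, so ΔΔCT is −2 and the relative expression is about 4-fold. The method assumes that the target and reference primers amplify with roughly equal, near-100% efficiency.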

Conclusions
In this study, 12 Pg4CLs were identified in the pomegranate genome, all of which contained AMP-binding domains. According to the established 4CL classification of A. thaliana and E. grandis, Pg4CLs were divided into three subfamilies (I, II, and 4CL-like) with 1, 1, and 10 members, respectively. Box I and Box II were highly conserved in class I and II members and relatively conserved in 4CL-like members. Among them, the class I 4CL gene Pg4CL5 may be involved in lignin synthesis, and the class II gene Pg4CL2 was related to flavonoid synthesis. The results of gene cloning corrected the sequence information of Pg4CL2 in the 'Taishanhong' genome and laid a foundation for its functional research.