Phenylalanine Ammonia-Lyase (PAL) Genes Family in Wheat (Triticum aestivum L.): Genome-Wide Characterization and Expression Profiling

Phenylalanine ammonia-lyase (PAL) is the first enzyme in the phenylpropanoid pathway and plays a vital role in adaptation, growth, and development in plants, but its characterization in wheat is still unclear. Here, we report a genome-wide identification of TaPAL genes and an analysis of their transcriptional expression, duplication, and phylogeny in wheat. A total of 37 TaPAL genes that cluster into three subfamilies were identified based on phylogenetic analysis. These TaPAL genes are distributed on chromosomes 1A, 1B, 1D, 2A, 2B, 2D, 4A, 5B, 6A, 6B, and 6D. Gene structure, conserved domain analysis, and investigation of cis-regulatory elements were systematically carried out. Chromosomal rearrangements and gene loss were observed by evolutionary analysis of the orthologs among Triticum urartu, Aegilops tauschii, and Triticum aestivum during the origin of bread wheat. Gene ontology analysis revealed that PAL genes play a role in plant growth. We also identified 27 putative miRNAs targeting 37 TaPAL genes. High expression of PAL genes was detected in roots of drought-tolerant genotypes compared to drought-sensitive genotypes. However, very low expression of TaPAL10, TaPAL30, TaPAL32, TaPAL3, and TaPAL28 was recorded in all wheat genotypes. Arogenate dehydratase interacts with TaPAL29 and has higher expression in roots. The analysis of all identified genes in RNA-seq data showed that they are expressed in roots and shoots under normal conditions and under abiotic stress. Our study offers valuable data on the functioning of PAL genes in wheat.


Introduction
Phenylalanine ammonia-lyase (PAL) produces precursors of various secondary metabolites, including lignin, phytoalexins, and phenolic compounds. The PAL gene family encodes the first enzyme of the phenylpropanoid pathway [1][2][3]. PAL enzymes have a molecular mass in the range of 270-330 kilodalton (kDa) and are present in higher plants, yeast, some bacteria, and fungi. However, these genes are not found in animals, which instead have histidine ammonia-lyase (HAL) [4]. The PAL family contributes to the synthesis of a variety of protective compounds such as components of the cell wall, flavonoids, phytoalexins, and furanocoumarins [5,6]. The conversion of L-phenylalanine to cinnamic acid, catalyzed by the PAL enzymes, links primary metabolism with secondary metabolism and is a rate-limiting step in phenylpropanoid metabolism [1]. This metabolic pathway is involved in the production of various natural products (phytoalexins, hydroxycinnamic acids, flavonoids, etc.) and is also reported to play a role in the synthesis of phenolic glycosides and benzene compounds, which are part of several enzyme-regulated reactions [1,2,7-10]. Thus, phenylpropanoids play a critical role in the growth, development, and survival of vascular plants [1]. PAL activity is induced dramatically in response to various stimuli such as tissue wounding, pathogenic attack, light, low temperature, and hormonal triggers [5,11].
The first plant PAL to be characterized in crystal form was from Petroselinum crispum [12]. The PAL-encoding genes typically occur as small gene families comprising one to five members [13,14]. During the evolution of higher plants, PAL diversified into different functions. HAL and PAL differ in primary protein sequence, but they perform similar functions in vivo. PAL is thought to have evolved from HAL when fungi and plants separated from the other kingdoms [15,16]. Two mechanisms of evolution have been reported: horizontal gene transfer (HGT) and gene duplication. Studies showed that gene duplication is the major mechanism, and gymnosperms are thought to be the ancestors of angiosperms [16,17]. For instance, four PAL gene family members in Arabidopsis thaliana [18,19], five in Populus trichocarpa [20], three in Scutellaria baicalensis [21], and three in Coffea canephora [22] have been recognized and functionally described. Moreover, five separate PAL genes were recognized in Pinus taeda [23]. Nevertheless, some studies have indicated more than five PAL genes in certain plants: as many as thirteen PAL genes were discovered in Cucumis sativus [24], twelve in Citrullus lanatus [24], thirteen in Cucumis melo, and sixteen in Vitis vinifera [25].
Wheat (Triticum aestivum) is an important source of starch, protein, and minerals in the diet of more than 35% of the world's population. It is grown on a variety of soils and in a range of environmental conditions [26]. To withstand environmental stresses, the wheat plant has evolved multiple protection systems [27]. Previous studies showed the involvement of the PAL gene family in coping with environmental stresses by activating transcriptional processes. The PAL gene family contributes to the adaptation and resistance of plants to unfavorable biotic and abiotic environmental conditions. It also controls the expression and inhibition of genes to amend different biochemical pathways. Our research explored, in a theoretical way, the functional characterization and differential expression of the PAL gene family engaged in the root development of six different wheat genotypes. This study is important for understanding the stress tolerance mechanisms in wheat and the role of the PAL gene family in them.

Retrieval of Protein Sequences Containing the PAL Gene Family in Triticum aestivum
Two methods were applied to retrieve the phenylalanine ammonia-lyase (PAL) domain-containing sequences in wheat. In the first method, PAL gene family members in Triticum aestivum were searched by entering the keywords "Phenyl ammonium lyase (PAL)" in the Ensembl Plants database (http://plants.ensembl.org/Triticum_aestivum/, accessed on 16 March 2021) [28]. In the second method, the search for wheat PAL genes was conducted using Arabidopsis thaliana PAL genes (At3g53260, At2g37040, At3g10340, and At5g04230) as queries in BLASTP [29] against the wheat protein database of the International Wheat Genome Sequencing Consortium (IWGSC) (V2.1) and Triticum aestivum chromosome 3B RELEASE 1.0 (http://wheat-urgi.versailles.inra.fr/, accessed on 16 April 2021). The wheat PAL gene family was identified based on more than 75% sequence identity and an E-value ≤ 1e-10. Unique, non-redundant wheat PAL gene family members were obtained by performing multiple sequence alignments with the ClustalW tool [30] and removing redundant gene sequences. Further, the Pfam [31] and SMART (http://smart.embl-heidelberg.de/, accessed on 18 May 2021) databases [32] were used for the identification and confirmation of PAL-conserved domains.
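The identity and E-value cut-offs described above can be illustrated with a minimal Python sketch. It assumes the BLASTP results were exported in the 12-column BLAST+ tabular format (-outfmt 6), which the text does not state; the function name, the subject IDs, and all numbers in the example table are hypothetical and serve only to show the filtering logic.

```python
import csv
import io

MIN_IDENTITY = 75.0  # percent; the text states "more than 75% sequence identity"
MAX_EVALUE = 1e-10   # the text states "E-value <= 1e-10"


def filter_blast_hits(handle, min_identity=MIN_IDENTITY, max_evalue=MAX_EVALUE):
    """Return the set of subject IDs with at least one hit passing both cut-offs.

    Assumes 12-column BLAST+ tabular output (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = set()
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, truncated, or comment lines
        subject_id = row[1]
        identity = float(row[2])
        evalue = float(row[10])
        if identity > min_identity and evalue <= max_evalue:
            kept.add(subject_id)  # a set keeps each subject ID only once
    return kept


# Hypothetical hits; subject IDs and all numbers are invented for illustration.
EXAMPLE_ROWS = [
    ["AT2G37040.1", "wheat_hit_A", "82.5", "700", "120", "2", "1", "700", "5", "704", "0.0", "1150"],
    ["AT2G37040.1", "wheat_hit_B", "61.0", "690", "260", "5", "1", "690", "3", "692", "1e-150", "800"],
    ["AT3G53260.1", "wheat_hit_A", "80.1", "705", "138", "2", "1", "705", "5", "709", "0.0", "1100"],
    ["AT3G53260.1", "wheat_hit_C", "78.0", "90", "20", "0", "10", "99", "12", "101", "2e-05", "60"],
]
EXAMPLE_TABLE = "\n".join("\t".join(row) for row in EXAMPLE_ROWS)

candidate_ids = filter_blast_hits(io.StringIO(EXAMPLE_TABLE))
print(sorted(candidate_ids))
```

In this toy table only wheat_hit_A passes both thresholds; wheat_hit_B fails on identity and wheat_hit_C fails on E-value. Collapsing hits into a set removes repeated subject IDs only; it is not a substitute for the alignment-based redundancy check described in the text.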

Gene Structure and Conserved Domain Analysis of TaPAL Genes
The online Gene Structure Display Server GSDS 2.0 (http://gsds.gao-lab.org/, accessed on 20 May 2021) [33] was used to examine the gene structure by comparing the open reading frame (ORF) sequences with the corresponding genomic sequences. The conserved motifs of TaPAL proteins were determined with the MEME Suite (Multiple EM for Motif Elicitation) Version 4.12.0 (http://meme-suite.org/tools/meme, accessed on 25 May 2021) [34] using the following parameters: the number of motifs to be found was ten, the motif width was kept between 10 and 200, and the site distribution was set at zero or one occurrence per sequence (thus each sequence was allowed to contain at most one occurrence of each motif). The chromosomal locations of the genes were drawn on the respective chromosomes. The molecular weight (g/mol), isoelectric point, protein charge, and subcellular location were retrieved from UniProt (https://www.uniprot.org, accessed on 28 May 2021) [35]. The conserved domains of TaPAL proteins were examined using the Unipro UGENE software package [36], which aligned the sequences with the ClustalW algorithm and displayed conservation as color patterns differentiating each amino acid based on physicochemical properties. Protein domain analysis was also performed by using TaPAL1

Phylogenetic Identification
To retrieve protein sequences containing the PAL domain from related species, the 37 wheat protein sequences were used as queries in BLASTP against Triticum urartu, Solanum tuberosum, and Hordeum vulgare. The protein sequences with more than 70% sequence identity were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/, accessed on 16 June 2021). The PAL genes of Arabidopsis thaliana, Zea mays, and Oryza sativa were retrieved from Ensembl (http://plants.ensembl.org, accessed on 18 June 2021). Molecular Evolutionary Genetics Analysis (MEGA version X) [38] was used to infer the evolutionary history of TaPAL by the maximum-likelihood (ML) method with 1000 bootstrap replicates. Gene duplication was analyzed using MCScanX in TBtools [39].

Synteny Analysis
Sequence identity visualization and synteny analysis of the PAL family genes were performed using TBtools [40]. These analyses were used to study the sequence similarity patterns [41].

miRNA Prediction in Wheat PAL Family Genes
miRNA prediction was carried out as previously described [42]. In detail, the genomic sequences of all TaPAL genes were searched against the available reference miRNA sequences using the psRNATarget server (https://www.zhaolab.org/psRNATarget/, accessed on 14 September 2021) with default settings [43]. The interactions were visualized with Cytoscape software (https://cytoscape.org/, accessed on 14 September 2021) using default settings [44].

Promoter and Gene Ontology (GO) Enrichment Analysis
The 1 kb nucleotide sequence upstream of the start codon was retrieved for promoter analysis of all 37 TaPAL genes from the Ensembl Plants database (http://plants.ensembl.org/Triticum_aestivum/, accessed on 19 June 2021). These sequences were then searched for previously defined motifs using the PLACE cis-regulatory element database [2,45]. This database was also used to obtain five cis-regulatory elements (CACTFTPPCA1, CATTBOX1, ARR1AT, CGCGBOXAT, and WBOXNTERF) and their locations. GO analysis of TaPAL protein sequences was conducted by using online
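Retrieving the 1 kb region upstream of the start codon is strand-dependent, which the following minimal Python sketch makes explicit. It is not the Ensembl Plants export used in the study; the function names, the coordinate convention (1-based, inclusive, on the forward strand), and the toy sequences are assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=1000):
    """Return up to `length` bases immediately upstream of the start codon.

    cds_start and cds_end are 1-based inclusive forward-strand coordinates of
    the coding region (hypothetical convention for this sketch). On the '+'
    strand the start codon begins at cds_start; on the '-' strand it begins at
    cds_end, and the upstream region is returned in the gene's own orientation.
    The region is shortened if it would run off the end of the sequence.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1].upper()
    if strand == "-":
        end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[cds_end:end])
    raise ValueError("strand must be '+' or '-'")


# Toy sequences: the same gene placed on the forward and on the reverse strand.
forward_chrom = "TTTTGACGATGCCC"   # ATG start codon begins at position 9
reverse_chrom = "GGGCATCGTCAAAA"   # reverse complement of forward_chrom

print(upstream_region(forward_chrom, 9, 14, "+", length=4))
print(upstream_region(reverse_chrom, 1, 6, "-", length=4))
```

Both calls return the same four bases, because the second sequence is the reverse complement of the first; with the default length of 1000 the same function would return the 1 kb promoter window.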

Protein-Protein Interaction
Protein-protein interactions in wheat were analyzed using the STRING online server (http://string.embl.de, accessed on 14 September 2021) with default settings [47].

Analysis of RNA-Seq-Based Expression Profiling
Six different wheat varieties (Table 1) were used for RNA-seq-based expression profiling of the TaPAL genes. All the wheat varieties were grown under normal conditions at the National Agricultural Research Centre (NARC), Islamabad, Pakistan. The root samples (each data point pooled from eight plants) were collected as previously described [48,49] from 35-day-old seedlings of all six wheat varieties and were frozen in liquid nitrogen prior to storage at −80 °C until use. Total RNA was isolated from these samples using the GeneJET™ Plant RNA Purification Mini Kit (Catalog # K0801). The Illumina HiSeq2500 platform was used for paired-end (PE) sequencing of the wheat RNA samples. The quality of the raw data was checked with FastQC (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/, accessed on 25 June 2021). Reads were trimmed (quality scores < 20) with the Trimmomatic tool [50]. The HISAT2 (version 2.0.5) (http://ccb.jhu.edu/software/hisat2/faq.shtml, accessed on 28 June 2021) tool with default settings [51] was used to construct a transcriptome map based on the wheat reference genome (ftp://ftp.ensemblgenomes.org/pub/release-25/plants/fasta/triticum_aestivum/dna/, accessed on 4 July 2021). The transcripts were assembled with StringTie software [52], while the NOIseq package [53] was used to determine the expression levels of genes and transcripts as FPKM (fragments per kilobase of transcript per million mapped fragments) and to plot the gene expression graphs. Genes with FPKM values greater than one were retained for subsequent analyses. The expression levels were also analyzed at different stages in root and shoot tissues in response to abiotic stress (drought, heat, and a combination of both; Supplementary Sheet S1). These RNA-seq data were retrieved as transcripts per million (TPM) from the expVIP [54] wheat expression browser (http://www.wheat-expression.com/, accessed on 6 August 2021).
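For readers unfamiliar with the FPKM unit used for the filtering step above, the following Python sketch writes out the standard FPKM formula and the "greater than one" filter. It does not reproduce the NOIseq workflow of the study; the function names, gene labels, fragment counts, and lengths are hypothetical.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def expressed_genes(counts, lengths, total_mapped_fragments, threshold=1.0):
    """Return {gene: FPKM} for genes whose FPKM is greater than `threshold`."""
    result = {}
    for gene, fragments in counts.items():
        value = fpkm(fragments, lengths[gene], total_mapped_fragments)
        if value > threshold:
            result[gene] = value
    return result


# Hypothetical library of 25 million mapped fragments and two 2000 bp genes.
example_counts = {"gene_x": 500, "gene_y": 20}
example_lengths = {"gene_x": 2000, "gene_y": 2000}
retained = expressed_genes(example_counts, example_lengths, 25_000_000)
print(retained)
```

With these invented numbers gene_x has an FPKM of 10 and is retained, whereas gene_y has an FPKM of 0.4 and is dropped by the threshold of one.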
To check the expression pattern of a given PAL gene under abiotic stress, the ratio of its expression under treatment to its expression in the control was calculated (ratio ≥ 1 = altered under stress; ratio ≤ 1 = un-altered under stress). Finally, a heatmap was constructed with the R package pheatmap (version 1.7) [55].
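The treatment-to-control ratio described above can be sketched in a few lines of Python. The gene labels and TPM values are hypothetical, and returning None when the control value is zero is an assumption of this sketch, since the text does not say how such cases were handled. The sketch only computes the ratio; comparing it with the threshold of 1 is left as stated in the text.

```python
def stress_ratio(treatment_tpm, control_tpm):
    """Ratio of expression under treatment to expression in the control.

    Returns None when the control value is zero, because the ratio is
    undefined in that case (an assumption of this sketch).
    """
    if control_tpm == 0:
        return None
    return treatment_tpm / control_tpm


# Hypothetical (treatment, control) TPM values for three genes.
example_tpm = {
    "gene_a": (12.0, 4.0),
    "gene_b": (3.0, 6.0),
    "gene_c": (1.0, 0.0),
}
ratios = {gene: stress_ratio(t, c) for gene, (t, c) in example_tpm.items()}
print(ratios)
```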

Common Wheat PAL Gene Characterization and Identification
Systematic approaches were used to identify and characterize TaPAL genes from the T. aestivum genome using various genomic resources and tools. In total, 37 full-length coding PAL genes were identified. The results showed that the 37 sequences containing PAL-HAL domains belonged to the PAL gene family (Supplementary File S1). Detailed information on these identified genes, including gene ID, chromosomal location, and start and end positions, is given in Table 2. The stability of a protein can be checked through its number of amino acids. The peptide length of the deduced TaPAL proteins ranged from 498 (TaPAL37) to 714 (TaPAL7) amino acids, with corresponding molecular weights ranging from 52.78 to 77.34 kDa and an average weight of 74.97 kDa. Their predicted isoelectric points (IP) varied from 5.76 (TaPAL35) to 7.57 (TaPAL31), indicating that different TaPAL proteins function in different microenvironments. The IP value was used to assess the net charge of the proteins: proteins with IP < 7 were considered acidic and those with IP > 7 basic. Thirty-five of the 37 TaPAL proteins were acidic in nature. In addition, analysis of subcellular localization indicated that all 37 T. aestivum PAL proteins are localized in the cytoplasm (Table 2).

Localization of TaPAL Genes on the Chromosomes
The predicted TaPAL genes were localized on Triticum aestivum chromosomes. For this purpose, the 37 TaPAL genes were mapped to the chromosomes of common wheat based on their physical positions, as shown in Figure 1. Our results showed that the number of TaPAL genes differed among chromosomes. The maximum number (six) of PAL genes was present on each of chromosomes 1B and 2B, followed by chromosome 2A (five genes), chromosome 2D (four genes), and chromosomes 1A, 1D, 6B, and 6D (three genes each), while chromosome 5B had two genes. The remaining chromosomes (4A and 6A) each contained a single gene. Among the chromosomes carrying TaPAL genes, the shortest was 6D with three genes, while the largest was 2B with six genes. Some genes were far away from each other, while others were clustered, which indicates that the clustered genes may lie within a single QTL. Thus, the chromosomal localization studies revealed an uneven distribution of the 37 candidate genes across the chromosomes of T. aestivum (Figure 1). Two types of duplication are involved in gene family evolution: tandem duplication (among two or more genes on the same chromosome) and segmental duplication (among different chromosomes and the same clades). TaPAL25/TaPAL7, TaPAL33/TaPAL29, TaPAL37/TaPAL18, TaPAL25/TaPAL14, TaPAL33/TaPAL11, TaPAL37/TaPAL27, TaPAL18/TaPAL27, TaPAL1/TaPAL14, TaPAL29/TaPAL11, TaPAL34/TaPAL22, TaPAL12/TaPAL23, TaPAL36/TaPAL35, TaPAL34/TaPAL16, TaPAL22/TaPAL16, TaPAL24/TaPAL3, TaPAL24/TaPAL19, TaPAL28/TaPAL26, TaPAL32/TaPAL30, and TaPAL3/TaPAL19 are segmentally duplicated in wheat.
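The per-chromosome counts reported in this section can be cross-checked with a short Python tally. The numbers are taken directly from the text; the grouping by subgenome (the last character of the chromosome name) is added here for illustration and is not a result reported by the study.

```python
from collections import Counter

# Number of TaPAL genes per chromosome, as stated in the text.
GENES_PER_CHROMOSOME = {
    "1A": 3, "1B": 6, "1D": 3,
    "2A": 5, "2B": 6, "2D": 4,
    "4A": 1, "5B": 2,
    "6A": 1, "6B": 3, "6D": 3,
}

total_genes = sum(GENES_PER_CHROMOSOME.values())

# Group the counts by subgenome letter (A, B, or D).
genes_per_subgenome = Counter()
for chromosome, n_genes in GENES_PER_CHROMOSOME.items():
    genes_per_subgenome[chromosome[-1]] += n_genes

print(total_genes)
print(dict(genes_per_subgenome))
```

The counts add up to the 37 genes reported, with 10, 17, and 10 genes on the A, B, and D subgenomes, respectively.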

Identification of Conserved Protein Domains and Motifs in TaPAL Proteins
The MEME server was used to analyze the conserved motifs within the TaPAL gene family. The MEME program identified 10 conserved motifs in the 37 TaPAL members (Figure 2). The length of the predicted motifs ranged from 40 to 49 amino acids. Motif 9 and motif 10 were present in all genes except TaPAL13, TaPAL37, and TaPAL31. Figure 2B shows that motifs 8, 9, and 10 were not present in TaPAL37. Furthermore, motifs 1-7 were conserved in all groups of the phylogenetic tree. The SMART database results indicated that all TaPAL genes contain a well-conserved aromatic lyase domain (PF00221) [56]. This domain (e-value 2.4e-153) starts at 56 aa and ends at 536 aa (Figure 2D and Supplementary File S2).

Gene Structure Analysis
For the determination of intron and exon numbers and their positions, all the coding and genomic sequences of TaPAL members were aligned. The analysis revealed variations in the exon-intron structure of TaPAL genes. Nine TaPAL genes (TaPAL3, TaPAL5, TaPAL8, TaPAL18, TaPAL24, TaPAL27, TaPAL35, TaPAL36, and TaPAL37) contained no introns in their ORFs (Figure 3). These findings agree with previous studies, which reported that seven CsPAL genes contained no intron [26]. Parallel to our results, Vogt [5] also mentioned that none of the nine ClPAL genes contained introns. The ORFs of 26 TaPAL genes (TaPAL1-2, TaPAL4, TaPAL6-7, TaPAL9-12, TaPAL14-17, TaPAL20-23, TaPAL26, and TaPAL27-34) were interrupted by a single intron (length ranging from 99 bp to 138 bp), whereas the intron length of four genes (TaPAL28, TaPAL32, TaPAL26, and TaPAL30) varied from 1055 bp to 1618 bp (Figure 3). Dong et al. [25] reported a similar exon-intron pattern in AtPAL1 and AtPAL2 [6], and NtPALl. Our results also showed that one additional intron was present in TaPAL25 and TaPAL13. Finally, the length of exon 2 was highly conserved in all the TaPAL genes with one intron.

Gene Ontology of PAL Genes
For the functional prediction of PAL genes, we conducted GO annotation analysis. The in silico functional prediction showed that three types of processes were involved: biological processes (BPs), molecular processes (MPs), and cellular processes (CPs) (Figure 4). The BPs suggested that PAL genes are actively involved in different metabolic activities and in the biosynthesis of different organic substances. Furthermore, the CP prediction indicated that almost all the PAL gene products reside in the cytoplasm and could be involved in the regulation of metabolic processes. Meanwhile, the MPs showed that PAL genes have enzymatic activity. These findings indicate that PAL genes play a role in plant growth by modulating the BPs, MPs, and CPs.

MicroRNA-Targeting TaPAL Genes
We identified 27 putative miRNAs targeting 37 TaPAL genes and built an interaction network using Cytoscape software to better understand the regulatory mechanism of the miRNAs involved in the regulation of PALs (Figure 5 and Supplementary Sheet S4). In the connection distribution and regulation network, we found that TaPAL26 is one of the most-targeted PAL genes of wheat. tae-miR1119 targets the wheat genes TaPAL7, TaPAL32, TaPAL24, TaPAL3, TaPAL19, TaPAL30, TaPAL26, TaPAL27, TaPAL18, TaPAL9, TaPAL22, TaPAL23, TaPAL21, TaPAL6, TaPAL16, TaPAL15, TaPAL33, TaPAL31, and TaPAL4. Our results also indicated that the miRNA tae-miR9781 targets TaPAL22 and TaPAL34. Both these genes have low expression in shoots. Furthermore, the miRNAs tae-miR1119, tae-miR398, tae-miR444a, tae-miR444b, and tae-miR9664-3p targeting TaPAL29 have high expression in root tissues.

Promoter Analysis
The promoter sequence is a regulatory element that controls gene expression and regulation [7][8][9]. The promoters are also called cis-acting regulatory DNA elements. Their location can be retrieved from the PLACE database (Table 3). Three regulatory elements, TCA-element, CGTCA-motif, and ABRE (abscisic acid or ABA responses), were identified for TaPAL genes. The TCA-element, CGTCA-motif, and ABRE-motifs are associated with SA, MeJA, and ABA responses, respectively.
Additionally, the expression of the TaPAL gene family is closely related to light, as indicated by the presence of the MRE light-responsive element, G-box, GT1-motif, AE-box, ATC-motif, C-box, CAG-motif, I-box, Sp1, Box 4, and ACE in some members of the TaPAL gene family. A putative TATA box was present in the sequence upstream of the ATG start codon in all TaPAL genes. Moreover, TaPAL gene promoters also contained several phytohormone-responsive elements, including ABRE, AuxRE (auxin-response elements), and GARE (gibberellin (GA) responses). The promoters of TaPAL genes also contained the stress-response regulatory elements MBS (drought induction) and LTR (cold stress). TaPAL21, TaPAL35, TaPAL8, TaPAL5, TaPAL16, TaPAL15, TaPAL28, TaPAL32, TaPAL26, TaPAL25, TaPAL18, TaPAL27, TaPAL36, TaPAL34, and TaPAL12 contained a single copy of the MBS cis-regulatory element, while TaPAL22 contained two copies. Only TaPAL37, TaPAL29, TaPAL34, TaPAL35, TaPAL31, TaPAL24, TaPAL28, TaPAL3, and TaPAL19 contained the LTR cis-regulatory element. Additionally, the upstream regulatory sequences of the TaPAL34 and TaPAL16 genes contained the TC-rich repeat, which is related to the defense mechanism. These results suggest that the TaPAL gene family members may play an important role in the survival of plants under various environmental stresses. These cis-regulatory elements receive stimuli from the environment via complex mechanisms and induce gene expression and regulation in response to various abiotic and biotic stresses.

Protein-Protein Interaction of TaPAL
The predicted protein interaction analysis showed an array of other proteins that are co-regulated with TaPAL29 (Traes_1BS_BD86C90A7.1) (Figure 6). Arogenate dehydratase (Traes_5BL_7B0ED7548.1), a key enzyme involved in the synthesis of L-phenylalanine from L-arogenate, showed interaction with our reference gene. The score of 0.895 indicated a strong interaction with our reference gene TaPAL29, which is a member of the TaPAL genes, while the rest of the interacting genes were uncharacterized.

Phylogenetic Analysis of the PAL Gene Family
In addition to the 37 TaPAL genes identified in this study, four PAL genes from Arabidopsis thaliana, nine PAL genes from Oryza sativa, and eight PAL genes from Zea mays were used to construct a maximum-likelihood tree in MEGA X to determine the evolutionary relationships (Figure 7). The resulting phylogenetic tree, based on protein sequence similarities, divided the PAL proteins into four major clades or groups, represented in different colors. The first three groups represent the monocots, while the fourth group contains the dicot sequences. Group I comprised 30 TaPAL genes (TaPAL1-2, TaPAL4-18, TaPAL20-23, TaPAL25, TaPAL27, TaPAL29, TaPAL31, and TaPAL33-37) and one PAL gene each from rice and maize. Group II comprised five TaPAL genes (TaPAL3, TaPAL19, TaPAL24, TaPAL30, and TaPAL32) that were more closely associated with the genes of Z. mays than with those of O. sativa. Group III comprised two TaPAL genes (TaPAL28 and TaPAL26) that were more closely associated with PAL genes of O. sativa (OsPAL1, OsPAL5, and OsPAL6) than with that of Z. mays (ZmPAL1). The AtPAL genes, which come from a dicot, formed a separate group IV.

Table 3. Cis-regulatory elements involved in plant growth regulation, stress, and hormonal responses.

Category | Site Name | Function
Hormone | ABRE | cis-acting element involved in abscisic acid responsiveness
Hormone | ACE | cis-acting element involved in light responsiveness
Hormone | CCAAT-box | MYBHv1 binding site
Hormone | CGTCA-motif | cis-acting regulatory element involved in MeJA-responsiveness
Hormone | GARE-motif | Gibberellin-responsive element
Hormone | GC-motif | Enhancer-like element involved in anoxic-specific inducibility
Hormone | P-box | Gibberellin-responsive element and part of a light-responsive element
Hormone | TCA-element | cis-acting element involved in salicylic acid responsiveness
Hormone | TGA-element | Auxin-responsive element
Hormone | TGACG-motif | cis-acting regulatory element involved in MeJA-responsiveness
Stress and Growth | A-box | cis-acting regulatory element
Stress and Growth | AE-box | Part of a module for light response
Stress and Growth | Box 4 | Part of a conserved DNA module involved in light responsiveness
Stress and Growth | ARE | cis-acting regulatory element essential for anaerobic induction
Stress and Growth | ATC-motif | Part of a conserved DNA module involved in light responsiveness
Stress and Growth | ATCT-motif | Part of a conserved DNA module involved in light responsiveness
Stress and Growth | C-box | cis-acting regulatory element involved in light responsiveness
Stress and Growth | CAAT-box | Common cis-acting element in promoter and enhancer regions
Stress and Growth | CAG-motif | Part of a light response element
Stress and Growth | CAT-box | cis-acting regulatory element related to meristem expression
Stress and Growth | chs-CMA2a | Part of a light responsive element
Stress and Growth | chs-Unit 1 ml | Part of a light responsive element
Stress and Growth | Circadian | cis-acting regulatory element involved in circadian control
Stress and Growth | G-box | cis-acting regulatory element involved in light responsiveness
Stress and Growth | GATA-motif | Part of a light-responsive element
Stress and Growth | GCN4-motif | cis-regulatory element involved in endosperm expression
Stress and Growth | GT1-motif |
Stress and Growth | TCT-motif | Part of a light responsive element

To investigate the relationship of the PAL gene family in T. aestivum with its ancestral species, the phylogenetic analysis also included all the PAL genes from Hordeum vulgare, Solanum tuberosum, and Triticum urartu. Hordeum vulgare was domesticated from its wild relative, Hordeum spontaneum, while Triticum urartu is the progenitor of tetraploid Triticum turgidum and hexaploid Triticum aestivum. These plants had 8, 10, and 11 PAL genes, respectively. Common wheat PAL genes showed maximum association with HvPAL (H. vulgare), followed by TuPAL (T. urartu), as shown in Supplementary Figure S1.
To study the origin and evolutionary relationship of Triticum aestivum (tr), Aegilops tauschii (ae), Triticum turgidum (tg), and Triticum dicoccoides (td) PAL protein sequences, a comparative synteny analysis was conducted (Figure 8 and Supplementary Sheet S2). The proteins from the four species were closely associated and showed high similarity in the evolutionary correlation analysis. TaPAL genes on chromosome trchr6D of common wheat share evolutionary origins with genes on chromosomes td6B, td6A, and ae6D. We identified 10 genes of Aegilops tauschii that are duplicated with TaPAL33, TaPAL37, TaPAL27, TaPAL34, TaPAL36, TaPAL22, TaPAL23, TaPAL35, TaPAL26, and TaPAL30. Sixteen genes of Triticum dicoccoides are orthologs of TaPAL genes of wheat, and TaPAL26 is duplicated twice among the PAL genes of Triticum dicoccoides. Nineteen orthologous gene pairs between Triticum turgidum and wheat were identified. More than two orthologous gene pairs of TaPAL37, TaPAL36, TaPAL35, TaPAL34, and TaPAL26 were identified in Triticum turgidum. Seven paralogous pairs of TaPAL genes were identified.

In Silico Expression Profile Analysis of PAL Gene Family in Six Genotypes of Wheat
Gene expression analysis helps to probe the potential role and functions of a gene family [57]. Comparative gene expression analysis was used to elucidate the physiological function of different PAL gene family members. In silico expression profiling was performed on the roots of different wheat genotypes (Figure 9). For this purpose, normalized RNA-seq data were analyzed, and a heatmap of the TaPAL genes was constructed from FPKM values. The expression of TaPAL genes varied among wheat genotypes. TaPAL35, TaPAL31, TaPAL23, TaPAL22, TaPAL8, TaPAL5, and TaPAL6 were only expressed in Local White. Very low expression of TaPAL10, TaPAL30, TaPAL32, TaPAL3, and TaPAL28 was recorded in all wheat genotypes. Nonetheless, they may have tissue-specific expression, such as in seeds, or their expression may be induced only by certain environmental stresses. TaPAL11, TaPAL14, TaPAL12, TaPAL34, TaPAL4, TaPAL21, TaPAL19, TaPAL24, and TaPAL36 were highly expressed in UZ-11-CWA-8. Similarly, TaPAL27, TaPAL16, TaPAL9, and TaPAL15 were highly expressed in Chakwal-50. Overall, TaPAL genes showed higher expression in roots of drought-tolerant genotypes than in drought-sensitive genotypes of wheat.
Figure 7. Color scheme showing the intensity of expression (blue, low expression; red, high expression); Z scores were used.
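The Z-score scaling mentioned for the heatmap color scheme can be illustrated with a minimal sketch. It assumes that each gene's FPKM values are scaled across genotypes (row-wise) using the sample standard deviation; the text does not state these details, and the gene names and FPKM values below are hypothetical toy inputs, not data from this study.

```python
import math

def row_zscores(fpkm_by_gene):
    """Scale each gene's FPKM values across genotypes to mean 0 and SD 1.

    Assumption: row-wise scaling with the sample standard deviation (n - 1).
    Each gene needs at least two values.
    """
    scaled = {}
    for gene, values in fpkm_by_gene.items():
        n = len(values)
        mean = sum(values) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
        # A gene with identical expression everywhere has no spread to scale.
        scaled[gene] = [0.0] * n if sd == 0 else [(v - mean) / sd for v in values]
    return scaled

# Hypothetical FPKM values for two placeholder genes across three genotypes.
toy_fpkm = {
    "geneA": [2.0, 4.0, 6.0],
    "geneB": [5.0, 5.0, 5.0],
}
toy_scaled = row_zscores(toy_fpkm)
print(toy_scaled)
```

With this kind of scaling, colors in a heatmap compare genotypes within a gene rather than absolute expression between genes, so a gene with uniformly low FPKM can still show red and blue cells.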

Expression of PAL Gene Family under Abiotic Stress
The expression of TaPAL genes under abiotic stresses such as drought (DH), heat stress (HS), and phosphorus deficiency (PS) at various stages and in various tissues was explored (Figure 10 and Supplementary Sheet S3). The expression levels of TaPAL37, TaPAL36, TaPAL35, TaPAL33, TaPAL29, TaPAL25, TaPAL24, TaPAL17, TaPAL14, TaPAL11, TaPAL7, TaPAL3, and TaPAL4 were upregulated in roots. Similarly, the same trend was shown by TaPAL10, TaPAL27, TaPAL15, TaPAL6, and TaPAL21 under drought stress,

Discussion
Wheat is the main crop for half of the world's population, and it faces various types of biotic and abiotic stresses. It has been suggested that phenylalanine ammonia-lyase (PAL) genes are essential for plant growth, development, adaptation, and mitigation responses to various environmental and pathogen stresses by producing secondary metabolites that regulate plant growth responses [11,58,59]. Phenylpropanoids are plant-based organic compounds that are produced from the amino acids phenylalanine and tyrosine. PAL, which catalyzes the deamination of phenylalanine, serves as the first enzyme in the phenylpropanoid pathway and in flavonoid biosynthesis [1,24,45,60,61]. Recently, these enzymes have been reported by many researchers in different crops, including Juglans regia [62], Citrus reticulata [63], Citrullus lanatus [64], and Medicago truncatula [65]. This study was an investigation of PAL in wheat.

The PAL family is a very large, multigene family. The family includes ten putative members in maize [66], four members in Arabidopsis [19] and tobacco [67], and more than 20 copies in tomato and potato [68]. In the present study, we demonstrated that common wheat (Triticum aestivum) has 37 genes of the PAL family, a considerably higher number than in the above-mentioned species. However, the increase and decrease of PAL genes among species (Z. mays, A. thaliana, and O. sativa) is random [6]. Our results showed that the number of PAL genes in T. aestivum far exceeds the four AtPALs, seven OsPALs, twelve JrPALs, and six ZmPALs, suggesting that whole-genome duplication, small-scale segmental duplications, local tandem duplications, or a combination of these duplication events may have caused this expansion in T. aestivum [7,69,70]. The duplicated PAL genes in this study were mapped to 11 chromosomes (Figure 1). This diversity of chromosomal distribution indicates that these genes have diverse functions. The duplication events might have caused the expansion and dispersion of PAL genes, giving rise to potential sources of functional variability in common wheat. Gene duplication events may have caused the significant increase in PAL genes in T. aestivum, as stated in recent studies on different species [16,42,65]. The isolation and identification of PAL genes in T. aestivum is critical because of their importance in adaptation and stress resistance [1,71,72]. The activity of PAL genes in response to cold stress in Juglans regia (walnut) suggests that the PAL gene family in T. aestivum is also involved in providing resistance against cold, drought, salt, and disease [70,72]. Similarly, this study indicates that the expression of TaPAL genes is higher in drought-tolerant wheat genotypes than in sensitive genotypes. Furthermore, we checked the subcellular location of TaPAL. Our results showed that the 37 PAL genes are localized to the cytoplasm [62,73,74].
Conserved motifs are parts of proteins that are functionally important. The motifs were selected from the PLACE database, and the conservation patterns retrieved from the MEME suite (Figure 2 and Table 3) and UGENE showed that the protein structure of the PAL gene family has been highly conserved. The PAL gene family members from plant species including Z. mays, A. thaliana, O. sativa, H. vulgare, and T. urartu contained all the conserved domains, indicating that the PAL gene family remained highly conserved during evolution and took long-term speciation and duplication events to evolve; thus, the results demonstrated its importance in antiretroviral effects. It was evident that the key domain is phenyl ammonium lyase/aromatic lyase, which exists in all families and ancestral species, suggesting a structural similarity between proteins of the PAL gene family.
The intron-exon gene structure gives clues for gene evolution [10]. In parallel to the gene number, the structure of the TaPAL genes in Triticum aestivum has experienced developmental/evolutionary modifications. Out of 37 TaPAL genes, ten (TaPAL3, TaPAL5, TaPAL8, TaPAL18, TaPAL19, TaPAL24, TaPAL27, TaPAL35, TaPAL36, and TaPAL37) have no intron in their coding regions, two (TaPAL25 and TaPAL13) are interrupted by two introns in their ORFs, and the remaining 25 have one intron in their ORFs (Figure 3). Recent studies stated that duplicated genes show structural divergence, which is very prevalent in the generation of functionally distinct paralogs. This structural divergence has played a key role in the evolution of duplicated genes compared to non-duplicated genes [69]. The analysis of PAL gene structural data showed a significant variation in the evolution of the PAL family of common wheat, walnut, and poplar.
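The intron classes reported above can be cross-checked with a short tally. This is only a sketch: it assumes the 37 genes are named consecutively from TaPAL1 to TaPAL37, and it derives the single-intron class as the complement of the twelve genes named in the text, since the 25 single-intron genes are not listed individually.

```python
from collections import Counter

# Assumption: the 37 genes are named TaPAL1..TaPAL37 consecutively.
all_genes = {f"TaPAL{i}" for i in range(1, 38)}

# Gene lists as given in the text.
no_intron = {"TaPAL3", "TaPAL5", "TaPAL8", "TaPAL18", "TaPAL19",
             "TaPAL24", "TaPAL27", "TaPAL35", "TaPAL36", "TaPAL37"}
two_introns = {"TaPAL25", "TaPAL13"}

# Derived, not listed in the text: every remaining gene has one intron.
one_intron = all_genes - no_intron - two_introns

intron_counts = {gene: 0 for gene in no_intron}
intron_counts.update({gene: 2 for gene in two_introns})
intron_counts.update({gene: 1 for gene in one_intron})

# Number of genes in each intron class.
tally = Counter(intron_counts.values())
print(sorted(tally.items()))
```

The three classes partition the family (10 + 25 + 2 = 37), which matches the counts stated in the paragraph.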
For the functional prediction of TaPAL genes, we performed GO enrichment analysis (Figure 4). In silico prediction indicated that TaPAL genes were involved in numerous developmental processes by regulating biological processes (BPs), molecular processes (MPs), and cellular processes (CPs), and showed responses to environmental stresses. Many previous studies also reported that microRNAs respond to stress stimuli through regulation of gene expression [42]. TaPAL is more highly expressed in roots than in shoot tissues under abiotic stress. The miRNAs tae-miR1119, tae-miR398, tae-miR444a, tae-miR444b, and tae-miR9664-3p targeting TaPAL29 have high expression in root tissues (Figures 5 and 10). Previously it has been reported that plant miRNAs play a role in response to environmental stress. In bread wheat under drought stress, different miRNAs such as miR159 and miR395 were found to be differentially expressed [75]. Similarly, VM-milR37 plays a role in pathogenicity through regulation of the VmGPX gene [76]. In another study, miR164 regulated salinity tolerance in maize [77]. We also checked the protein-protein interaction of TaPAL29 with other co-regulated proteins. Results showed that arogenate dehydratase, which belongs to the lyase class and is a key enzyme that catalyzes the conversion of L-arogenate into L-phenylalanine [78], interacts with TaPAL29 (Figure 6).
Phylogenetic analysis, with both ancestral and related species, indicated that the evolutionary trajectories are similar to those of the related species (Z. mays, A. thaliana, and O. sativa) and suggested that the PAL gene family converges to a single ancestor. This ancestor might be involved in the evolution of plants with respect to adaptation and resistance. Previously it has been reported that during the evolution of PAL, lineage-specific duplication (which promotes the diversity of multi-gene families) occurred in Arabidopsis and other species [79]. The close paralogs of each PAL gene clustered together phylogenetically into clades in T. aestivum, A. thaliana, O. sativa, and Z. mays (Figure 7). In contrast, the PALs from T. aestivum and Z. mays clustered together along with some of the O. sativa genes (OsPAL1, OsPAL5, OsPAL6, and OsPAL8), indicating that the expansion of the common wheat PAL gene family might have occurred after the divergence of eurosids I and eurosids II (approximately 100 million years ago), as reported previously [62,80]. Based on phylogenetic analysis, our 37 TaPAL genes were separated into three different groups, as in tea plant (Camellia sinensis) [79], whereas in woody plants (Juglans regia L., Salix babylonica, Ornithogalum saundersiae, and Populus trichocarpa) they cluster into two groups [18,21,42,81]. TaPALs showed no expansion events as in Cucumis sativus [26]. The PAL gene family has significant similarities and dissimilarities among various plant species, i.e., ZmPAL3-5 and OsPAL2-4. Among TaPAL genes, TaPAL13, TaPAL31, TaPAL36, and TaPAL37 showed a slight difference in sequence compared to the other 33 PAL genes of T. aestivum (common wheat), which indicated an 80% similarity score in syntenic analysis. This relationship demonstrated that PALs with comparable evolutionary status might play a similar role in plant development, which enabled us to examine the elements of PALs from different families such as Poaceae using a comparative genomic approach.
The PAL gene is strictly involved in controlling the pre- and post-transcriptional stages and is considered a gateway to the initiation of the phenylpropanoid pathway. Differential expression patterns for PAL genes in higher plants have been observed. Moreover, the PAL genes in common wheat (T. aestivum) show distinct patterns of expression in roots. The genes TaPAL11, TaPAL14, TaPAL12, TaPAL29, TaPAL20, TaPAL7, TaPAL1, TaPAL2, TaPAL9, TaPAL15, and TaPAL16 exhibited higher expression levels in roots of drought-tolerant genotypes than in drought-susceptible genotypes (Figure 9). These variations in expression level were attributed to the differences in proteins and gene structures, as shown in Figures 2 and 3. The PAL family genes showed diverse expression patterns, which indicated that a complex regulation of the PAL-mediated phenylpropanoid pathways existed during the development of drought-tolerant and drought-sensitive wheat genotypes (Figure 9). A similar expression pattern of the PAL gene family has also been reported in walnut and barrel clover [62,82]. Cis-regulatory elements are also present upstream of the TaPALs (Table 3). Some of the TaPALs from the same evolutionary cluster co-express under stress conditions. This might be due to the presence of cis-elements [16]. Similarly, GdPAL5 is also reported to be an auxin producer that activates plant defense mechanisms during abiotic stress [83]. Different gene family members usually display abundance disparities in different tissues or under distinct stresses [84].
To address the abiotic stresses that changing climatic conditions impose on wheat, including heat and drought stress, the transcriptome profile of this gene family needs to be explored. This study used transcriptomic information from various tissues at various stages, as shown in Figure 10. The transcript levels of TaPAL37, TaPAL36, TaPAL35, TaPAL33, TaPAL29, TaPAL25, TaPAL24, TaPAL17, TaPAL14, TaPAL11, TaPAL7, TaPAL3, and TaPAL4 were upregulated in roots. The expression levels of TaPAL genes were consistent with previous studies showing that expression of PAL genes is higher in roots than in other tissues of plants such as Hordeum vulgare [85], Solanum tuberosum [86], Arabidopsis thaliana [19], and Juglans regia [87]. The higher expression of the TaPAL gene family in drought-tolerant genotypes compared to drought-sensitive genotypes may be due to a high level of lignification, which is part of normal root development [88]. Furthermore, the publicly available transcriptomic data that we used were validated by qRT-PCR [89,90].

Conclusions
In this study, we identified 37 TaPAL gene family members, which were distributed on 11 chromosomes. These TaPAL genes were found to be involved in drought-stress response mechanisms, as they showed high expression in root tissues. Since only a few PAL genes have been reported in wheat, this is the first detailed study of the PAL gene family in wheat. We also found 27 putative miRNAs targeting TaPAL genes. Some questions are still to be answered, such as what the exact role of each TaPAL gene is, and how the expression of each TaPAL is controlled in different phases of development and in reaction to distinct stress or hormone signals. Therefore, to further our knowledge of the TaPAL family, more molecular, biochemical, and physiological studies are needed. Due to the potential roles of TaPAL in the growth of common wheat (T. aestivum), it may provide prospective targets for molecular breeding of high-quality grain.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agronomy11122511/s1, Figure S1: Ancestral relationship (phylogenetic tree) of PAL genes between Triticum aestivum (Ta), Hordeum vulgare (Hv), Solanum tuberosum (St), and Triticum urartu (Tu); File S1: Protein sequences of identified TaPAL in this study; File S2: Multiple sequence alignment in wheat; Supplementary Sheet S1: The details of the materials and treatments for the retrieved expression values; Sheet S2: List of genes in wheat to explore the gene duplication within the TaPAL gene family and ancestral species of wheat; Sheet S3: Expression level of genes in different conditions; Sheet S4: Putative miRNAs targeting the TaPAL genes.
Data Availability Statement: The data and materials presented in this study are mentioned in the main text as well as in the supplementary files; further data will be provided on request from the corresponding author.