Analysis of Carica papaya Informs Lineage-Specific Evolution of the Aquaporin (AQP) Family in Brassicales

Aquaporins (AQPs), a class of intrinsic membrane proteins that transport water and small solutes across biological membranes, play crucial roles in plant growth and development. This study presents the first genome-wide identification and comparative analysis of the AQP gene family in papaya (Carica papaya L.), an economically and nutritionally important fruit tree of tropical and subtropical regions. A total of 29 CpAQP genes were identified, representing five subfamilies: nine plasma membrane intrinsic proteins (PIPs), eight tonoplast intrinsic proteins (TIPs), seven NOD26-like intrinsic proteins (NIPs), two X intrinsic proteins (XIPs), and three small basic intrinsic proteins (SIPs). Although the family is smaller than the 35 members reported in Arabidopsis, it is highly diverse, and the presence of CpXIP genes as well as orthologs in Moringa oleifera and Bretschneidera sinensis implies that the complete loss of the XIP subfamily in Arabidopsis is lineage-specific, having occurred sometime after its split with papaya but before the Brassicaceae–Cleomaceae divergence. Reciprocal best hit-based sequence comparison of 530 AQPs and synteny analyses revealed that CpAQP genes belong to 29 out of 61 identified orthogroups, and lineage-specific evolution was frequently observed in Brassicales. Significantly, the well-characterized NIP3 group was completely lost; lineage-specific loss of the NIP8 group in Brassicaceae occurred sometime before the divergence with Cleomaceae, and lineage-specific loss of the NIP2 and SIP3 groups in Brassicaceae occurred sometime after the split with Cleomaceae. In contrast to the predominant role of recent whole-genome duplications (WGDs) in the family expansion in B. sinensis, Tarenaya hassleriana, and Brassicaceae plants, no recent AQP repeats were identified in papaya, and ancient WGD repeats are mainly confined to the PIP subfamily. Subfamily-specific and even group-specific evolution was uncovered by comparing exon–intron structures, conserved motifs, the aromatic/arginine selectivity filter, and gene expression profiles. Moreover, down-regulation during fruit ripening and expression divergence of duplicated CpAQP genes were frequently observed in papaya. These findings will not only improve our knowledge of lineage-specific family evolution in Brassicales, but also provide valuable information for further studies of AQP genes in papaya and other species.


Introduction
Papaya (Carica papaya L., 2n = 18), which most likely originated in southern Mexico and/or Central America, is an economically and nutritionally important tree fruit crop widely cultivated in tropical and subtropical areas [1,2]. Papaya is sweet, flavorful, brightly colored, and uniquely rich in vitamin C and carotenoids, so that it ranks first on nutritional scores among 38 common fruits and first among fruits consumed [3,4]. Papaya also has valuable medical and industrial applications, including the production of a well-known proteolytic enzyme, papain [5–9]. Water balance is particularly important for this species because a large amount of water is essential for crop yield and quality [10,11]. Papaya is a member of the Caricaceae family within Brassicales, the same order as the well-known model plant Arabidopsis (Arabidopsis thaliana), with which it may have shared a common ancestor about 72 million years ago (Mya) [2,11,12]. The papaya genome was estimated to be 372.0 Mb by flow cytometry [13], and the first draft genome, reported in 2008, was derived from SunUp, the first commercial virus-resistant transgenic variety [3]. This assembly spans about 370.4 Mb fragmented in 17,766 scaffolds (Scfs) [3]. More recently, chromosome-level assemblies were also described for SunUp and its progenitor Sunset, spanning 351.5 and 350.3 Mb in nine chromosomes (Chrs), respectively [14]. Although the papaya genome is more than twofold larger than that of Arabidopsis, its 22,394 (SunUp)/22,416 (Sunset) predicted protein-coding genes are fewer than the 27,655 present in Arabidopsis (Araport11), implying lineage-specific gene evolution and reflecting the two additional whole-genome duplications (WGDs, known as At-β and At-α), followed by extensive chromosomal rearrangement and massive gene loss, that occurred in Arabidopsis after the split with papaya [14–16].
Aquaporins (AQPs), a group of widely distributed intrinsic membrane proteins that transport water and/or small solutes across biological membranes, play crucial roles in plant growth, development, and stress responses [17,18]. AQPs belong to the ancient major intrinsic protein (MIP) superfamily and usually share several typical structural characteristics, including six transmembrane helices (i.e., TM1-TM6) connected by five loops (i.e., LA-LE), two highly conserved Asn-Pro-Ala (NPA) motifs, and the so-called aromatic/arginine (ar/R) selectivity filter (i.e., H2 at TM2, H5 at TM5, and LE1 and LE2 at LE) [17,19,20]. Whereas the dual NPA motifs act as a size barrier of the pore by creating an electrostatic repulsion of protons, the ar/R filter determines the substrate specificity by rendering the pore constriction site diverse in both size and hydrophobicity [19,21,22]. Additionally, an H residue corresponding to H131 at the LC of AtTIP2;1 was proven to be essential for NH3 permeability [23], whereas a T residue corresponding to T109 at the TM1 of rice (Oryza sativa) Lsi1 (or OsNIP2;1) as well as an NPA spacing of 108 amino acids (AA) were shown to be essential for silicon transport [24,25]. In algae, AQPs are present as one or a few copies, whereas more than 19 family members were found in terrestrial plants [26–30]. Moreover, according to sequence similarity, AQPs identified in land plant lineages were clustered into seven evolutionary subfamilies, namely plasma membrane intrinsic protein (PIP), tonoplast intrinsic protein (TIP), NOD26-like intrinsic protein (NIP), X intrinsic protein (XIP), small basic intrinsic protein (SIP), GlpF-like intrinsic protein (GIP), and hybrid intrinsic protein (HIP) [27,29]. Among them, GIPs, which may have been obtained through horizontal gene transfer, have been completely lost in spikemoss (Selaginella moellendorffii) and vascular plants, whereas HIPs were only reported in moss (Physcomitrella patens) and spikemoss [26–28]. Interestingly, the widely distributed XIPs were shown to be absent from monocots and Brassicaceae species including Arabidopsis [29,31,32]. Genome-wide comparison also revealed a key role of recent WGDs in the expansion of the AQP gene family [29,30,32,33]. Among 24 AQP repeats identified in poplar (Populus trichocarpa), 20 were shown to arise from the Salicaceae-specific p WGD [33,34]. In cassava (Manihot esculenta), 13 out of 14 identified AQP repeats were shown to arise from the recent ρ WGD that was shared by rubber tree (Hevea brasiliensis) [30]. In Arabidopsis, 10 out of 17 identified AQP repeats were shown to result from two recent WGDs [35]. Nevertheless, whether XIPs are present in Brassicales families beyond Brassicaceae, and how the whole gene family has evolved in a lineage-specific manner in Brassicales, remain to be studied.

Identification, Evolutionary Analysis, and Evolution Patterns of 29 AQP Family Genes in Papaya
As shown in Table 1, the search of papaya genome sequences resulted in 29 loci encoding AQP genes from both Sunset and SunUp (ASGPBv0.4). In Sunset, these genes are unevenly distributed across the nine chromosomes, ranging from a single gene on Chr3 and Chr7 to six on Chr2 (Figure 1), whereas in SunUp they are interspersed among 26 scaffolds (Table 1). To facilitate synteny analysis, CpAQP genes identified in Sunset were used for further analyses. To uncover their evolutionary relationships, an unrooted evolutionary tree was constructed using full-length CpAQP proteins together with published AQPs, i.e., 35 AtAQPs, 37 RcAQPs, 31 JcAQPs, 42 MeAQPs, 48 HbAQPs, and 55 PtAQPs. As shown in Figure 2, these AQPs were clustered into five subfamilies, i.e., PIP, TIP, NIP, SIP, and XIP. Moreover, each subfamily could be divided into two to eight groups, i.e., PIP1-2, TIP1-6, NIP1-8, SIP1-3, and XIP1-3. Interestingly, despite the absence of the XIP subfamily in Brassicaceae plants, two XIPs, which belong to the XIP1 and XIP2 groups, respectively, were identified in papaya. Additionally, the NIP2 group, which was not identified in Arabidopsis, is also present in papaya as well as all of the Malpighiales species compared in this study (Figure 2 and Table S1). By contrast, no NIP3 homolog was found in either papaya or Arabidopsis, though two evolutionary subgroups were identified in several Euphorbiaceae species, e.g., physic nut and cassava. Compared with previous studies, three novel groups, denoted as TIP6, NIP8, and SIP3, were proposed in this study, which only share 67.6/72.8, 60.6/57.6, and 58.2/50.0% sequence identities with their closest homologs (i.e., TIP2, NIP4, and SIP1) at the nucleotide and protein levels, respectively (Figure 2, Tables 1 and S3), implying their early divergence. Notably, no SIP3 homolog was found in Arabidopsis, though it is broadly present in other tested species (Figure 2 and Table S1).
Figure 2. Evolutionary analysis of AQPs from C. papaya, A. thaliana, R. communis, J. curcas, M. esculenta, H. brasiliensis, and P. trichocarpa. Sequence alignment was performed using MUSCLE, and the unrooted evolutionary tree was constructed using the bootstrap maximum likelihood tree (1000 replicates) method of MEGA 6.0. The distance scale denotes the number of amino acid substitutions per site, and the name of each subfamily/group is indicated next to the corresponding cluster. (AQP: aquaporin; At: A. thaliana; Cp: C. papaya; Hb: H. brasiliensis; Jc: J. curcas; Me: M. esculenta; NIP: NOD26-like intrinsic protein; PIP: plasma membrane intrinsic protein; Pt: P. trichocarpa; Rc: R. communis; SIP: small basic intrinsic protein; TIP: tonoplast intrinsic protein; XIP: X intrinsic protein).
Deduced CpAQP peptides consist of 236-322 AA with a molecular weight (MW) of 25.09-34.73 kDa, a grand average of hydropathicity (GRAVY) value of 0.327-1.063, and an aliphatic index (AI) value of 92.83-119.27, which is consistent with their amphipathic feature. Except for CpTIP5;1 (8.71), TIPs were shown to be acidic with an isoelectric point (pI) value of less than 7. By contrast, members of the other four subfamilies are usually basic with the pI value varying from 7.63 to 10.42, though CpNIP4;1 possesses a small pI value of 5.35. Except for CpNIP8;1 (47.11), all other peptides are likely stable with an instability index (II) value of less than 40.00 (Table 1). Despite sharing one conserved MIP domain and six TMs connected by five loops (Table S1, File S1 and Figure S1), the sequence similarity between different CpAQP family members varies from 20.1% to 95.8% (Table S2). In accordance with the evolutionary analysis, the SIP subfamily is distant, exhibiting 23.7-30.0%, 28.4-39.0%, 22.4-28.3%, and 20.1-27.3% sequence similarities with the PIP, TIP, NIP, and XIP subfamilies, respectively; the XIP subfamily shares 31.2-38.9%, 29.2-36.6%, and 25.2-37.6% similarities with the PIP, TIP, and NIP subfamilies, respectively, supporting its close relationship with the PIP subfamily. Compared with other subfamilies, higher sequence similarities were observed within PIP subfamily members, ranging from 65.4% to 95.8%, implying their highly conserved evolution. On the contrary, the NIP and TIP subfamilies are more diverse, having evolved into eight and six evolutionary groups with sequence similarities of 43.1-74.5% and 55.9-85.6%, respectively (Figure 2 and Table S2). Nevertheless, in most cases, the sequence similarities within subfamilies are somewhat higher than those between subfamilies (Figure S2).
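The GRAVY and AI values reported above can be illustrated with a minimal sketch. It assumes that GRAVY is the mean Kyte-Doolittle hydropathy over all residues and that AI follows Ikai's formula on mole percentages; the text does not state which tool produced the published values, so the function names and the toy peptides used in testing are purely illustrative.

```python
# Kyte-Doolittle hydropathy scale (assumed basis of the GRAVY values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: A% + 2.9 * V% + 3.9 * (I% + L%), in mole percent."""
    seq = seq.upper()
    n = len(seq)

    def pct(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))
```

A positive GRAVY value, as found for all CpAQPs, indicates a predominantly hydrophobic peptide, which is what one would expect for proteins with six transmembrane helices.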

Identification of AQP Genes in Representative Plant Species and Insight into Lineage-Specific Family Evolution in Brassicales
The presence of the NIP2, SIP3, XIP1, and XIP2 groups in papaya implies that their loss in Arabidopsis is lineage-specific and may have occurred sometime after the Arabidopsis-papaya divergence. However, despite the wide presence of the NIP3 group in Malpighiales plants, no homolog was detected in either papaya or Arabidopsis, implying that lineage-specific loss occurred sometime before the Arabidopsis-papaya divergence. To learn more about lineage-specific family evolution in Brassicales, AQP genes were also identified from A. coerulea, horseradish tree (Moringa oleifera), B. sinensis, spider flower, saltwater cress, A. lyrata, and A. halleri. As a basal eudicot, A. coerulea was shown to encode 29 AQP genes, the same as papaya (Table S1). Among them, AcTIP7;1, a dispersed repeat of AcTIP6;1, was named for clustering with AcTIP5;1 (Figure S3), possessing a similar ar/R filter (see below) but sharing lower sequence similarity with AcTIP2;1 and AcTIP6;1 (i.e., 62.5% and 63.1% vs. 63.4%). In horseradish tree, a Moringaceae plant within Brassicales, a total of 28 AQP family genes were identified, which is comparable to that of papaya. Nevertheless, one recent tandem repeat was found in the XIP2 group and no orthologs were identified for either CpNIP8;1 or CpSIP3;1 in this species (see below). In B. sinensis, an Akaniaceae plant within Brassicales, as many as 53 AQP genes (in addition to six pseudogenes) were identified, nearly twice the number in papaya, reflecting the occurrence of one independent recent WGD. Significantly, one XIP1, two XIP2s, two SIP3s, and one NIP8 were identified, and the NIP5 group has extensively expanded in this species (Table S1 and Figure S3). In spider flower, which belongs to the Brassicaceae sister family Cleomaceae and experienced one independent recent whole-genome triplication (WGT) termed Th-α, a total of 36 AQP genes were identified. Interestingly, despite the absence of the whole XIP subfamily and the NIP8 group, one NIP2 and one SIP3 were identified in this species. In A. lyrata, A. halleri, and saltwater cress, 35, 36, and 35 AQP family genes were identified, respectively (Table S1). Despite their close relationships, compared with Arabidopsis, no AtNIP1;1 ortholog was found in saltwater cress, the AtPIP2;8 ortholog became a pseudogene, and the NIP4 group expanded via tandem duplication (Table S1). Interestingly, in saltwater cress and A. halleri, two orthologs were identified for AtTIP2;1, i.e., EsTIP2;1, EsTIP2;2, AhTIP2;1, and AhTIP2;2, which were characterized as WGD repeats (Table S1).
Species-specific distribution of gene numbers in the five subfamilies is summarized in Table 3. Generally, the number of AQP family members is positively related to the total number of genes in a genome but not always to genome size, especially in genomes with a high proportion of repetitive sequences, such as those of rubber tree and B. sinensis (Figure 3). Arabidopsis was shown to possess the maximum density of AQP genes, rubber tree and B. sinensis harbor the minimum, and the density in papaya is comparable to that in A. coerulea and cassava (Table 3). Significantly, compared with A. coerulea, the PIP subfamily has highly expanded during the radiation of core eudicots, whereas the XIP subfamily is absent from spider flower and Brassicaceae plants (Table 3 and Table S1), implying that its lineage-specific loss occurred sometime before the Brassicaceae-Cleomaceae divergence.
To infer species/lineage-specific evolution, best reciprocal hit (BRH)-based sequence comparison was further used to identify orthogroups (OGs). As shown in Table S3, a total of 61 OGs were obtained, and each evolutionary group possessed one to 14 OGs. A total of 29 AcAQP genes belonged to 21 OGs, eight of which had expanded via WGD (TIP1a and NIP1a), tandem duplication (NIP4 and XIP2), proximal duplication (XIP2), and dispersed duplication (TIP4, TIP6, and XIP2). Moreover, TIP1a/-2, TIP1d/-3, NIP1a/-2, NIP5/-6, and SIP1/-2 were characterized as transposed repeats, whereas TIP2/-4/-5/-6 and TIP1a/-d were characterized as dispersed repeats. This means that the early eudicot ancestor contained at least 21 members, i.e., two PIP1s, two PIP2s, two TIP1s, one TIP2, one TIP3, one TIP4, one TIP5, one TIP6, one NIP1, one NIP2, one NIP3, one NIP4, one NIP5, one NIP6, one NIP7, one XIP2, one SIP1, and one SIP2. During later evolution, significant expansion of several groups, i.e., PIP1, PIP2, TIP1, NIP4/-8, and XIP1/-2/-3, was observed in core eudicots, likely contributed by the γ WGD. Except for NIP3a, which is also found in Malpighiales plants, papaya shares all of the 20 other OGs identified in A. coerulea (Table S3). Moreover, genes belonging to 12 out of these 20 OGs were shown to be located in syntenic blocks, exhibiting one-vs.-one, one-vs.-two, one-vs.-three, and two-vs.-one relationships (Figure 4A), implying a conserved evolution of these genes even after the γ WGD. As for other genes, species-specific transposition or chromosomal rearrangement could be speculated. By contrast, more genes are located in syntenic blocks of Brassicales species, reflecting a relatively short time of evolution. Significantly, the majority of orthologs identified between papaya and horseradish tree are located in syntenic regions, excluding only CpTIP2;1/MoTIP2;1, CpTIP3;1/MoTIP3;1, CpXIP2;1/MoXIP2;2, and CpSIP2;1/MoSIP2;1 (Figure 4B), though the horseradish tree genome is fragmented in 33,332 Scfs. It is worth noting that MoXIP2;2 was a tandem repeat of MoXIP2;1, whereas CpTIP2;1, MoTIP2;1, CpTIP3;1, MoTIP3;1, CpSIP2;1, and MoSIP2;1 were characterized as transposed repeats of CpTIP1;1, MoTIP1;1, CpTIP1;2, MoTIP1;1, CpSIP1;1, and MoSIP1;1, respectively (Table S1). Since CpTIP2;1/BsTIP2;1, CpXIP1;1/BsXIP1;1, and CpXIP2;1/BsXIP2;1 are located in syntenic blocks (Figure 4C), transposition or chromosomal rearrangement may have occurred in the other species tested. Compared with A. coerulea (4), papaya (13), and horseradish tree (11), more duplicate genes identified in B. sinensis (35), spider flower (16), and Arabidopsis (20) are located in syntenic blocks (Figure 4), reflecting the occurrence of one to three additional WGDs after the γ WGD. Although the spider flower genome is fragmented in 12,249 Scfs and Brassicaceae plants experienced the independent At-α WGD, the majority of duplicate genes identified in spider flower and Arabidopsis were shown to be located in syntenic blocks (Figure S4). Duplicated genes that are not located in syntenic blocks are as follows: AtPIP2;3, a tandem repeat of AtPIP2;2; AtPIP2;4, a syntelog of CpPIP2;2; AtTIP1;2 and AtTIP1;3, two transposed repeats of AtTIP1;1; AtNIP4;2, a tandem repeat of AtNIP4;1; ThPIP1;4, a syntelog of CpPIP1;2; ThPIP1;5, an ortholog of CpPIP1;3; ThPIP2;2, a WGD repeat of ThPIP2;1; ThTIP1;2, a transposed repeat of ThTIP1;1; ThTIP1;3, a syntelog of CpTIP1;3; ThTIP3;2, a dispersed repeat of ThTIP3;1; ThNIP2;1, a syntelog of CpNIP2;1; ThSIP3;1, an ortholog of CpSIP3;1 (Figure 4 and Table S3). The presence of SIP3 and NIP2 members in spider flower (Table S3) implies that their loss is Brassicaceae-specific and occurred sometime after the split with spider flower.
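The BRH step used above to define OGs can be sketched as follows. This is a simplified illustration rather than the pipeline used in the study: it assumes that pairwise search results are already available as (query, subject, score) tuples, with a higher score meaning a better hit, and all gene identifiers and scores in the testing data are hypothetical placeholders.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)   # species A genes searched against species B
    reverse = best_hits(b_vs_a)   # species B genes searched against species A
    return sorted(
        (a, b) for a, b in forward.items() if reverse.get(b) == a
    )
```

A gene whose best hit points to a partner that prefers another gene is excluded, which is why divergent groups such as NIP8 or SIP3 can end up without an ortholog in a given species. Synteny information, as used in the text, would be an additional and independent line of evidence.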

Structural and Functional Inference of CpAQPs
As shown in Table S1, gene structures were analyzed on the basis of curated gene models, and an example comparing A. coerulea, papaya, and Arabidopsis AQP genes is shown in Figure 5B. Results revealed that the exon-intron structure is usually conserved within an evolutionary group and even a subfamily, but variable between different subfamilies. Generally, TIP, SIP, and XIP feature two introns, whereas PIP and NIP possess three and four introns, respectively. Among the three groups present in the SIP subfamily, SIP3 is the sole group without an intron. Differing from XIP2 and XIP3 within the XIP subfamily, XIP1 features a single intron, though RcXIP1;3 was shown to be intronless. Besides the whole NIP5 group, AtNIP1;3 and AtNIP1;4 within the NIP subfamily also contain three introns, which appear to be conserved in Brassicaceae plants (Table S1). TIP6 within the TIP subfamily features two introns. However, both AtTIP6;1 and AtTIP6;2 possess a single intron, which is also found in other Brassicaceae plants (Table S1). The intron numbers in TIP1 vary from zero to three: TIP1a features a single intron, though AcTIP1;2 was shown to contain three introns; TIP1d features two introns, though AtTIP1;3 and its orthologs in Brassicaceae plants are intronless; TIP1c is also intronless (Figure 5B and Table S1).

Structural and possible functional divergence was investigated based on conserved motifs as well as the dual NPA motifs, the NPA spacing, and the ar/R filter. According to the MEME analysis, 25 identified motifs were shown to be conserved within the five subfamilies but distinct between different subfamilies (Figure 5C), reflecting their early divergence and a long time of evolution. Among them, PIP2s within the PIP subfamily possess 10 motifs, i.e., Motifs 1-7, 9, 12, and 14, whereas PIP1s harbor one more (i.e., Motif 17) that is located at their extended N-terminus. TIPs usually contain 10 motifs, i.e., Motifs 1-3, 8, 10, 11, 16, 19, 21, and 24, though Motif 24 is absent from AcTIP1;1, AtTIP1;2, and three evolutionary groups (i.e., TIP2, TIP6, and TIP5) within the TIP subfamily. NIPs usually contain nine motifs, i.e., Motifs 1-3, 5, 13, 15, 18, 22, and 23, though loss of certain motifs was frequently found in three evolutionary groups (i.e., NIP5, NIP6, and NIP7) within the NIP subfamily. By contrast, subfamilies XIP and SIP were shown to contain considerably fewer motifs, i.e., five and four, respectively. Significantly, only two motifs, i.e., Motifs 3 and 20, were identified for both AtSIP2;1 and CpSIP3;1. Motif 3, which is found in all AQPs, is located in TM6 as well as TM3 in most members of subfamilies PIP, TIP, NIP, and XIP. Another two widely present motifs, i.e., Motifs 1 and 2, usually appear in two copies like Motif 3. Motif 1 spans TM2-HB and TM5-HE, which include the NPA motif and the ar/R filter. It is worth noting that the second copy of Motif 1 was replaced by Motifs 25 and 20 in XIPs and SIPs, respectively. By contrast, little is known about Motif 2, which is usually located in TM1 and TM4 but is highly variable in the SIP subfamily. In addition to Motif 17, six other motifs are also specific to PIPs, i.e., Motifs 4, 6, 7, 9, 12, and 14: Motif 4 spans TM4-LD, which is replaced by Motif 8 or Motif 15 in TIPs and NIPs, respectively; Motif 6 spans TM1-LA-TM2, which is replaced by Motif 16 or Motif 22 in TIPs and NIPs, respectively; Motif 7 spans TM3-LC, which is replaced by Motif 10 (including an H residue at the position corresponding to H131 identified in AtTIP2;1) or Motif 23 in TIPs and NIPs, respectively; Motif 9 is located in TM6, which is replaced by Motif 19 or Motif 13 in TIPs and NIPs, respectively; Motif 12 is located in TM1, which is replaced by Motif 24 in TIPs; Motif 14 is located in LB and includes a putative phosphorylation site at the position corresponding to S115 identified in SoPIP2;1, which is replaced by Motif 21 or Motif 18 in TIPs and NIPs, respectively. In addition to Motifs 8, 10, 16, 21, and 24, Motif 11 is also specific to TIPs, which is located in LE and replaced by Motif 5 in the four other subfamilies (Figure 5C).
In contrast to the typical dual NPA motifs, NPV, NPT, NPI, NPS, NPL, NPC, NPG, and HPA variants were also observed, which are widely present in subfamilies SIP and XIP as well as four NIP groups (i.e., NIP1, NIP5, NIP6, and NIP7). The NPA spacing varies from 108 AA to 135 AA, where XIP features a relatively long NPA spacing. The spacing of 108 AA, which was proposed to be essential for silicon permeability, is found not only in NIP2 but also in most members of NIP5, NIP6, and NIP7 (Figure 5D). However, a T residue at the position corresponding to T109 in OsNIP2;1 was only found in NIPs (File S1), though the usual ar/R filter G-S-G-R was replaced by G-V-G-R in AcNIP2;1 (Figure 5D). Corresponding to their substrate specificity, different subfamilies were shown to possess distinct ar/R filters. While the F-H-T-R ar/R filter is highly conserved in PIPs, TIPs usually harbor the H-I-A-V/R filter, though N/S-V/F-G-C variants were also observed; NIPs usually have the W-V/I-A-R filter, though A/G/S/T-I/V/S-G/A/S-R variants were also observed; XIP and SIP possess the V/I-V/I/T-V/A-R/K and V/S/A/I-V/H/K/T/F-P/G-I/S/N/A filters, respectively (Figure 5D). As shown in File S1, two highly conserved C residues were also identified in LC and HE of the two CpXIPs; putative phosphorylation sites corresponding to S274 in SoPIP2;1 and S262 in GmNOD26 were found in all five CpPIP2s and CpNIP1;1/-4;1/-8;1, respectively; an H residue at the position corresponding to H131 in AtTIP2;1, which was proven to be essential for NH3 permeability, was found in CpTIP2;1, CpTIP4;1, and CpTIP6;1.
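As a rough illustration of how an NPA spacing such as the 108 AA mentioned above might be measured, the sketch below counts the residues lying strictly between the first and second motif. That counting convention is an assumption, since the text does not define it, and the function name and the synthetic test sequence are hypothetical. The motif strings are parameters so that variants such as NPV or NPS can be supplied.

```python
def npa_spacing(seq, first="NPA", second="NPA"):
    """Number of residues strictly between the two NPA-like motifs.

    Returns None if either motif cannot be located. The first motif is
    taken at its first occurrence; the second is searched only downstream
    of the first, so the two matches never overlap.
    """
    seq = seq.upper()
    start_first = seq.find(first)
    if start_first < 0:
        return None
    end_first = start_first + len(first)
    start_second = seq.find(second, end_first)
    if start_second < 0:
        return None
    return start_second - end_first
```

In a real protein, a chance occurrence of the tripeptide outside loops LB and LE would mislead this naive search, so motif positions taken from a curated alignment would be preferable.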

Expression Patterns of CpAQP Genes
To uncover the expression evolution of CpAQP genes, their expression profiles were investigated based on RNA-seq data representing four main tissues, i.e., root, leaf, sap, and fruit flesh, the last at two typical developmental stages (young and mature fruits). As shown in Figure 6, although most CpAQP genes were expressed, their transcript levels were highly diverse. The total transcripts of the whole family were most abundant in the flesh of young fruits (100%), followed by the root (73.8%), moderate in the sap (22.7%) and the leaf (21.7%), and relatively low in the flesh of mature fruits (8.5%). Regardless of the tissue tested in this study, the majority of transcripts were contributed by PIPs and TIPs, varying from 82.7% in the leaf to 98.9% in the sap (Figure S5). In the root, 89.2% of transcripts were contributed by eight genes, i.e., CpPIP1;1, CpPIP1;3, CpPIP2;2, CpPIP2;3, CpPIP2;4, CpTIP1;1, CpTIP1;2, and CpTIP6;1, where CpTIP1;1 and CpPIP1;3 represent the first and second most expressed AQP genes in this tissue. In the leaf, 81.1% of transcripts were contributed by four genes, i.e., CpPIP1;1, CpPIP2;4, CpTIP1;1, and CpXIP2;1, where CpTIP1;1 and CpPIP1;1 represent the first and second most expressed AQP genes in this tissue. In the sap, 93.6% of transcripts were contributed by three genes, i.e., CpPIP1;1, CpPIP2;3, and CpTIP2;1, where CpPIP2;3 and CpTIP2;1 represent the first and second most expressed AQP genes in this tissue. In the flesh of young fruits, 83.4% of transcripts were contributed by three genes, i.e., CpPIP1;1, CpPIP2;4, and CpTIP1;1, where CpTIP1;1 and CpPIP2;4 represent the first and second most expressed AQP genes in this tissue. In the flesh of mature fruits, 88.6% of transcripts were contributed by five genes, i.e., CpPIP1;1, CpPIP2;4, CpTIP1;1, CpTIP1;3, and CpSIP1;1, where CpTIP1;1 and CpPIP2;4 represent the first and second most expressed AQP genes in this tissue. Compared with the flesh of young fruits, the transcripts of most AQP genes were markedly down-regulated in the flesh of mature fruits, whereas CpTIP1;3, CpSIP1;1, and CpTIP6;1 transcripts were significantly up-regulated. According to their expression patterns over different tissues, the 29 CpAQP genes were grouped into three main clusters: Cluster I included the six most expressed genes, i.e., CpPIP1;1, CpPIP1;3, CpPIP2;3, CpPIP2;4, CpTIP1;1, and CpTIP2;1; Cluster II was moderately expressed and included three groups, where IIa was predominantly expressed in the root as well as the flesh of mature fruits, IIb was preferentially expressed in the root, and IIc was typically expressed in the root as well as the leaf and the flesh of young fruits; Cluster III was lowly expressed or tissue-specific and included four groups, where IIIa was preferentially expressed in the leaf, IIIb was lowly expressed in most tissues, IIIc was rarely expressed in most tissues, and IIId was typically expressed in the root (Figure 6). Interestingly, distinct expression patterns were frequently observed for duplicate pairs identified in this study, where CpPIP1;1, CpPIP1;3, CpPIP2;3, CpPIP2;4, and CpTIP1;1 had evolved into the predominant isoforms, implying their subfunctionalization. By contrast, both CpNIP4;1 and CpNIP8;1 were shown to be rarely expressed in all tissues examined in this study, implying their possible functions in specific tissues and/or stages.
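The percentages quoted above (family-level totals scaled to the most abundant tissue, and the share of transcripts contributed by a handful of genes) can be reproduced from a per-gene expression table with a sketch like the one below. The function names are hypothetical, and the 80% cumulative threshold is an assumption chosen for illustration, since the text does not state how the contributing genes were selected.

```python
def relative_totals(tissues):
    """Scale the summed family expression of each tissue to the maximum (100%).

    `tissues` maps tissue name -> {gene: expression value}.
    """
    totals = {name: sum(expr.values()) for name, expr in tissues.items()}
    peak = max(totals.values())
    return {name: 100.0 * total / peak for name, total in totals.items()}


def top_contributors(expr, threshold=0.8):
    """Smallest set of top-expressed genes whose cumulative share reaches `threshold`.

    Returns (list of genes in descending order of expression, cumulative share).
    """
    total = sum(expr.values())
    if total <= 0:
        return [], 0.0
    picked, cumulative = [], 0.0
    for gene, value in sorted(expr.items(), key=lambda kv: kv[1], reverse=True):
        picked.append(gene)
        cumulative += value
        if cumulative / total >= threshold:
            break
    return picked, cumulative / total
```

Applied to a tissue where a few PIP and TIP genes dominate, `top_contributors` would return a short list, mirroring the pattern described for the sap and the flesh of young fruits.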

Discussion
Gene duplication, a prevalent phenomenon across the tree of life, has long been recognized as a key contributor to the evolution of genes with new functions [36–38]. Gene duplicates can arise from WGD as well as tandem, proximal, transposed, dispersed, and segmental duplications [39]. WGD, also known as polyploidy, which multiplies the whole genome content, has been proven to play an important role in the diversification of seed plants, angiosperms, and core eudicots [36,40]. Brassicales, an economically important order of flowering plants, represents 2.2% of the total extant core eudicot diversity and has been overlooked as a promising system to investigate patterns of disjunct distributions and diversification rates [41]. In addition to the γ event shared by all core eudicots [40], multiple independent WGDs have been described in Brassicales, e.g., one recent WGD in B. sinensis, the Th-α event within Cleomaceae, the At-α event at the base of Brassicaceae, and the At-β event near the base of the order [15,42–45]. Despite the occurrence of two recent WGDs, Arabidopsis harbors a relatively small genome of approximately 119.7 Mb and serves as a popular model species for research in many aspects of plant biology [16]. Soon after the first version of the Arabidopsis genome was released in 2000 [45], a genome-wide analysis was conducted to provide a systematic nomenclature for plant AQP genes [46]. The evolutionary analysis assigned 35 AtAQP genes to four subfamilies, i.e., PIP, TIP, NIP, and SIP, which were further divided into two to seven groups, i.e., PIP1-2, TIP1-5, NIP1-7, and SIP1-2 [46]. Surprisingly, a novel but ancient subfamily named XIP was later identified in moss, spikemoss, and a large number of eudicots [26–31,33]. Interestingly, this subfamily was also shown to be absent from other Brassicaceae plants [32], implying that its loss may be lineage-specific. However, whether it is present in other Brassicales families is largely unknown. The accessibility of several representative Brassicales genomes beyond Brassicaceae, i.e., papaya in Caricaceae, horseradish tree in Moringaceae, B. sinensis in Akaniaceae, and spider flower in Cleomaceae [14,43,44,47], provides a good opportunity to address this issue.
In the current study, genome-wide identification of AQP family genes was performed in papaya as well as seven other representative plant species, i.e., horseradish, B. sinensis, spider flower, A. lyrata, A. halleri, saltwater cress, and A. coerulea. Consistent with the absence of recent WGDs in papaya and horseradish [3,47], relatively small families of 29 and 28 AQP genes were identified in these two species, accounting for 0.13% and 0.14% of the total protein-coding genes, respectively. These family sizes are equal or comparable to the 29 (0.10%) and 31 (0.11%) members found in A. coerulea and physic nut, respectively, but smaller than the 35–55 (0.11–0.16%) members present in Arabidopsis, A. lyrata, A. halleri, saltwater cress, castor bean, cassava, rubber tree, B. sinensis, and poplar [29,30,33,35]. As in papaya and horseradish, comparative genomics analyses detected no additional WGD in either physic nut or castor bean after the γ event [48,49]. By contrast, the last common ancestor of cassava and rubber tree experienced one recent WGD [30,33,50], whereas poplar experienced one Salicaceae-specific p WGD [34]. Interestingly, the number of AQP genes was shown to be positively associated with the total number of genes in a genome, to which recent WGDs are the main contributors [30,34,44,50], though tandem duplication plays an important role in the family expansion in castor bean [35]. Despite the occurrence of one ancient tetraploidization event in A. coerulea [51,52], only two WGD repeats have been retained, owing to the long evolutionary time elapsed. For species without recent WGDs, i.e., papaya, horseradish, physic nut, and castor bean, six to eight WGD repeats were identified, though a higher E-value cutoff of 1 × 10⁻¹⁰ instead of 1 × 10⁻²⁰ was adopted in this study, which facilitates the detection of microsynteny and ancient WGD repeats [53]. By contrast, higher numbers of 11–26 WGD repeats were identified in poplar, cassava, rubber tree, B. sinensis, spider flower, and Brassicaceae species, corresponding to the occurrence of one to three recent WGDs in these species after the γ event [15,34,43,44,50]. In Arabidopsis, seven, three, and two repeats were shown to arise from the At-α, At-β, and γ events, respectively, most of which are conserved in the Brassicaceae species examined in this study, though A. halleri and saltwater cress have retained one more repeat from the At-α WGD (i.e., AhTIP2;1/-2;2 and EsTIP2;1/-2;2) and the ortholog of AtPIP2;8 in saltwater cress is under fractionation. As reported in other plant species [29,30,33,35], the divergence of duplicate repeats identified in papaya is more likely to be constrained by purifying selection, since their Ka/Ks ratios are below one. Notably, compared with papaya, species-specific gene loss was also observed in its close relative horseradish, i.e., orthologs of CpNIP8;1 and CpSIP3;1.
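The reading of the Ka/Ks ratio used above can be illustrated with a minimal sketch. It assumes that Ka and Ks have already been estimated for each duplicate pair; the function name `selection_mode` and the pair values are hypothetical and are not taken from this study.

```python
def selection_mode(ka, ks):
    """Classify a duplicate pair by its Ka/Ks ratio.

    ka: nonsynonymous substitution rate; ks: synonymous substitution rate.
    Returns the ratio and the conventional interpretation.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1:
        return ratio, "purifying"
    if ratio > 1:
        return ratio, "positive"
    return ratio, "neutral"


# Hypothetical values, for illustration only (not results of this study).
example_pairs = {"pairA": (0.12, 1.50), "pairB": (0.30, 0.25)}
for name, (ka, ks) in example_pairs.items():
    ratio, mode = selection_mode(ka, ks)
    print(f"{name}: Ka/Ks = {ratio:.2f} ({mode} selection)")
```

A ratio below one, as observed for the papaya repeats, is conventionally read as evidence of purifying selection; the sketch only encodes that convention and does not estimate the rates themselves.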
Despite the relatively small family size, CpAQP genes represent all five previously defined subfamilies (i.e., PIP, TIP, NIP, SIP, and XIP) in higher plants [26,29] and 29 out of the 61 OGs identified in this study, supporting their high diversity. The presence of at least two XIP genes in papaya, horseradish, and B. sinensis but none in spider flower indicates that the complete loss of the XIP subfamily in Arabidopsis is lineage-specific, occurring sometime after its split with papaya but before the Brassicaceae–Cleomaceae divergence. Moreover, based on the comparative analysis of 530 AQP genes identified in 14 representative species, a new nomenclature for subclassification was proposed in this study, which includes 19 groups, i.e., PIP1–2, TIP1–6, NIP1–8, and SIP1–3. In this nomenclature, the previously described AtNIP2;1 and -3;1, which were characterized as transposed and WGD repeats of AtNIP1;2 and -1;1, respectively, were assigned to the NIP1 group, whereas the proposed NIP3 group is consistent with that described in Malpighiales plants [29,30,33,35]. The updated NIP2 group is also in accordance with previous studies, in which it was characterized as a silicon transporter group widely present in monocots and eudicots [24,29,30,33,35,54,55]. The presence of this group in papaya, horseradish, B. sinensis, and spider flower supports the conjecture that its loss in Arabidopsis is Brassicaceae-specific, occurring sometime after the split with Cleomaceae. The previously described AtTIP2;2 and -2;3 were renamed AtTIP6;1 and -6;2, respectively, and belong to the novel but ancient TIP6 group, which diverged before the split of basal and core eudicots. Two other novel groups, i.e., NIP8 and SIP3, are widely present in the species examined in this study. The presence of SIP3 in spider flower implies that its loss in Brassicaceae is lineage-specific, occurring sometime after the split with Cleomaceae, whereas the absence of NIP8 from spider flower indicates that its loss in Brassicaceae species occurred sometime after the split with Caricaceae but before the split with Cleomaceae. Interestingly, species-specific loss of these two groups was also found in horseradish and cassava, whereas NIP8 is also absent from physic nut [29,30].
The updated subclassification is supported not only by evolutionary relationships but also by exon–intron structures and/or conserved motifs/residues. Subfamily PIP, which mainly functions in water transport at the cell membrane [19], includes only two evolutionary groups and features three introns and the F-H-T-R ar/R filter as observed in the pure water channel AqpZ [56]. Compared with PIP2s, PIP1s possess longer N-terminal but shorter C-terminal sequences, and include the group-specific Motif 17 at the extended N-terminus. By contrast, PIP2s feature one putative phosphorylation site at the extended C-terminus as observed in SoPIP2;1 [19]. Notably, both groups, especially PIP2, have expanded greatly in core eudicots via WGDs as well as tandem duplication, forming 10 and 14 OGs as identified in this study, respectively, which is in accordance with their importance in adaptation [17,18,20]. Compared with PIP, Subfamily TIP, which is also highly permeable to water at the vacuolar membrane [57], is more diverse, possessing six evolutionary groups/16 OGs and harboring the variable ar/R filter of H/N/S-I/V/F-A/G-R/V/C. A total of seven motifs were identified to be specific to TIPs, i.e., Motifs 8, 10, 11, 16, 19, 21, and 24, though Motif 24 was absent from TIP2, TIP5, TIP6, and two members of TIP1. Similar to TIP, Subfamily NIP is also highly diverse, including eight evolutionary groups and 13 OGs with the ar/R filter of W/A/T/G/S-V/I/S-A/G/S-R. Five motifs identified in this study were shown to be specific to NIPs, i.e., Motifs 13, 15, 18, 22, and 23, though Motifs 13 and 23 were absent from NIP5, NIP6, and NIP7. NIP5, which features the unusual NPS-NPV motifs, represents the unique group with three introns within the NIP subfamily. As for the two other subfamilies, Subfamily XIP features longer NPA spacing and two conserved C residues, whereas Subfamily SIP usually possesses a relatively short N-terminus, though both feature two introns and a few conserved motifs. An atypical NPL/T/S/C at the first NPA motif was found in all SIPs, which were shown to localize to the endoplasmic reticulum (ER) [58]. Among the three evolutionary groups present in Subfamily SIP, SIP3 differs from the other groups in having no introns, whereas SIP1 and SIP2 favor NPT and NPL, respectively. Although the ar/R filters of SIPs are variable and distinct from those of PIPs and TIPs, SIP1 members have been proven to transport water [58,59]. By contrast, plasma membrane-localized XIPs, which are widely present in plants, moss, fungi, and protozoan species [31], have been reported to transport glycerol, urea, boric acid, and H2O2 but not water [60–63]. The complete loss of Subfamily XIP in monocots as well as two families within Brassicales, i.e., Brassicaceae and Cleomaceae, may be due to functional redundancy with other AQP subfamilies, e.g., NIP, which was proven to transport water, glycerol, urea, arsenite, selenite, boric acid, lactic acid, silicic acid, NH3, and H2O2 [54,64–72]. Unlike other subfamilies, the expansion of Subfamily XIP was mainly a result of tandem duplication, and the subfamily is composed of three evolutionary groups with NPV/T/I and SPT/I/A/V variants at the first NPA motif. XIP1 differs from the other groups in having a single intron, implying its possible retrotransposition origin as described in Arabidopsis [46].
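The ar/R filter patterns quoted above for PIPs, TIPs, and NIPs can be read as position-wise sets of permitted residues. The following sketch, offered only as an illustration, encodes those three patterns exactly as written in the text and reports which of them a given four-residue filter satisfies; the names `AR_R_PATTERNS`, `parse_pattern`, and `matching_subfamilies` are hypothetical. Because the TIP and NIP patterns overlap, a filter alone may match more than one subfamily and is not sufficient for classification.

```python
# Patterns as quoted in the text: positions separated by "-",
# alternative residues at one position separated by "/".
AR_R_PATTERNS = {
    "PIP": "F-H-T-R",
    "TIP": "H/N/S-I/V/F-A/G-R/V/C",
    "NIP": "W/A/T/G/S-V/I/S-A/G/S-R",
}


def parse_pattern(pattern):
    """Turn 'H/N/S-I/V/F-A/G-R/V/C' into a list of residue sets."""
    return [set(position.split("/")) for position in pattern.split("-")]


def matching_subfamilies(filter_residues):
    """Return the subfamilies whose pattern admits the four-residue filter."""
    residues = filter_residues.upper()
    if len(residues) != 4:
        raise ValueError("an ar/R filter consists of four residues")
    matches = []
    for name, pattern in AR_R_PATTERNS.items():
        allowed = parse_pattern(pattern)
        if all(res in options for res, options in zip(residues, allowed)):
            matches.append(name)
    return matches


print(matching_subfamilies("FHTR"))  # the PIP/AqpZ-type filter
print(matching_subfamilies("SVAR"))  # admitted by both the TIP and NIP patterns
```

The second call shows why the filter is used alongside evolutionary relationships, exon–intron structures, and conserved motifs rather than on its own.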
Orthologs, which evolved from a common ancestral gene via speciation, usually retain the same functions in different species over the course of evolution [73]. The characterization of OGs and the gene expression profiling conducted in this study allow us to infer putative roles of AQP genes in papaya. In agreement with previous studies [29,30,35,74], AQP transcripts in all five samples (i.e., leaf, root, sap, the flesh of young fruits, and the flesh of mature fruits) examined in this study were mainly contributed by PIPs and TIPs, which mediate water transport at the plasma and vacuolar membranes, respectively [17,20,57]. Nevertheless, the expression patterns of different family members appear to be tissue-specific. For example, among the five tissues/stages examined in this study, CpAQP transcripts were most abundant in the flesh of young fruits, corresponding to rapid cell enlargement as well as high water content in early stages of fruit development [75–78]. Moreover, in the flesh of young fruits, most transcripts were contributed by CpTIP1;1, CpPIP2;4, and CpPIP1;1, whereas in the flesh of mature fruits they were contributed by CpTIP1;1, CpPIP2;4, CpTIP1;3, and CpPIP1;1. In the leaf, a more important role of CpPIP1;1 was observed, with most transcripts contributed by CpTIP1;1, CpPIP1;1, and CpPIP2;4, whereas in the root they were contributed by CpTIP1;1, CpPIP1;3, CpTIP6;1, and CpPIP2;2. By contrast, in the sap, most transcripts were contributed by two members, i.e., CpPIP2;3 and CpTIP2;1, which differs markedly from other tissues. It is well known that leaves are photosynthetic organs that regulate water loss through transpiration, roots regulate water and nutrient uptake, and the phloem sap is responsible for the movement, distribution, and trafficking of water, nutrients, photoassimilates, and other macromolecules [17,79,80]. A more important role of TIP2s than of TIP1s was also described in the root of the rubber tree [30]. Whereas TIP1s transport water and urea [57,81], TIP2s and TIP6s facilitate the transport of water and NH3 [23,82]. Compared with PIPs and TIPs, members of the three other subfamilies were less expressed, most of which belong to IIIa to IIIc identified in this study. Generally, NIPs and XIPs transport small solutes rather than water [60–72]. Interestingly, a recent study revealed a link between AQP expression and the texture of papaya fruit under different cultivation conditions. Compared with open field conditions, mesocarp cells and intercellular spaces of papaya fruit were larger when cultivated in raised beds, which was shown to correlate with higher expression of CpTIP2;1, CpTIP4;1, CpSIP1;1, and CpPIP1;3. Moreover, expression of CpTIP2;1/-4;1 and CpSIP1;1/CpPIP2;5 correlated with fruit crispness under open field and raised bed conditions, respectively [83].

Identification and Manual Curation of AQP Family Genes
A homology search was performed using tBLASTn with published AQP proteins as queries and an E-value cutoff of 1 × 10⁻⁵. Positive genomic sequences were predicted as previously described [35], and all gene models were further validated with mRNA when available. A homology search for nucleotides or expressed sequence tags (ESTs) was conducted using BLASTn, and read alignment of RNA sequencing (RNA-seq) data was carried out using Bowtie 2 [84]. The presence of the conserved MIP domain (Pfam accession number PF00230) in deduced peptides was confirmed using the Pfam search (v35.0, https://pfam.xfam.org/, accessed on 31 June 2023).
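The E-value filtering step can be sketched as follows, assuming the search results were exported in the standard 12-column BLAST tabular layout (query, subject, identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score). The export format, the function name `filter_hits`, and the example rows are assumptions for illustration; the text does not state how hits were parsed.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text for the tBLASTn search


def filter_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Keep (query, subject, evalue) for hits at or below the cutoff.

    Assumes 12-column tab-separated BLAST output, where column 11
    (index 10) holds the E-value.
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[10])
        if evalue <= cutoff:
            kept.append((row[0], row[1], evalue))
    return kept


# Hypothetical hits, for illustration only.
rows = [
    ["queryAQP1", "scaffold_1", "78.5", "250", "40", "2",
     "1", "250", "1000", "1750", "2e-30", "210"],
    ["queryAQP1", "scaffold_9", "31.0", "80", "50", "3",
     "10", "90", "500", "740", "0.003", "35"],
]
demo = "\n".join("\t".join(r) for r in rows)
hits = filter_hits(demo)
print(hits)
```

Only the first hypothetical hit passes the 1 × 10⁻⁵ threshold; candidates retained in this way would still require the manual curation and domain confirmation described above.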

Synteny Analysis and Gene Evolution Patterns
Homolog pairs within and between species were identified using the all-to-all BLASTP method with an E-value cutoff of 1 × 10⁻¹⁰, whereas syntenic blocks (BLAST hits ≥ 5) and gene collinearity were inferred using MCScanX as previously described [85]. WGD repeats were defined as homolog pairs located within syntenic blocks of duplicated chromosomes/scaffolds (Chrs/Scfs), whereas tandem repeats were defined as two paralogs that are consecutive in a genome. Transposed, proximal, and dispersed repeats were identified using the DupGen_finder pipeline as described before [39]. To estimate the evolutionary rate of duplicate pairs, Ka (nonsynonymous substitution rate) and Ks (synonymous substitution rate) were calculated using codeml in the PAML package [86]. Orthologs across different species were identified using the BRH (best reciprocal hit) method [87] together with information from synteny analysis, and orthogroups (OGs) were assigned only when they were present in at least two of the species tested.
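The best reciprocal hit criterion can be illustrated with a short sketch. It assumes that hits between two species have been reduced to (query, subject, bit score) tuples and that the best hit is the one with the highest bit score; the function names and gene identifiers are hypothetical, and the synteny information used in this study is not modeled.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits between species A and species B, for illustration only.
a_vs_b = [("A1", "B1", 400.0), ("A1", "B2", 150.0), ("A2", "B1", 300.0)]
b_vs_a = [("B1", "A1", 390.0), ("B1", "A2", 310.0), ("B2", "A1", 140.0)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, A2 has B1 as its best hit, but B1 prefers A1, so only the A1/B1 pair satisfies the reciprocal condition.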

Sequence Alignment, Evolutionary Analysis, and Classification
Multiple sequence alignment of full-length AQP proteins was performed using MUSCLE [88]. Unrooted trees were constructed using MEGA 6.0 [89] with the following parameters: maximum likelihood method, bootstrap of 1000 replicates, Jones-Taylor-Thornton (JTT) model, uniform rates, complete deletion of gaps, nearest-neighbor interchange (NNI), and automatic generation of the initial tree (Default-NJ/BioNJ). Except for the three novel groups identified in this study, the classification of AQPs into subfamilies and groups was performed as described before [35].

Gene Expression Analysis
Expression profiles of CpAQP genes were investigated based on Illumina RNA-seq samples as shown in Table S4. Leaves, roots, and phloem sap were collected from three-month-old plants of the Maradol Roja variety grown under greenhouse conditions. Immature and mature flesh, white and yellow in color, respectively, were obtained from fruits at the green (young) and color break (mature) stages. Quality control of raw RNA-seq reads was performed using Trimmomatic [91], and read mapping was carried out using Bowtie 2 [84]. The FPKM (fragments per kilobase of exon per million fragments mapped) method [92] was adopted for expression quantification, and RSEM (v1.2.27) [93] was used to determine differentially expressed genes. Unless specifically stated, the tools in this study were used with default parameters.
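The FPKM definition given above translates directly into a formula: the fragment count is divided by the exon length in kilobases and by the number of mapped fragments in millions. The sketch below implements only this arithmetic; the function name `fpkm` and the input values are hypothetical and do not reproduce the pipeline used in this study.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    fragments: fragments mapped to the gene
    exon_length_bp: summed exon length of the gene, in base pairs
    total_mapped_fragments: all mapped fragments in the sample
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 1000 bp of exon, 20 million mapped fragments.
value = fpkm(500, 1000, 20_000_000)
print(value)  # 25.0
```

Normalizing by both gene length and library size is what makes values comparable across genes and across the five samples examined.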

Conclusions
To our knowledge, this is the first genome-wide analysis of the AQP gene family in papaya. The relatively small family of 29 members was shown to be highly diverse, representing all five subfamilies, 22 evolutionary groups, and 29 out of the 61 OGs identified in this study. A further comprehensive comparison with AQP genes identified from 13 other representative plant species (totaling 530 AQP genes) provides insights into lineage-specific family evolution in Brassicales, including lineage-specific loss of the XIP subfamily and several evolutionary groups such as NIP2, NIP3, NIP8, and SIP3. Moreover, characterization of OGs, the ar/R filter, and gene expression profiles facilitates further functional studies of AQP genes in papaya and other species.

Figure 1. Chromosomal locations and duplication events of 29 CpAQP genes. Serial numbers are indicated at the top of each chromosome, and the scale is in Mb. Duplicate pairs identified in this study are connected using lines in different colors, i.e., tandem (purple), transposed (blue), dispersed (gold), and WGD (red). (AQP: aquaporin; Chr: chromosome; Cp: C. papaya; Mb: megabase; NIP: NOD26-like intrinsic protein; PIP: plasma intrinsic membrane protein; SIP: small basic intrinsic protein; TIP: tonoplast intrinsic protein; XIP: X intrinsic protein).


Figure 2. Evolutionary analysis of AQPs present in C. papaya, A. thaliana, R. communis, J. curcas, M. esculenta, H. brasiliensis, and P. trichocarpa. Sequence alignment was performed using MUSCLE, and the unrooted evolutionary tree was constructed using the bootstrap maximum likelihood tree (1000 replicates) method of MEGA 6.0. The distance scale denotes the number of amino acid substitutions per site, and the name of each subfamily/group is indicated next to the corresponding cluster. (AQP: aquaporin; At: A. thaliana; Cp: C. papaya; Hb: H. brasiliensis; Jc: J. curcas; Me: M. esculenta; NIP: NOD26-like intrinsic protein; PIP: plasma intrinsic membrane protein; Pt: P. trichocarpa; Rc: R. communis; SIP: small basic intrinsic protein; TIP: tonoplast intrinsic protein; XIP: X intrinsic protein).

Figure 5. Sequence and structural features of C. papaya, A. thaliana, and A. coerulea AQPs. (A) Shown is the unrooted evolutionary tree resulting from full-length AQPs with MEGA 6.0 (MUSCLE, maximum likelihood method, and bootstrap of 1000 replicates). The distance scale denotes the number of amino acid substitutions per site, and the name of each group is indicated next to the corresponding cluster. (B) Shown are the exon-intron structures displayed using GSDS 2.0. (C) Shown is the distribution of conserved motifs among AQPs, where different motifs are represented by different color blocks as indicated at the left of the figure and the same color block in different proteins indicates a certain motif. (D) Shown are the dual NPA motifs, the NPA spacing, and the ar/R selectivity filter identified in this study. (Ac: A. coerulea; AQP: aquaporin; ar/R: aromatic/arginine; At: A. thaliana; Cp: C. papaya; NIP: NOD26-like intrinsic protein; NPA: Asn-Pro-Ala; PIP: plasma intrinsic membrane protein; SIP: small basic intrinsic protein; TIP: tonoplast intrinsic protein; XIP: X intrinsic protein).

Figure 6. Expression patterns of CpAQP genes in various tissues and at different developmental stages of fruit. The color scale represents FPKM-normalized, log2-transformed counts, where blue indicates low expression and red indicates high expression. (AQP: aquaporin; Cp: C. papaya; FPKM: fragments per kilobase of exon per million fragments mapped; NIP: NOD26-like intrinsic protein; PIP: plasma intrinsic membrane protein; SIP: small basic intrinsic protein; TIP: tonoplast intrinsic protein; XIP: X intrinsic protein).


Plants 2023, 12, 3847