Genome-Wide Identification and Analysis of U-Box E3 Ubiquitin-Protein Ligase Gene Family in Banana

The U-box gene family encodes U-box domain-containing proteins. However, little is known about U-box genes in banana (Musa acuminata). In this study, 91 U-box genes were identified in banana based on its genome sequence. The banana U-box genes were distributed across all 11 chromosomes at different densities. Phylogenetic analysis of U-box genes from banana, Arabidopsis, and rice suggested that they can be clustered into seven subgroups (I–VII), and most banana U-box genes were more closely related to their rice counterparts than to those of Arabidopsis. Typical U-box domains were found in all identified MaU-box genes through the analysis of conserved motifs. Four conserved domains were found in the major banana U-box proteins. The MaU-box gene family had the highest expression in the roots and at the initial fruit developmental stage. The MaU-box genes exhibited a stronger response to drought than to salt and low temperature. To the best of our knowledge, this report is the first genome-wide identification and analysis of the U-box gene family in banana, and the results should provide valuable information for better understanding the function of U-box genes in banana.


Introduction
The ubiquitin/26S proteasome system (UPS) degrades ubiquitinated substrate proteins and is extensively involved in various cellular processes [1]. The UPS regulates diverse aspects of plant growth and development as well as the degradation of short-lived regulatory proteins [2][3][4]. The E1 Ub-activating enzyme, E2 Ub-conjugating enzyme, and E3 Ub ligase are necessary for ubiquitin activation and transfer [5]. First, E1 activates the ubiquitin molecule in an ATP-dependent manner, and then E2 facilitates the attachment of the ubiquitin molecule to the target protein in the presence of E3 [6]. E3 ligase plays an important role in protein ubiquitination because it identifies the target proteins for modification [7]. E3 ligase activity can be conferred by a single protein or by a protein complex [8,9]. Ubiquitin E3 ligases facilitate the covalent attachment of ubiquitin to target proteins in eukaryotes [10].
HECT, RING finger, and U-box domain proteins are the three types of single-protein E3 ligases [11]. U-box proteins are found in yeast, plants, and animals [12][13][14][15]. The U-box domain is composed of approximately 75 amino acids (aa) [16,17]. Many U-box proteins function as E3 ligases [18,19]. The genome of Arabidopsis thaliana contains more than 60 U-box genes, which have many functions in plants [16]. Previous work has identified functions of U-box E3 ligases in parsley, tomato, tobacco, and rice [20]. OsU-box gene 51 negatively regulates cell death signaling, as shown by a cell death assay [17]. The U-box E3 ligase NtACRE276 of tobacco may play a role in Cf9/Avr9-elicited defense [21]. On the basis of protein domains, the 125 Plant U-box (PUB) genes of soybean fall into eight groups [22]. Flowering time was altered in GmPUB8-overexpressing Arabidopsis, which flowered earlier under middle- and short-day conditions but later under long-day conditions [22]. Inactivation of Arabidopsis PUB13 also results in spontaneous cell death, enhanced levels of the defence hormone SA, and early flowering [23]. In grapevine, a PUB gene significantly regulates the accumulation of resistance proteins under both biotic and abiotic stresses [20].
Banana (Musa spp.) is one of the world's most important fruits [24,25]. The sequencing of the whole genome of banana (Musa acuminata) provides a good platform for the development of banana molecular biology [26]. Until now, the U-box gene family of banana has rarely been studied. U-box genes may play important roles in the growth and development of banana, so investigating this E3 gene family in banana is necessary. In this study, the U-box gene family was identified and analyzed across the whole banana genome. The conserved domain structure, subgroup classification, evolutionary relationship, intron and exon structure, gene expansion, chromosome mapping, and expression profiles were studied, providing a theoretical basis for the analysis of U-box gene functions.

Identification and Chromosomal Localization of U-Box Gene Family Members
In this study, 91 PUB genes were found in the banana genome (Table 1). The MaU-box proteins contain a conserved U-box domain of 60-70 aa. The length of the MaU-box coding sequences ranged from 660 bp (MaU-box69) to 6279 bp (MaU-box57), with an average of 1789 bp. The predicted protein products ranged from 219 to 2092 aa, with an average length of 595 aa. The relative molecular weight (MW) ranged from 23.38 kD to 223.93 kD, with an average of 64.85 kD. The isoelectric point (pI) ranged from 4.96 (MaU-box78) to 9.57 (MaU-box13). Subcellular localization analysis indicated that 93% of the MaU-box proteins were located in the nucleus and that only six were located in the cytoplasm (Table 1). These findings suggested that the vast majority of MaU-box proteins function in the nucleus.
A chromosomal localization map of the MaU-box genes was plotted (Figure 1). Ninety of the 91 MaU-box genes were located on chromosomes. Chromosome 3 contained the largest number, with 11 MaU-box genes, followed by chromosomes 4, 5, and 11, which each contained 10 MaU-box genes. Nine MaU-box genes each were located on chromosomes 7, 9, and 10; 7 MaU-box genes were found on chromosome 8; 6 on chromosome 1; 5 on chromosome 2; and only 4 on chromosome 11.
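The reported distribution can be cross-checked with a short tally. The sketch below is only an arithmetic check: it groups chromosomes by the gene count given in the text (one chromosome with 11 genes, three with 10, and so on; the variable names are ours) and confirms that the counts account for 90 anchored genes, leaving one of the 91 genes unanchored.

```python
# (genes per chromosome, number of chromosomes reported with that count)
reported_counts = [(11, 1), (10, 3), (9, 3), (7, 1), (6, 1), (5, 1), (4, 1)]

TOTAL_GENES = 91  # MaU-box genes identified in this study

# Genes placed on chromosomes, summed over all count groups
anchored_genes = sum(genes * n_chrom for genes, n_chrom in reported_counts)
# Number of chromosome entries covered by the reported counts
chromosomes = sum(n_chrom for _, n_chrom in reported_counts)
# Genes that could not be placed on a chromosome
unanchored_genes = TOTAL_GENES - anchored_genes

print(anchored_genes, chromosomes, unanchored_genes)
```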

Gene Structure and Phylogenetic Analysis of U-Box Gene Family Members
By comparing the full-length cDNA sequence with the corresponding genomic DNA sequence, the exon-intron structure of each MaU-box gene was determined. The number of exons in the MaU-box genes ranged from 1 to 18 (Figure 2). To study the evolutionary relationship of banana U-box proteins, a neighbor-joining (NJ) tree was constructed with U-box proteins from banana, rice, and Arabidopsis (Figure 3). The U-box aa sequences of 91 proteins from banana, 61 from Arabidopsis, and 77 from rice were used. Phylogenetic analysis showed that the identified banana U-box proteins, together with those of Arabidopsis and rice, were clearly divided into seven subgroups. Subgroups I, II, III, IV, V, VI, and VII contained 8, 2, 10, 8, 26, 32, and 5 banana gene family members, respectively. In general, the banana U-box proteins were more closely related to those of rice than to those of Arabidopsis. Interestingly, MaU-box genes with similar gene structures clustered together. For example, MaU-box51/65/84/91 of subgroup I each contain 11 exons, MaU-box66/87 of subgroup II each contain 18 exons, and MaU-box4/19/20/39/45/54 of subgroup III each contain 1 exon.

Analysis of MaU-Box Gene Family Conserved Motifs
To investigate the structural diversity and predict the function of MaU-box proteins, 20 conserved motifs in banana U-box proteins were identified using the MEME motif search tool and annotated using SMART tools (Figures 4 and 5). Among the 91 U-box genes, 45 (50%) contained U-box conserved motifs without ARM motifs, 22 (24%) contained ARM conserved motifs without U-box motifs, and 24 (26%) contained both U-box and ARM conserved motifs.
Figure 9 shows that 60 MaU-box genes responded to drought, salt, and low-temperature stress. Among these stressors, drought elicited the strongest response from the MaU-box gene family. A total of 55 MaU-box genes exhibited their highest expression under drought, and 45 genes were upregulated by more than tenfold. Expression of the MaU-box gene family peaked at 24 h, when 54 genes exhibited their highest expression. Salt stress also induced the MaU-box gene family: four genes (MaU-box63/65/71/78) reached their highest expression under this stressor, and two genes (MaU-box63/65) were upregulated by more than tenfold (p < 0.05).

Discussion
The characteristics and functions of the U-box gene family have been studied in several plants [27,28]. In the present study, systematic phylogenetic analyses were conducted to obtain a detailed classification and nomenclature of the banana U-box genes. We found 91 PUB (Plant U-box) genes in the banana genome. Similarly, 61 U-box proteins of Arabidopsis [12] and 77 U-box-containing proteins of rice have been identified and analyzed [17]. In total, 125 soybean PUB (GmPUB) genes, which encode proteins containing the U-box domain, have been identified [22]. The distribution of U-box proteins among species of different kingdoms is uneven [17]. Our data showed that the banana U-box genes were distributed across all 11 chromosomes at different densities. Phylogenetic analysis of the U-box proteins from banana, Arabidopsis, and rice suggested that they could be clustered into seven subgroups (I–VII). A similar study in soybean found that 125 GmPUB proteins were classified into six groups by phylogenetic analysis [22]. In this study, most banana U-box proteins showed a closer phylogenetic distance to their putative banana homologs than to their corresponding putative rice and Arabidopsis orthologs. Moreover, the banana U-box proteins were more closely related to those of rice than to those of Arabidopsis. Interestingly, the banana proteins MaU-box56, MaU-box78, MaU-box83, and MaU-box84 showed a closer phylogenetic relationship to rice proteins than to their banana paralogs, suggesting that these banana proteins and their corresponding rice orthologs evolved from a common ancestor before the speciation of the two species [17]. In the present study, conserved motif analysis showed that all identified MaU-box proteins had typical U-box domains. Generally, a protein-protein interaction domain in E3 ubiquitin ligases interacts with their substrates for ubiquitination [29], and a complete U-box domain was found in all PUB proteins [30][31][32].
The proteins that contained conserved motifs had low sequence similarity, suggesting that mutations accumulated during evolution [22]. The U-box domains in banana are found in combination with a variety of other domains, including armadillo (ARM) repeats, WD40 repeats, and the tetratricopeptide repeat (TPR) domain. ARM repeats have mostly been shown to mediate the interaction with substrates, which renders the substrates available for ubiquitination [23]. Thus, the banana U-box proteins without ARM repeats might interact with their substrates differently from those containing ARM repeats. The MaU-box gene family was differentially expressed in various tissues of banana. Similarly, several AtPUB-ARM genes were widely expressed in different tissues [33]. The MaU-box gene family had the highest expression in the roots. In a previous study, 12 MaE2 genes had the highest expression levels in roots [24]. These results suggested that MaU-box genes might be involved in the formation of the root system. PUB proteins play important roles in regulating plant growth and development [34]. In the present study, 29 MaU-box genes showed their highest expression at the first stage of fruit development (day 25), which could be explained by the fact that highly expressed genes usually play important roles in plant development [35], suggesting ubiquitination activation during the first stages of fruit development. Of note, the expression of eight MaE2 genes [23] and three MaU-box genes decreased gradually with prolonged developmental time. In strawberry fruit, the expression of all the genes decreased gradually after the flowering stage [36]. These data indicated that some genes (e.g., the MaU-box genes) might play important roles in the growth and development of fruits. Studies have shown that U-box proteins are involved in the response to various environmental stresses [9,20,37].
The U-box protein gene responded quickly to both biotic and abiotic stress and significantly influenced the accumulation of resistance-related proteins in grapevine [20]. Silencing of the tomato U-box E3 ligase ACRE74 led to the breakdown of Cf9-specified resistance against Cladosporium fulvum leaf mold [20]. The U-box genes of rice might be involved in defense against diseases [17]. A previous study observed differential expression patterns in nine soybean genes under drought stress [22]. In the present study, the MaU-box genes exhibited a stronger response to drought than to salt and low temperature. Under drought stress, 45 MaU-box genes were upregulated by more than tenfold. OsPUB57 showed stronger expression only in resistant plants carrying the Pi9 resistance gene [17]. Consistent with that finding, the MaU-box84 and MaU-box91 genes, which showed a closer phylogenetic distance to OsPUB57, were highly expressed under stress in our study. These results indicated that PUB genes might have key functions in the response of plants to drought stress.

Plant Materials and Treatment
The test material, "Brazil" banana, was obtained from the banana plantation of the National Banana Industry Technical System of Zhanjiang Comprehensive Test Station, South Subtropical Crops Institute, Chinese Academy of Tropical Agricultural Sciences (Zhanjiang, Guangdong, China). Different organs (roots, stems, leaves, female flowers, and male flowers) were collected to study the temporal and spatial expression patterns in banana. Fruits were collected at different developmental stages (25, 45, 65, and 85 days after flowering) to study fruit development. Healthy, uniform banana seedlings with four leaves were selected for the stress experiments. The seedlings were treated with 20% PEG 6000 (drought stress treatment) or 200 mM NaCl (salt stress treatment) and harvested at different time points (1, 6, and 24 h) after treatment [23]. The experiments were performed in triplicate.
All samples were frozen in liquid nitrogen and stored at −80 °C for the purpose of extracting RNA for expression analysis.

Genome Identification of Banana U-Box Gene Family Members
To identify the potential members of the banana U-box protein family, published Arabidopsis thaliana and rice U-box protein sequences were used as seed sequences in BLASTP searches of the banana genome database (Banana Genome Hub, available online: http://banana-genome.cirad.fr/content/download-dh-pahang) and the Phytozome database (available online: http://www.phytozome.net/). All candidate U-box genes were further verified using the SMART conserved domain search tool (available online: http://smart.embl-heidelberg.de/); redundant sequences and genes without the U-box domain were removed. Information on the MaU-box genes, including chromosomal location, DNA sequence, CDS sequence, and aa length, was obtained from Phytozome 12 (available online: https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Gmax). The MW and theoretical pI of the candidate MaU-box proteins were predicted using the ProtParam tool (available online: http://web.expasy.org/protparam/) on the ExPASy server (available online: http://expasy.org/tools/). The subcellular localization of the banana U-box proteins was predicted using the online software Plant-mPLoc (available online: http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/#). Finally, chromosome mapping was performed using the MapInspect tool according to the positions of the U-box genes on the chromosomes. For convenience, the MaU-box genes were numbered MaU-box1 to MaU-box91 according to their order on chromosomes 1 to 11. The structure of each gene was visualized using the Gene Structure Display Server (available online: http://gsds.cbi.pku.edu.cn/).
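The screening steps described above (homology search, removal of redundant hits, and domain verification) can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: it assumes BLASTP results saved in the standard 12-column tabular format, an E-value cutoff of 1e-5 (the text does not state one), and a hypothetical dictionary of domain annotations standing in for SMART output. All gene and query identifiers are placeholders.

```python
def candidate_ids(blast_tabular, evalue_cutoff=1e-5):
    """Return unique subject IDs, in first-seen order, whose hit passes the cutoff.

    Expects BLAST 12-column tabular text: column 2 is the subject ID and
    column 11 is the E-value. The cutoff is an assumed value.
    """
    unique_hits = []
    for line in blast_tabular.strip().splitlines():
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= evalue_cutoff and subject_id not in unique_hits:
            unique_hits.append(subject_id)
    return unique_hits


def keep_ubox(candidates, domains_by_gene):
    """Keep only candidates annotated with a U-box domain."""
    return [gene for gene in candidates if "U-box" in domains_by_gene.get(gene, ())]


# Hypothetical BLASTP hits (placeholder identifiers and scores)
rows = [
    ("At_seed_1", "banana_gene_A", "62.1", "70", "26", "0", "1", "70", "5", "74", "1e-30", "110"),
    ("Os_seed_1", "banana_gene_A", "58.6", "70", "29", "0", "1", "70", "5", "74", "3e-25", "98"),
    ("At_seed_1", "banana_gene_B", "41.0", "68", "40", "1", "2", "69", "9", "76", "2e-12", "61"),
    ("At_seed_1", "banana_gene_C", "30.2", "43", "30", "0", "5", "47", "3", "45", "0.5", "25"),
]
blast_text = "\n".join("\t".join(row) for row in rows)

# Hypothetical domain annotations standing in for SMART results
domains = {"banana_gene_A": {"U-box", "ARM"}, "banana_gene_B": {"WD40"}}

candidates = candidate_ids(blast_text)
ubox_genes = keep_ubox(candidates, domains)
print(candidates, ubox_genes)
```

In this toy input, the duplicate hit to the same gene is collapsed, the weak hit is dropped by the cutoff, and the candidate lacking a U-box annotation is removed at the domain step.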

MaU-Box Protein Conserved Motif and Phylogenetic Analysis
The conserved protein motifs of the MaU-box gene family were analyzed using MEME Suite 4.11.4 (available online: http://meme.nbcr.net/meme/). The maximum number of motifs was set to 20, and the motif width was set to 6 to 200 aa.
To understand the evolutionary relationship of the U-box genes, we used Clustal X version 1.83 (Illkirch, France) with default parameters to align the sequences of the Arabidopsis thaliana, rice, and banana U-box gene family members. The phylogenetic tree was constructed from the alignment using MEGA 6.0 software (State College, PA, USA). The parameters were set as follows: NJ method, Poisson correction, pairwise deletion, and bootstrap with 1000 replicates.
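To make the distance settings concrete, the sketch below computes a Poisson-corrected distance, d = −ln(1 − p), where p is the proportion of differing sites, and applies pairwise deletion by skipping alignment columns that contain a gap in either sequence of the pair. It is an illustration of these two settings only; it does not reproduce the MEGA implementation, the NJ clustering, or the bootstrap, and the three aligned sequences are invented toy data.

```python
import math
from itertools import combinations


def poisson_distance(seq_a, seq_b, gap="-"):
    """Poisson-corrected distance between two aligned protein sequences.

    Pairwise deletion: a column is ignored for this pair only if either
    sequence has a gap there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # pairwise deletion of gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no shared ungapped sites")
    p = differing / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy alignment (invented sequences, not real U-box domains)
alignment = {
    "seq1": "MKT-LLVA",
    "seq2": "MKTALIVA",
    "seq3": "MRT-LIV-",
}

# Pairwise distance matrix of the kind an NJ method takes as input
distances = {
    (name_a, name_b): poisson_distance(alignment[name_a], alignment[name_b])
    for name_a, name_b in combinations(alignment, 2)
}
for pair, d in distances.items():
    print(pair, round(d, 4))
```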

Gene Expression Analysis
The MaActin fragment of banana was selected as the internal reference, and the primers were designed according to the registered sequence. All the MaU-box gene-specific primers were designed according to the coding sequences using Primer5 software (PREMIER Biosoft International, Palo Alto, CA, USA) and checked using BLAST at NCBI (available online: https://www.ncbi.nlm.nih.gov/). The relative expression level of each U-box gene was calculated using the $2^{-\Delta\Delta Ct}$ method.
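The relative expression calculation can be written out as a short function. The sketch below follows the standard form of the method, in which the target gene Ct is first normalized to the reference gene (here MaActin) within each sample and the treated sample is then compared with the control. The Ct values are hypothetical and chosen only to illustrate the arithmetic; averaging over the triplicates is omitted.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method."""
    # Normalize the target gene to the reference gene within each sample
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for one MaU-box gene and the MaActin reference
fold_change = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,  # stressed seedling
    ct_target_control=26.0, ct_ref_control=18.0,  # untreated control
)
print(fold_change)
```

With these example values the normalized Ct drops by four cycles under stress, which corresponds to a 16-fold increase and would count as upregulation by more than tenfold.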

Conclusions
Ninety-one U-box genes of the banana genome were classified into seven subgroups. Typical U-box domains were found in all identified MaU-box proteins. The MaU-box gene family had the highest expression in the roots, and the strongest expression was found at the first developmental stage. The MaU-box genes exhibited a stronger response to drought than to salt and low temperature. The results of this study provide information on the evolution and functions of the MaU-box genes.