Article

Genome-Scale Computational Identification and Characterization of UTR Introns in Atalantia buxifolia

1 College of Horticulture, Shanxi Agricultural University, Jinzhong 030801, China
2 College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, China
* Authors to whom correspondence should be addressed.
Horticulturae 2021, 7(12), 556; https://doi.org/10.3390/horticulturae7120556
Submission received: 31 October 2021 / Revised: 29 November 2021 / Accepted: 4 December 2021 / Published: 7 December 2021
(This article belongs to the Section Genetics, Genomics, Breeding, and Biotechnology (G2B2))

Abstract

Accumulated evidence has shown that CDS introns (CIs) play important roles in regulating gene expression. However, research on UTR introns (UIs) is limited. In this study, UIs, including 5′UTR introns (5UIs) and 3′UTR introns (3UIs), were identified from the Atalantia buxifolia genome. The length and nucleotide distribution characteristics of both 5UIs and 3UIs and the distributions of cis-acting elements and transcription factor binding sites (TFBSs) in 5UIs were investigated. Moreover, PageMan enrichment analysis was applied to infer the possible roles of transcripts containing UIs (UI-Ts). In total, 1077 5UIs and 866 3UIs were identified from 897 5UI-Ts and 670 3UI-Ts, respectively. Among them, 765 (85.28%) 5UI-Ts and 527 (78.66%) 3UI-Ts contained only one UI, and 94 (6.38%) UI-Ts contained both a 5UI and a 3UI. The UI density was lower than that of CIs, but the mean and median UI sizes were ~2 times those of CIs. The A. buxifolia 5UIs were rich in gene-expression-enhancement-related elements and contained many TFBSs for BBR-BPC, MIKC_MADS, AP2 and Dof TFs, indicating that 5UIs play a role in regulating or enhancing the expression of downstream genes. Enrichment analysis revealed that UI-Ts involved in the ‘not assigned’ and ‘RNA’ pathways were significantly enriched. Notably, 119 (85.61%) of the 3UI-Ts were genes encoding pentatricopeptide (PPR) repeat-containing proteins. These results will be helpful for the future study of the regulatory roles of UIs in A. buxifolia.

1. Introduction

Introns, the genomic sequences removed from the corresponding RNA transcripts, have been intensely studied since their discovery [1,2]. They can generally be divided into three groups (Groups I–III). Group I and II introns are self-splicing and are widely found in some bacterial and organellar genomes [3,4], whereas group III introns are spliceosomal introns found mainly in the nuclear genomes of eukaryotes, and their excision is spliceosome-dependent [5]. Accumulated evidence has shown that the presence of introns and the behavior of spliceosomes affect almost every step of gene expression [6,7]. Some introns boost gene expression [8,9,10,11], an effect called intron-mediated enhancement (IME). The addition of the alcohol dehydrogenase-1 (Adh1) first intron increased the expression of a maize chimeric chloramphenicol acetyltransferase (CAT) gene by 100-fold [12]. The Shrunken-1 (Sh1) intron 1 could enhance chimeric gene expression by approximately 100-fold, and the combined Sh1 first exon and intron 1 could enhance reporter gene expression by more than 1000-fold [13]. In transgenic tobacco, the petunia small subunit of ribulose bisphosphate carboxylase (rbcS) gene was expressed about five-fold higher from its gDNA than from its cDNA [14]. The first intron of the Arabidopsis elongation factor 1 beta gene (AteEF-1β) was shown to be required for the gene’s high expression owing to an enhancer-like element in this coding sequence intron (CI) [15].
In addition to the CDS, introns are also located within the untranslated regions (UTRs) of a gene [16]. Much evidence has shown that 5UIs function in regulating gene expression [17,18,19,20,21]. The 5UI of the soybean polyubiquitin (Gmubi) gene appears to contain regulatory elements that increase promoter activity and is closely associated with the gene’s high expression [22]. The 5UI of Arabidopsis glutamate:glyoxylate aminotransferase 1 (GGT1) has been shown to contribute to maximum transcript abundance [10,23]. The addition of the Gladiolus polyubiquitin (GUBQ1) 5UI to its promoter enhanced GUS expression in transgenic Gladiolus and Arabidopsis plants compared with transgenic plants carrying the promoter alone [24]. The sesame FAD2 5UI enhanced GUS expression 100-fold compared with 5UI-less controls in transgenic Arabidopsis tissues [25].
Transcripts containing a 3UI were generally considered non-functional because they could trigger mRNA degradation by nonsense-mediated decay (NMD) [26,27,28,29,30]. However, numerous studies have verified that 3UIs also take part in normal modulation of gene expression [29,31,32,33,34,35]. For example, a retained 3′UTR intron in the yeast HAC1 transcript could block translation of its mRNA [32]. Moreover, many functional transcripts containing a 3UI showed either boosted expression or differential exon usage upon NMD inhibition [36,37].
Over the past decade, the genome-wide structural and sequence characteristics of UIs have been described in several species. In the genomes of A. thaliana, Drosophila melanogaster, human and mouse, CIs and UIs were identified, and the 5UI sizes were roughly twice those of CIs and 3UIs [38]. In addition, 5UIs could significantly enhance gene expression, and intron length greatly influenced the gene expression level in Arabidopsis [39]. Roy et al. [40] analyzed the evolutionary conservation of UTR splicing in Cryptococcus neoformans and found that the splicing boundaries in 5′UTRs were more conserved than those in 3′UTRs. Cenik et al. [41] discovered that human genes with regulatory roles were surprisingly rich in 5UIs. Shi et al. [42] analyzed the characteristics of UIs and revealed their differential expression in different organs of sweet orange. These findings provide information on the evolution and functional significance of UIs.
Chinese box orange (Atalantia buxifolia, formerly named Severinia buxifolia) is an evergreen citrus plant native to China and some other Asian countries [43]. Its roots and branches are rich in various bioactive compounds [44,45]. Its genome has been published with accurate UTR information (http://citrus.hzau.edu.cn/, accessed on 1 March 2020), which facilitated the genome-wide identification of UIs. In this study, based on the A. buxifolia genome data, we identified and characterized the introns, including 5UIs, 3UIs and CIs, in A. buxifolia. Furthermore, transcripts containing UIs were analyzed bioinformatically to infer their possible functions. The results obtained in this study will be helpful for understanding the regulatory roles of UIs in gene expression in A. buxifolia.

2. Materials and Methods

2.1. Data Preparation

The A. buxifolia genome data file was downloaded from the Orange Annotation Project (http://citrus.hzau.edu.cn/orange/download/index.php/, accessed on 1 March 2020) [46]. All the A. buxifolia CDS sequences were submitted to Mercator v.3.6 (https://plabipd.de/portal/mercator-sequence-annotation, accessed on 2 March 2020) to obtain the mapping file used for MapMan analysis.

2.2. Genome-Wide Identification of A. buxifolia UIs

Based on the annotation information of the A. buxifolia genome, introns in CDSs, 5′UTRs and 3′UTRs were extracted separately. The 5UIs and 3UIs were then identified from the A. buxifolia genome according to the method described by Shi et al. [42]. Introns between UTR exons were extracted according to the genome annotation file, and introns that were retained in the exon region of any transcript were excluded to ensure that the identified UIs lie strictly within intron regions. Information on the identified 5UIs and 3UIs is provided in Additional file Tables S1 and S2, respectively. Statistical analyses of UI density, position preference, length and nucleotide composition were performed using Perl, and figures were drawn using ggplot2 [42].
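The extraction logic described above can be illustrated with a minimal Python sketch. It is not the Perl pipeline used in this study: the coordinate convention (1-based, inclusive), the function names and the toy transcript are all assumptions made for illustration, and the retention filter is interpreted here as discarding any intron that overlaps an exon of another transcript.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive (start, end) coordinates


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons of one transcript."""
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if next_start - prev_end > 1
    ]


def classify_intron(intron: Interval, cds_start: int, cds_end: int,
                    strand: str = "+") -> str:
    """Label an intron '5UI', 'CI' or '3UI'.

    cds_start and cds_end are the lowest and highest CDS coordinates,
    regardless of strand.
    """
    start, end = intron
    if end < cds_start:
        label = "5UI"
    elif start > cds_end:
        label = "3UI"
    else:
        return "CI"
    # On the minus strand, lower coordinates lie in the 3'UTR, so swap labels.
    if strand == "-":
        label = "3UI" if label == "5UI" else "5UI"
    return label


def overlaps_any_exon(intron: Interval, exons: List[Interval]) -> bool:
    """True if the intron overlaps an exon (used as the retention filter)."""
    start, end = intron
    return any(e_start <= end and start <= e_end for e_start, e_end in exons)


# Hypothetical transcript with four exons and a CDS spanning 250..450.
exons = [(1, 100), (201, 300), (401, 500), (601, 700)]
introns = introns_from_exons(exons)
labels = [classify_intron(i, 250, 450) for i in introns]

# Exons of a second, hypothetical transcript that retains part of intron 1.
other_exons = [(1, 100), (150, 300)]
kept = [i for i in introns if not overlaps_any_exon(i, other_exons)]
```

In this toy case the three introns are labeled 5UI, CI and 3UI, and the first intron is dropped by the retention filter because an exon of the second transcript covers part of it.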

2.3. Gene Pathway-Enrichment Analysis of UI-Ts

To annotate the A. buxifolia genes containing UIs, we conducted PageMan pathway-enrichment analysis for all the UI-Ts, the 5UI-Ts and the 3UI-Ts separately. Briefly, the genes containing a 5UI and/or a 3UI were first subjected to MapMan analysis based on the abovementioned A. buxifolia mapping file. Then, pathway-enrichment analysis was performed using PageMan embedded in MapMan [47]. After applying the Benjamini and Hochberg adjustment, pathways with a corrected p value < 0.05 were considered significantly enriched in UI-Ts.
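For readers unfamiliar with the Benjamini and Hochberg adjustment, the standard step-up procedure can be sketched as follows. This is a generic illustration and not PageMan's internal code; the function name and the example p values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values never increase as the raw p values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four pathways.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adj) if p < 0.05]
```

Each raw p value is multiplied by the number of tests divided by its rank, and a running minimum from the largest rank downward keeps the adjusted values monotone.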

2.4. Cis-Acting Element and Transcription Factor Binding Sites (TFBS) Prediction Analysis of 5UI Sequences

PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/, accessed on 27 March 2020) was applied to predict the possible cis-acting elements in the sequences of the 1077 A. buxifolia 5UIs [48]. The TFBSs in the 5UI sequences were categorized and matched against the Citrus sinensis TF library using PlantTFDB (http://plantregmap.cbi.pku.edu.cn/, accessed on 27 March 2020). The thresholds used for TFBS prediction were q-value ≤ 0.05 and p-value ≤ 10⁻⁷.
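The two thresholds can be applied to a table of predicted binding sites as in the sketch below. The record layout (keys `family`, `p_value` and `q_value`) and the example hits are hypothetical and do not reflect the actual PlantTFDB output format.

```python
from collections import Counter
from typing import Dict, List

Hit = Dict[str, object]  # hypothetical record: family, p_value, q_value


def filter_hits(hits: List[Hit], max_q: float = 0.05,
                max_p: float = 1e-7) -> List[Hit]:
    """Keep predictions that pass both the q-value and p-value cutoffs."""
    return [h for h in hits
            if h["q_value"] <= max_q and h["p_value"] <= max_p]


def family_shares(hits: List[Hit]) -> Dict[str, float]:
    """Percentage of retained binding sites contributed by each TF family."""
    counts = Counter(h["family"] for h in hits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {fam: round(100 * n / total, 2) for fam, n in counts.items()}


# Made-up predictions for illustration only.
hits = [
    {"family": "BBR-BPC", "p_value": 2e-8, "q_value": 0.01},
    {"family": "BBR-BPC", "p_value": 5e-9, "q_value": 0.02},
    {"family": "Dof", "p_value": 8e-8, "q_value": 0.04},
    {"family": "AP2", "p_value": 3e-6, "q_value": 0.03},   # fails p cutoff
    {"family": "Dof", "p_value": 1e-8, "q_value": 0.20},   # fails q cutoff
    {"family": "MIKC_MADS", "p_value": 1e-9, "q_value": 0.001},
]
passed = filter_hits(hits)
shares = family_shares(passed)
```

Tallying the retained sites by family gives the kind of percentage breakdown reported for BBR-BPC, MIKC_MADS, AP2 and Dof in Section 3.4.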

3. Results

3.1. Identification of Introns in A. buxifolia CDSs and 5′ and 3′ Untranslated Regions (UTRs)

In total, we identified 16,218 5′UTRs, 16,337 3′UTRs and 28,412 CDSs from the A. buxifolia genome. Of these, 597 5′UTRs (3.68% of 5′UTRs), 452 3′UTRs (2.77% of 3′UTRs) and 21,005 CDSs (73.93% of CDSs) were found to contain introns. The intron-harboring ratios for 5′UTRs and 3′UTRs were much lower than that for CDSs, which may well explain why UIs are often overlooked. After normalizing the intron density to the average number of introns per nucleotide of each gene transcript sequence, we found that the intron density followed the order CDS > 5′UTR > 3′UTR (3.11 × 10⁻³, 8.42 × 10⁻⁵ and 3.08 × 10⁻⁵, respectively) (Table 1). The 5UI density was ~2.7 times that of 3UIs but only ~2.7% of that of CIs.
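The percentages and density ratios quoted in this paragraph follow directly from the reported counts and densities. The short Python check below reproduces them; the variable and function names are illustrative only, and the density values are taken as reported rather than recomputed from the annotation.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` made up by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of 5'UTRs, 3'UTRs and CDSs that contain at least one intron.
harboring = {
    "5UTR": percent(597, 16218),
    "3UTR": percent(452, 16337),
    "CDS": percent(21005, 28412),
}

# Intron densities (introns per nucleotide) as reported in Table 1.
density = {"CDS": 3.11e-3, "5UTR": 8.42e-5, "3UTR": 3.08e-5}

fold_5ui_vs_3ui = round(density["5UTR"] / density["3UTR"], 1)
pct_5ui_of_ci = round(100 * density["5UTR"] / density["CDS"], 1)
```

Both derived values round to 2.7: the 5UI density is about 2.7-fold the 3UI density and about 2.7% of the CI density.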
In total, 1077 5UIs and 866 3UIs were identified from 897 5UI-containing gene transcripts (corresponding to 597 genes) and 670 3UI-containing gene transcripts (452 genes), respectively. Among these UI-containing gene transcripts (UI-Ts), 94 transcripts (91 genes) contained both a 5UI and a 3UI. Approximately 85.28% of the 5UI-Ts and 78.66% of the 3UI-Ts contained only one 5UI or 3UI (Table 2). The proportion of UI-Ts containing two or more UIs dropped dramatically. Only 11.82% of the 5UI-Ts contained two 5UIs, and 13.43% of the 3UI-Ts contained two 3UIs (Table 2). UI-Ts containing more than two 5UIs or 3UIs accounted for about 2.98% and 5.23%, respectively (Table 2). Some UI-Ts were found to contain several UIs (Table 2). For example, one cysteine-rich receptor-like protein kinase 8 gene (CRK8, sb36468.1) had eight 5UIs; an unknown gene transcript (sb14804.1) had seven 5UIs; one PPRP gene (sb26751.1) had six 5UIs; a PPRP gene (sb25651.1) and an LRR and NB-ARC domains-containing disease resistance protein gene (sb31448.1) had six 3UIs; an unknown gene transcript (sb31747.1) contained five 5UIs and five 3UIs; an AP2/ERF and B3 domain-containing transcription factor (sb27690.1) and a basic helix-loop-helix (bHLH) DNA-binding superfamily protein (sb28046.1) both had five 5UIs; a bromodomain transcription factor (sb13718.1), an extra-large G-protein 1 (sb31879.1) and an unknown gene transcript (sb10627.1) had five 3UIs; a TEOSINTE BRANCHED 1, cycloidea and PCF transcription factor 2 (sb19593.1), a leucine-rich repeat protein kinase family protein (sb20478.1), a putative clathrin assembly protein (sb12332.1) and an unknown gene (sb28722.1) had four 5UIs; and three PPRP genes (sb17114.1, sb26143.1 and sb25688.1) and a Class II aminoacyl-tRNA and biotin synthetases superfamily protein gene (SYNC3, sb31386.1) had four 3UIs.

3.2. Intron Sizes and Distributions within UTRs and CDSs

The introns within 5′UTRs, CDSs and 3′UTRs of A. buxifolia varied greatly in number and length (Figure 1). The length distributions of A. buxifolia 5UIs and 3UIs were similar to each other (5UIs: n = 1077; mean, median, lower quartile (LQ), upper quartile (UQ) and SD of 823.74, 409, 165, 851 and 2599.52 nucleotides, respectively; 3UIs: n = 866; mean, median, LQ, UQ and SD of 838.85, 452, 141, 932.5 and 1823.39 nucleotides, respectively). Their mean and median intron sizes were ~2 times those of the CIs (n = 319,907, mean = 430.4 nucleotides, median = 175 nucleotides, LQ = 102 nucleotides, UQ = 464 nucleotides and SD = 1698.36 nucleotides). The frequency of CIs with lengths of 100 to 300 nucleotides was significantly higher than that of 5UIs and 3UIs. However, the relative frequencies of short introns (<50 nucleotides) and long introns (>300 nucleotides) were higher among 3UIs and 5UIs than among CIs. Similar to sweet orange [42], the A. buxifolia 5UIs and 3UIs were preferentially located at the ends of 5′UTRs and at the beginning of 3′UTRs, respectively.
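Summary statistics of this kind can be computed with the Python standard library, as in the sketch below. The intron lengths are made-up toy values, the function name is hypothetical, and the quartile convention (linear interpolation, `method="inclusive"`) is an assumption, since the convention used for the values above is not stated.

```python
import statistics
from typing import Dict, List


def length_summary(lengths: List[int]) -> Dict[str, float]:
    """Mean, median, lower/upper quartile and sample SD of intron lengths."""
    lq, _, uq = statistics.quantiles(lengths, n=4, method="inclusive")
    return {
        "n": len(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "LQ": lq,
        "UQ": uq,
        "SD": statistics.stdev(lengths),
    }


# Toy intron lengths in nucleotides (not data from this study).
summary = length_summary([100, 200, 300, 400, 500])
```

A standard deviation several times larger than the mean, as reported for 5UIs, points to a strongly right-skewed length distribution, which is why the median and quartiles are given alongside the mean.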

3.3. Nucleotide Conservation around the Splice Junctions

To show the nucleotide bias around the donor and acceptor sites of 5UIs, CIs and 3UIs, sequence logos were generated [49]. The results showed that both the A. buxifolia UIs and CIs possess A/T-rich elements around both donor and acceptor sites [42]. Moreover, GT-AG was the major splice site pair in both A. buxifolia 5′UTRs (98.32%) and 3′UTRs (98.26%), followed by the GC-AG splice site pair (1.67% in 5′UTRs and 1.73% in 3′UTRs) (Figure 2).
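The splice site pair frequencies can be tallied from intron sequences by reading the first and last two nucleotides of each intron. The following Python sketch assumes that intron sequences are given 5′ to 3′ on the transcribed strand; the function names and the example sequences are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def splice_pair(intron_seq: str) -> str:
    """Return the donor-acceptor dinucleotide pair, e.g. 'GT-AG'."""
    seq = intron_seq.upper()
    return f"{seq[:2]}-{seq[-2:]}"


def splice_pair_percentages(introns: Iterable[str]) -> Dict[str, float]:
    """Percentage of introns carrying each donor-acceptor pair."""
    # Sequences shorter than 4 nt cannot hold both dinucleotides.
    counts = Counter(splice_pair(s) for s in introns if len(s) >= 4)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: round(100 * n / total, 2) for pair, n in counts.items()}


# Invented intron sequences, not taken from the A. buxifolia genome.
example_introns = ["GTAAGTTTTCAG", "gtatgtaaatag", "GCAAGTTTGCAG", "GTTTAG"]
pair_share = splice_pair_percentages(example_introns)
```

Applied to the 5UI and 3UI sequences separately, a tally of this kind would yield the GT-AG and GC-AG percentages reported above.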

3.4. Cis-Acting Elements and TFBS Prediction Analysis of 5UIs

In total, 46,543 cis-acting elements belonging to 47 element types were identified from all 5UI sequences; each 5UI contained 43.22 elements on average (Table 3, Additional file Table S3). About 82.92% and 88.02% of 5UIs contained the ‘core promoter element around −30 of transcription start’ and the ‘common cis-acting element in promoter and enhancer regions’, respectively. These two kinds of elements also made up the largest shares, accounting for 26.01% and 11.47% of the total elements, respectively. More than 73.00% of 5UI-Ts contained both elements. In addition, many light-related elements were identified in the 5UI sequences.
TFBS prediction analysis identified 1092 binding sites of 21 TFs from 90 (44.78%) input 5UI sequences (Table 4, Additional file Table S4). TFBSs for BARLEY B RECOMBINANT/BASIC PENTACYSTEINE (BBR-BPC) transcription factors made up the largest share (24.54%; the matched sequences are GAGA/CTCT), followed by MADS intervening keratin-like and C-terminal-type MADS (MIKC_MADS) (24.35%), APETALA2/Ethylene-Responsive factor (AP2) (16.12%), DNA binding with one finger (Dof) (13.64%), etc. Among these UIs, the first 5UI (175 bp in length) of the homeobox protein 24 gene (HB24, sb16080.1.5UTR_intron.1) contained the most TFBSs (88), among which the BBR-BPC binding sites accounted for 39.08%.

3.5. Gene Pathway-Enrichment Analysis of UI-Containing Transcripts (UI-Ts)

In total, 1171 (79.50%) of the UI-Ts were successfully annotated by MapMan. PageMan enrichment analysis of these annotated UI-Ts showed that they were significantly enriched in six pathways (p-value ≤ 0.05), i.e., ‘not assigned. no ontology. pentatricopeptide (PPR) repeat-containing protein’, ‘not assigned. no ontology’, ‘not assigned’, ‘RNA. regulation of transcription’ and ‘RNA. RNA binding’ (Table 5). The 5UI-Ts and 3UI-Ts were also subjected separately to PageMan enrichment analysis. The ‘not assigned’, ‘RNA’, ‘RNA. regulation of transcription’, ‘protein’, ‘protein degradation’, ‘signaling’ and ‘stress’ pathways ranked among the top 10 enriched pathways for both 5UI-Ts and 3UI-Ts (Table 6).
Intriguingly, 85.61% (119/139) of the 3UI-Ts were found to be members of the PPRP gene family. Among the PPRPs, 108 (corresponding to 65 genes) contained only a 3UI, 11 (corresponding to 11 genes) contained both a 5UI and a 3UI, and 20 (corresponding to 14 genes) contained only a 5UI. The PPRP gene family can be further divided into two subfamilies: the P subfamily and the PLS subfamily [50]. Among the 119 PPRP 3UI-Ts, there were 85 P subfamily members and 34 PLS subfamily members. About 48.24% (41/85) of the P subfamily PPRP 3UI-Ts and 64.71% (22/34) of the PLS subfamily PPRP 3UI-Ts possessed only one UI. Notably, none of the UI-containing PPRPs had an intron in its CDS region.

3.6. ‘RNA’ Related UI-Ts

In this study, a total of 230 ‘RNA’-related UI-Ts, including 159 5UI-Ts and 85 3UI-Ts, were identified. The ‘RNA’ pathway of MapMan consists of ‘RNA. processing’, ‘RNA. transcription’, ‘RNA. regulation of transcription’ and ‘RNA. RNA binding’ (Table 7), among which ‘RNA. regulation of transcription’ and ‘RNA. RNA binding’ were significantly enriched pathways for UI-Ts, as mentioned above (Table 5). In total, 199 UI-Ts (13.51% of all UI-Ts), including 137 5UI-Ts (9.30% of all UI-Ts) and 48 3UI-Ts (3.26% of all UI-Ts), were ‘RNA. regulation of transcription’-related. Most of these UI-Ts were transcription factor genes, such as genes encoding bHLH, SET-domain transcriptional regulator, TCP, C2H2, GRAS, Trihelix and AP2 transcription factors. The significantly enriched ‘RNA. RNA binding’-related UI-Ts were all 3UI-Ts, including two transcripts encoding UBP1-associated protein 2A (UBA2A) (sb20247.1 and sb20247.2), two transcripts encoding U12-type spliceosomal protein U11/U12-31K (sb31320.1 and sb31320.3), five transcripts encoding D111/G-patch domain-containing protein (sb32422.1, sb32422.2, sb32422.3, sb32422.4 and sb32422.5) and two transcripts encoding RNA-binding (RRM/RBD/RNP motifs) family protein (sb37383.1 and sb37383.2).

4. Discussion

In the present study, based on the A. buxifolia genome data, we identified the introns in the UTR and CDS regions. Similar to sweet orange [42], more than 70% of A. buxifolia CDSs were found to contain introns. Unlike the CDSs, few UTRs contained introns: only 3.68% of 5′UTRs and 2.77% of 3′UTRs were intron-containing. This might well explain why UTR introns are often neglected. Bioinformatic analysis of UIs and UI-containing transcripts (UI-Ts) was then performed, and the main findings are discussed below.

4.1. The Lengths of A. buxifolia UIs Were Less Conserved Than CIs, and Most UI-Ts Contain Only One UI

Although the density of UIs was much lower than that of CIs, their mean and median intron sizes were ~2 times those of CIs. Similar to Arabidopsis and C. sinensis [39,42], the frequency of A. buxifolia CIs with 100–300 nucleotides was significantly higher than that of 5UIs and 3UIs, but the relative frequencies of short introns (<50 nucleotides) and long introns (>300 nucleotides) in 5′UTRs and 3′UTRs were higher than those in CDSs, indicating that the UI lengths were less conserved. Moreover, in accordance with previous studies [16,38,41], we found that most 5UI-Ts and 3UI-Ts contained only one UI.

4.2. A/T-Rich Elements around Both Donor Sites and Acceptor Sites of A. buxifolia UTRs Are Important for UI Recognition and Removal

Accumulated evidence has shown that many splicing signals and factors influence intron removal and mRNA transcription [51]. Among these factors, splice site pairs greatly influence the efficiency of recruiting the splicing machinery [52]. Consistent with previous studies [40,42], the 5′ donor sites and 3′ acceptor sites in A. buxifolia UTRs were highly conserved, and GT-AG and GC-AG were the two major splice site pairs for both A. buxifolia 5′UTRs and 3′UTRs. Moreover, it has been suspected that A/T-rich elements play a role in intron recognition [42,53,54,55,56,57]. In the present study, A/T-rich elements were found around both the donor sites and the acceptor sites of A. buxifolia UTRs, indicating that they might function in UI recognition and removal.

4.3. A. buxifolia 5UIs Were Rich in Gene-Expression-Enhancement-Related Elements and TFBSs, Indicating That They Might Contribute Greatly to Gene Expression Regulation

Evidence to date has shown that UTR introns, especially 5UIs, contribute greatly to gene expression regulation [58,59,60]. Some 5UIs and other 5′-proximal introns with cis-elements have been shown to enhance gene transcription [6,61,62]. Kamo et al. [24] demonstrated that the 5UI of GUBQ1 acted as the promoter core sequence to increase GUS translation efficiency in transgenic plants. Lu et al. [58] and Samadder et al. [63] reported that the 5UI of the rice rubi3 gene could improve the gene’s expression at both the transcriptional and post-transcriptional levels. In the present study, the A. buxifolia 5UIs were found to be rich in gene-expression-enhancement-related elements (such as the ‘core promoter element around −30 of transcription start’ and the ‘common cis-acting element in promoter and enhancer regions’), indicating that these 5UIs may play roles in enhancing the expression of their corresponding genes.
Cenik et al. [41] reported the particular enrichment of 5UIs in genes with regulatory roles. The transcription of eukaryotic genes is regulated by transcription factors (TFs), which can interact specifically with sequences in the promoter regions of the genes they regulate [53,54]. Consistently, regulatory genes tend to have more transcription factor binding sites (TFBSs) in their 5UIs [21]. In the present study, we found that the A. buxifolia 5UIs contained many TFBSs for BBR-BPC, MIKC_MADS, AP2 and Dof TFs. Functional analyses have revealed the indispensable role of BBR/BPC proteins in controlling the expression of TF genes [55,56,57]. Several Dof proteins have also been shown to contribute to gene expression activation by interacting with other regulatory proteins [64,65]. The enrichment of these TFBSs in A. buxifolia 5UIs indicates that they play roles in regulating the expression of their corresponding downstream genes.

4.4. Many UI-Ts Are Involved in RNA Metabolism

Pathway-enrichment analysis revealed that UI-Ts involved in ‘RNA’ pathways were significantly enriched. Moreover, these ‘RNA’-pathway-related UI-Ts were more inclined to contain a 5UI, which supports the finding of Cenik et al. [41] that genes with regulatory roles are more prone to possess a 5UI. In this study, 40.00% (12/30) of the ‘RNA. regulation of transcription. unclassified’-pathway-related 5UI-Ts were genes encoding A20/AN1-like zinc finger (ZnF) family proteins. Plant A20/AN1-ZnF family proteins have been implicated in plant responses to various abiotic and biotic stresses [66], suggesting that UIs might function in A. buxifolia stress responses. The bHLH TFs contribute greatly to regulating multiple plant cellular and biological processes [67]. In this study, genes encoding bHLH transcription factors were also enriched among 5UI-Ts. In addition, the ‘RNA binding’ pathway was significantly enriched among 3UI-Ts. Correlations between 3UI-Ts and RNA binding have been reported in previous studies. The expression of 3UI-Ts was more likely to be inhibited by NMD than that of gene transcripts with no 3UI, and the most significantly enriched NMD-affected gene transcripts were those encoding RNA-binding proteins [36,68]. Eukaryotic RNA-binding proteins (RBPs) play crucial roles in almost all aspects of post-transcriptional gene expression regulation [69]. In this study, two UBP1-associated protein 2A (UBA2A) gene transcripts were identified as 3UI-Ts. It was reported that UBA2A could bind to RNA molecules containing U-rich sequences in 3′UTRs and might contribute to the stabilization of mRNAs in the nucleus [70]. Thus, we hypothesize that the 3UIs of the two UBA2A transcripts contribute to mRNA stabilization in A. buxifolia.

4.5. Most A. buxifolia 3UI-Ts Are Members of the PPRP Gene Family, and Many UI-Ts Are Stress-Response-Related or of Unknown Function

Notably, 119 (85.61%) of the 3UI-Ts belong to genes encoding pentatricopeptide (PPR) repeat-containing proteins. In Arabidopsis, rice and Populus trichocarpa, 441, 477 and 626 PPRP gene members were identified, respectively [50,71,72]. The PPRP gene family is widespread in plants, especially terrestrial plants, and plays a crucial role in plant growth and development and in stress response processes [73,74,75]. Usually, PPRP genes contain no intron in their CDSs [76,77]. In the present study, we also found that none of the UI-containing PPRPs contained an intron in its CDS, suggesting that alternative splicing events of PPRPs may occur mainly in UTRs and that UIs may contribute to the tissue- or organ-specific expression and the regulatory functions of PPRPs.
Stress can affect the efficiency or patterns of splicing and intron retention, thereby stabilizing transcripts or modifying their biological functions [78,79,80]. The 3UI-Ts of sweet orange were significantly enriched in the stress pathway [42]. In this study, we also identified many UI-Ts related to ‘stress’ and ‘signaling’, suggesting that UIs may play an important role in plant defense responses [42,78]. Additionally, we found many UI-Ts that were categorized into the ‘unclassified’ pathway. The functions of these UIs and their corresponding UI-Ts need to be studied further.

5. Conclusions

In this study, we performed a genome-scale computational analysis of UIs in A. buxifolia, investigated their size and nucleotide distribution characteristics, explored the regulatory cis-elements and TFBSs in the 5UI sequences, and examined the possible functions of UI-Ts. In total, we identified 1077 5UIs and 866 3UIs from the A. buxifolia genome data. The density of 5UIs and 3UIs was lower than that of CIs, but their mean and median sizes were about twice those of CIs. The 5′ donor sites and 3′ acceptor sites in A. buxifolia UTRs were highly conserved, and GT-AG was the most common splice site pair for both 5′UTRs and 3′UTRs. A/T-rich elements, which may function in intron recognition and removal, were found around both the donor sites and the acceptor sites of A. buxifolia UTRs. Most UI-Ts contained one 5UI or 3UI. Many gene-expression-enhancement-related elements and TFBSs were discovered in the A. buxifolia 5UIs, indicating that 5UIs play a role in regulating or enhancing the expression of their corresponding downstream genes. Consistently, pathway-enrichment analysis revealed that UI-Ts involved in ‘RNA’ pathways were significantly enriched. Notably, more than 85% of the 3UI-Ts were genes encoding pentatricopeptide (PPR) repeat-containing proteins. The results obtained in this study provide a basis for further understanding the roles of UIs in the regulation of A. buxifolia gene expression.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/horticulturae7120556/s1. Additional file Table S1: Information of the 5UIs identified in the Atalantia buxifolia genome and the gene transcripts containing them; Additional file Table S2: Information of the 3UIs identified in the Atalantia buxifolia genome and the gene transcripts containing them; Additional file Table S3: Information of the identified cis-acting elements in 5UI sequences; Additional file Table S4: TFBS prediction results of regulatory 5UI-Ts of transcription factor.

Author Contributions

Conceptualization, C.C. and P.L.; methodology, X.S.; validation, X.S., Y.Z. and C.C.; formal analysis, X.S. and C.C.; data curation, S.X and J.W.; writing—original draft preparation, C.C. and X.S.; writing—review and editing, C.C.; visualization, C.C.; supervision, C.C.; project administration, C.C.; funding acquisition, C.C. and P.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fund for high-level talents of Shanxi Agricultural University (2021XG010) and the Construction of Plateau Discipline of Fujian Province (102/71201801104).

Data Availability Statement

All the data generated or analyzed during this study are included in this published article and its Supplemental Data.

Acknowledgments

The authors would like to thank Zhenhua Zhuang and Chengdu Life Baseline Company for their assistance in bioinformatics analysis.

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Roy, S.W.; Gilbert, W. The evolution of spliceosomal introns: Patterns, puzzles and progress. Nat. Rev. Genet. 2006, 7, 211–221.
  2. Sambrook, J. Adenovirus amazes at Cold Spring Harbor. Nature 1977, 268, 101–104.
  3. Bonen, L.; Vogel, J. The ins and outs of group II introns. Trends Genet. 2001, 17, 322–331.
  4. Cannone, J.J.; Subramanian, S.; Schnare, M.N.; Collett, J.R.; D’Souza, L.M.; Du, Y.; Feng, B.; Lin, N.; Madabusi, L.V.; Müller, K.M.; et al. The comparative RNA web (CRW) site: An online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinform. 2002, 3, 2.
  5. Wahl, M.C.; Will, C.L.; Lührmann, R. The spliceosome: Design principles of a dynamic RNP machine. Cell 2009, 136, 701–718.
  6. Le Hir, H.; Nott, A.; Moore, M.J. How introns influence and enhance eukaryotic gene expression. Trends Biochem. Sci. 2003, 28, 215–220.
  7. Moore, M.J.; Proudfoot, N.J. Pre-mRNA processing reaches back to transcription and ahead to translation. Cell 2009, 136, 688–700.
  8. Chorev, M.; Carmel, L. The function of introns. Front. Genet. 2012, 3, 55.
  9. Clark, A.J.; Archibald, A.L.; McClenaghan, M.; Simons, J.P.; Wallace, R.; Whitelaw, C.B. Enhancing the efficiency of transgene expression. Philos. Trans. R. Soc. Lond. B Biol. Sci. 1993, 339, 225–232.
  10. Laxa, M. Intron-Mediated Enhancement: A Tool for Heterologous Gene Expression in Plants? Front. Plant Sci. 2016, 7, 1977.
  11. Mascarenhas, D.; Mettler, I.J.; Pierce, D.A.; Lowe, H.W. Intron-mediated enhancement of heterologous gene expression in maize. Plant Mol. Biol. 1990, 15, 913–920.
  12. Callis, J.; Fromm, M.; Walbot, V. Introns increase gene expression in cultured maize cells. Genes Dev. 1987, 1, 1183–1200.
  13. Maas, C.; Laufs, J.; Grant, S.; Korfhage, C.; Werr, W. The combination of a novel stimulatory element in the first exon of the maize Shrunken-1 gene with the following intron 1 enhances reporter gene expression up to 1000-fold. Plant Mol. Biol. 1991, 16, 199–207.
  14. Dean, C.; Favreau, M.; Bond-Nutter, D.; Bedbrook, J.; Dunsmuir, P. Sequences downstream of translation start regulate quantitative expression of two petunia rbcS genes. Plant Cell 1989, 1, 201–208.
  15. Gidekel, M.; Jimenez, B.; Herrera-Estrella, L. The first intron of the Arabidopsis thaliana gene coding for elongation factor 1 beta contains an enhancer-like element. Gene 1996, 170, 201–206.
  16. Pesole, G.; Mignone, F.; Gissi, C.; Grillo, G.; Licciulli, F.; Liuni, S. Structural and functional features of eukaryotic mRNA untranslated regions. Gene 2001, 276, 73–81.
  17. Masuda, S.; Das, R.; Cheng, H.; Hurt, E.; Dorman, N.; Reed, R. Recruitment of the human TREX complex to mRNA during splicing. Genes Dev. 2005, 19, 1512–1517.
  18. Nott, A.; Meislin, S.H.; Moore, M.J. A quantitative analysis of intron effects on mammalian gene expression. RNA 2003, 9, 607–617.
  19. Majewski, J.; Ott, J. Distribution and characterization of regulatory elements in the human genome. Genome Res. 2002, 12, 1827–1836.
  20. Furger, A.; O’Sullivan, J.M.; Binnie, A.; Lee, B.A.; Proudfoot, N.J. Promoter proximal splice sites enhance transcription. Genes Dev. 2002, 16, 2792–2799.
  21. Matsumoto, K.; Wassarman, K.M.; Wolffe, A.P. Nuclear history of a pre-mRNA determines the translational activity of cytoplasmic mRNA. EMBO J. 1998, 17, 2107–2121.
  22. Grant, T.N.L.; De La Torre, C.M.; Zhang, N.; Finer, J.J. Synthetic introns help identify sequences in the 5′UTR intron of the Glycine max polyubiquitin (Gmubi) promoter that give increased promoter activity. Planta 2017, 245, 849–860.
  23. Laxa, M.; Müller, K.; Lange, N.; Doering, L.; Pruscha, J.T.; Peterhänsel, C. The 5′UTR Intron of Arabidopsis GGT1 Aminotransferase Enhances Promoter Activity by Recruiting RNA Polymerase II. Plant Physiol. 2016, 172, 313–327.
  24. Kamo, K.; Kim, A.Y.; Park, S.H.; Joung, Y.H. The 5′UTR-intron of the Gladiolus polyubiquitin promoter GUBQ1 enhances translation efficiency in Gladiolus and Arabidopsis. BMC Plant Biol. 2012, 12, 79.
  25. Kim, M.J.; Kim, H.; Shin, J.S.; Chung, C.H.; Ohlrogge, J.B.; Suh, M.C. Seed-specific expression of sesame microsomal oleic acid desaturase is controlled by combinatorial properties between negative cis-regulatory elements in the SeFAD2 promoter and enhancers in the 5′-UTR intron. Mol. Genet. Genom. 2006, 276, 351–368.
  26. Weischenfeldt, J.; Damgaard, I.; Bryder, D.; Theilgaard-Mönch, K.; Thoren, L.A.; Nielsen, F.C.; Jacobsen, S.E.W.; Nerlov, C.; Porse, B.T. NMD is essential for hematopoietic stem and progenitor cells and for eliminating by-products of programmed DNA rearrangements. Genes Dev. 2008, 22, 1381–1396.
  27. Ni, J.Z.; Grate, L.; Donohue, J.P.; Preston, C.; Nobida, N.; O’Brien, G.; Shiue, L.; Clark, T.A.; Blume, J.E.; Ares, M., Jr. Ultraconserved elements are associated with homeostatic control of splicing regulators by alternative splicing and nonsense-mediated decay. Genes Dev. 2007, 21, 708–718. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  28. Chang, Y.F.; Imam, J.S.; Wilkinson, M.F. The nonsense-mediated decay RNA surveillance pathway. Annu. Rev. Biochem. 2007, 76, 51–74. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  29. Mendell, J.T.; Sharifi, N.A.; Meyers, J.L.; Martinez-Murillo, F.; Dietz, H.C. Nonsense surveillance regulates expression of diverse classes of mammalian transcripts and mutes genomic noise. Nat. Genet. 2004, 36, 1073–1078. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  30. Zhang, J.; Sun, X.; Qian, Y.; Maquat, L.E. Intron function in the nonsense-mediated decay of beta-globin mRNA: Indications that pre-mRNA splicing in the nucleus can influence mRNA translation in the cytoplasm. RNA 1998, 4, 801–815. [Google Scholar] [CrossRef]
  31. Yepiskoposyan, H.; Aeschimann, F.; Nilsson, D.; Okoniewski, M.; Mühlemann, O. Autoregulation of the nonsense-mediated mRNA decay pathway in human cells. RNA 2011, 17, 2108–2118. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Rüegsegger, U.; Leber, J.H.; Walter, P. Block of HAC1 mRNA translation by long-range base pairing is released by cytoplasmic splicing upon induction of the unfolded protein response. Cell 2001, 107, 103–114. [Google Scholar] [CrossRef] [Green Version]
  33. Bruno, I.G.; Karam, R.; Huang, L.; Bhardwaj, A.; Lou, C.H.; Shum, E.Y.; Song, H.W.; Corbett, M.A.; Gifford, W.D.; Gecz, J.; et al. Identification of a microRNA that activates gene expression by repressing nonsense-mediated RNA decay. Mol. Cell 2011, 42, 500–510. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. McIlwain, D.R.; Pan, Q.; Reilly, P.T.; Elia, A.J.; McCracken, S.; Wakeham, A.C.; Itie-Youten, A.; Blencowe, B.J.; Mak, T.W. Smg1 is required for embryogenesis and regulates diverse genes via alternative splicing coupled to nonsense-mediated mRNA decay. Proc. Natl. Acad. Sci. USA 2010, 107, 12186–12191. [Google Scholar] [CrossRef] [Green Version]
  35. Wittmann, J.; Hol, E.M.; Jäck, H.M. hUPF2 silencing identifies physiologic substrates of mammalian nonsense-mediated mRNA decay. Mol. Cell Biol. 2006, 26, 1272–1287. [Google Scholar] [CrossRef] [Green Version]
  36. Saltzman, A.L.; Pan, Q.; Blencowe, B.J. Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev. 2011, 25, 373–384. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  37. Pan, Q.; Saltzman, A.L.; Yoon, K.K.; Misquitta, C.; Shai, O.; Maquat, L.E.; Frey, B.J.; Blencowe, B.J. Quantitative microarray profiling provides evidence against widespread coupling of alternative splicing with nonsense-mediated mRNA decay to control gene expression. Genes Dev. 2006, 20, 153–158. [Google Scholar] [CrossRef] [Green Version]
  38. Hong, X.; Scofield, D.G.; Lynch, M. Intron size, abundance, and distribution within untranslated regions of genes. Mol. Biol. Evol. 2006, 23, 2392–2404. [Google Scholar] [CrossRef] [Green Version]
  39. Chung, B.Y.W.; Simons, C.; Firth, A.E.; Brown, C.M.; Hellens, R.P. Effect of 5′UTR introns on gene expression in Arabidopsis thaliana. BMC Genom. 2006, 7, 120. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  40. Roy, S.W.; Penny, D.; Neafsey, D.E. Evolutionary conservation of UTR intron boundaries in Cryptococcus. Mol. Biol. Evol. 2007, 24, 1140–1148. [Google Scholar] [CrossRef] [Green Version]
  41. Cenik, C.; Derti, A.; Mellor, J.C.; Berriz, G.F.; Roth, F.P. Genome-wide functional analysis of human 5′ untranslated region introns. Genome Biol. 2010, 11, R29. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  42. Shi, X.; Wu, J.; Mensah, R.A.; Tian, N.; Liu, J.; Liu, F.; Chen, J.; Che, J.; Guo, Y.; Wu, B.; et al. Genome-wide identification and characterization of UTR-introns of Citrus sinensis. Int. J. Mol. Sci. 2020, 21, 3088. [Google Scholar] [CrossRef] [PubMed]
  43. Liang, H.X.; Sun, J.J.; Shen, Z.B.; Yu, B.W.; Cui, H.H.; Yin, Y.Q. A novel alkaloid glycoside isolated from Atalantia buxifolia. Nat. Prod. Res. 2020, 34, 3042–3047. [Google Scholar] [CrossRef]
  44. Chang, F.R.; Li, P.S.; Huang Liu, R.; Hu, H.C.; Hwang, T.L.; Lee, J.C.; Chen, S.L.; Wu, Y.C.; Cheng, Y.B. Bioactive Phenolic Components from the Twigs of Atalantia buxifolia. J. Nat. Prod. 2018, 81, 1534–1539. [Google Scholar] [CrossRef]
  45. Yang, Y.Y.; Yang, W.; Zuo, W.J.; Zeng, Y.B.; Liu, S.B.; Mei, W.L.; Dai, H.F. Two new acridone alkaloids from the branch of Atalantia buxifolia and their biological activity. J. Asian Nat. Prod. Res. 2013, 15, 899–904. [Google Scholar] [CrossRef] [PubMed]
  46. Wu, G.A.; Prochnik, S.; Jenkins, J.; Salse, J.; Hellsten, U.; Murat, F.; Perrier, X.; Ruiz, M.; Scalabrin, S.; Terol, J.; et al. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication. Nat. Biotechnol. 2014, 32, 656–662. [Google Scholar] [CrossRef] [PubMed]
  47. Thimm, O.; Bläsing, O.; Gibon, Y.; Nagel, A.; Meyer, S.; Krüger, P.; Selbig, J.; Müller, L.A.; Rhee, S.Y.; Stitt, M. MAPMAN: A user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J. 2004, 37, 914–939. [Google Scholar] [CrossRef]
  48. Lescot, M.; Déhais, P.; Thijs, G.; Marchal, K.; Moreau, Y.; Van De Peer, Y.; Rouzé, P.; Rombauts, S. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res. 2002, 30, 325–327. [Google Scholar] [CrossRef]
  49. Crooks, G.E.; Hon, G.; Chandonia, J.M.; Brenner, S.E. WebLogo: A sequence logo generator. Genome Res. 2004, 14, 1188–1190. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  50. Lurin, C.; Andrés, C.; Aubourg, S.; Bellaoui, M.; Bitton, F.; Bruyère, C.; Caboche, M.; Debast, C.; Gualberto, J.; Hoffmann, B.; et al. Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell 2004, 16, 2089–2103. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  51. Brown, J.W.; Simpson, C.G.; Thow, G.; Clark, G.P.; Jennings, S.N.; Medina-Escobar, N.; Haupt, S.; Chapman, S.C.; Oparka, K.J. Splicing signals and factors in plant intron removal. Biochem. Soc. Trans. 2002, 30, 146–149. [Google Scholar] [CrossRef]
  52. Goguel, V.; Rosbash, M. Splice site choice and splicing efficiency are positively influenced by pre-mRNA intramolecular base pairing in yeast. Cell 1993, 72, 893–901. [Google Scholar]
  53. Rohs, R.; West, S.M.; Sosinsky, A.; Liu, P.; Mann, R.S.; Honig, B. The role of DNA shape in protein-DNA recognition. Nature 2009, 461, 1248–1253. [Google Scholar] [CrossRef] [Green Version]
  54. Badis, G.; Berger, M.F.; Philippakis, A.A.; Talukder, S.; Gehrke, A.R.; Jaeger, S.A.; Chan, E.T.; Metzler, G.; Vedenko, A.; Chen, X.; et al. Diversity and complexity in DNA recognition by transcription factors. Science 2009, 324, 1720–1723. [Google Scholar] [CrossRef] [Green Version]
  55. Hecker, A.; Brand, L.H.; Peter, S.; Simoncello, N.; Kilian, J.; Harter, K.; Gaudin, V.; Wanke, D. The Arabidopsis GAGA-Binding Factor Basic Pentacysteine6 Recruits the Polycomb-Repressive Complex1 Component Like Heterochromatin Protein1 to GAGA DNA Motifs. Plant Physiol. 2015, 168, 1013–1024. [Google Scholar] [CrossRef]
  56. Simonini, S.; Kater, M.M. Class I Basic Pentacysteine factors regulate Homeobox genes involved in meristem size maintenance. J. Exp. Bot. 2014, 65, 1455–1465. [Google Scholar] [CrossRef] [Green Version]
  57. Santi, L.; Wang, Y.; Stile, M.R.; Berendzen, K.; Wanke, D.; Roig, C.; Pozzi, C.; Müller, K.; Müller, J.; Rohde, W.; et al. The GA octodinucleotide repeat binding factor BBR participates in the transcriptional regulation of the homeobox gene Bkn3. Plant J. 2003, 34, 813–826. [Google Scholar] [CrossRef] [PubMed]
  58. Lu, J.; Sivamani, E.; Azhakanandam, K.; Samadder, P.; Li, X.; Qu, R. Gene expression enhancement mediated by the 5′UTR intron of the rice rubi3 gene varied remarkably among tissues in transgenic rice plants. Mol. Genet. Genom. 2008, 279, 563–572. [Google Scholar] [CrossRef]
  59. Wang, J.; Oard, J.H. Rice ubiquitin promoters: Deletion analysis and potential usefulness in plant transformation systems. Plant Cell Rep. 2003, 22, 129–134. [Google Scholar] [CrossRef] [PubMed]
  60. McElroy, D.; Zhang, W.; Cao, J.; Wu, R. Isolation of an efficient actin promoter for use in rice transformation. Plant Cell 1990, 2, 163–171. [Google Scholar]
  61. Rose, A.B.; Elfersi, T.; Parra, G.; Korf, I. Promoter-proximal introns in Arabidopsis thaliana are enriched in dispersed signals that elevate gene expression. Plant Cell 2008, 20, 543–551. [Google Scholar] [CrossRef] [Green Version]
  62. Rose, A.B. The effect of intron location on intron-mediated enhancement of gene expression in Arabidopsis. Plant J. 2004, 40, 744–751. [Google Scholar] [CrossRef]
  63. Samadder, P.; Sivamani, E.; Lu, J.; Li, X.; Qu, R. Transcriptional and post-transcriptional enhancement of gene expression by the 5′ UTR intron of rice rubi3 gene in transgenic rice cells. Mol. Genet. Genom. 2008, 279, 429–439. [Google Scholar] [CrossRef] [PubMed]
  64. Le Hir, R.; Bellini, C. The plant-specific dof transcription factors family: New players involved in vascular system development and functioning in Arabidopsis. Front. Plant Sci. 2013, 4, 164. [Google Scholar] [CrossRef] [Green Version]
  65. Yanagisawa, S. The Dof family of plant transcription factors. Trends Plant Sci. 2002, 7, 555–560. [Google Scholar] [CrossRef]
  66. Kanneganti, V.; Gupta, A.K. Overexpression of OsiSAP8, a member of stress associated protein (SAP) gene family of rice confers tolerance to salt, drought and cold stress in transgenic tobacco and rice. Plant Mol. Biol. 2008, 66, 445–462. [Google Scholar] [CrossRef]
  67. Duek, P.D.; Fankhauser, C. bHLH class transcription factors take centre stage in phytochrome signalling. Trends Plant Sci. 2005, 10, 51–54. [Google Scholar] [CrossRef]
  68. Bicknell, A.A.; Cenik, C.; Chua, H.N.; Roth, F.P.; Moore, M.J. Introns in UTRs: Why we should stop ignoring them. Bioessays 2012, 34, 1025–1034. [Google Scholar] [CrossRef]
  69. Lorković, Z.J. Role of plant RNA-binding proteins in development, stress response and genome organization. Trends Plant Sci. 2009, 14, 229–236. [Google Scholar] [CrossRef]
  70. Lambermon, M.H.L.; Fu, Y.; Kirk, D.A.W.; Dupasquier, M.; Filipowicz, W.; Lorković, Z.J. UBA1 and UBA2, two proteins that interact with UBP1, a multifunctional effector of pre-mRNA maturation in plants. Mol. Cell Biol. 2002, 22, 4346–4357. [Google Scholar] [CrossRef] [Green Version]
  71. Xing, H.; Fu, X.; Yang, C.; Tang, X.; Guo, L.; Li, C.; Xu, C.; Luo, K. Genome-wide investigation of pentatricopeptide repeat gene family in poplar and their expression analysis in response to biotic and abiotic stresses. Sci. Rep. 2018, 8, 2817. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  72. Schmitz-Linneweber, C.; Small, I. Pentatricopeptide repeat proteins: A socket set for organelle gene expression. Trends Plant Sci. 2008, 13, 663–670. [Google Scholar] [CrossRef]
  73. Laluk, K.; Abuqamar, S.; Mengiste, T. The Arabidopsis mitochondria-localized pentatricopeptide repeat protein PGN functions in defense against necrotrophic fungi and abiotic stress tolerance. Plant Physiol. 2011, 156, 2053–2068. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  74. Saha, D.; Prasad, A.M.; Srinivasan, R. Pentatricopeptide repeat proteins and their emerging roles in plants. Plant Physiol. Biochem. 2007, 45, 521–534. [Google Scholar] [CrossRef] [PubMed]
  75. Wang, Z.; Zou, Y.; Li, X.; Zhang, Q.; Chen, L.; Wu, H.; Su, D.; Chen, Y.; Guo, J.; Luo, D.; et al. Cytoplasmic male sterility of rice with boro II cytoplasm is caused by a cytotoxic peptide and is restored by two related PPR motif genes via distinct modes of mRNA silencing. Plant Cell 2006, 18, 676–687. [Google Scholar] [CrossRef] [Green Version]
  76. Chen, L.; An, Y.; Li, Y.X.; Li, C.; Shi, Y.; Song, Y.; Zhang, D.; Wang, T.; Li, Y. Candidate Loci for Yield-Related Traits in Maize Revealed by a Combination of MetaQTL Analysis and Regional Association Mapping. Front. Plant Sci. 2017, 8, 2190. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  77. Yuan, Y.W.; Liu, C.; Marx, H.E.; Olmstead, R.G. The pentatricopeptide repeat (PPR) gene family, a tremendous resource for plant phylogenetic studies. New Phytol. 2009, 182, 272–283. [Google Scholar] [CrossRef] [PubMed]
  78. Ner-Gaon, H.; Halachmi, R.; Savaldi-Goldstein, S.; Rubin, E.; Ophir, R.; Fluhr, R. Intron retention is a major phenomenon in alternative splicing in Arabidopsis. Plant J. 2004, 39, 877–885. [Google Scholar] [CrossRef]
  79. Simpson, G.G.; Filipowicz, W. Splicing of precursors to mRNA in higher plants: Mechanism, regulation and sub-nuclear organisation of the spliceosomal machinery. Plant Mol. Biol. 1996, 32, 1–41. [Google Scholar] [CrossRef]
  80. Luehrsen, K.R.; Taha, S.; Walbot, V. Nuclear pre-mRNA processing in higher plants. Prog. Nucleic Acid Res. Mol. Biol. 1994, 47, 149–193. [Google Scholar] [PubMed]
Figure 1. Length distributions of A. buxifolia 5UIs, 3UIs and CIs. The horizontal axis and the vertical axis represent the size and relative frequencies of introns, respectively.
Figure 2. Nucleotide bias around the donor sites (A) and acceptor sites (B) of A. buxifolia 5UIs, CIs and 3UIs. The x-axis refers to nucleotides around the beginnings or ends of introns, and the letter height reflects the nucleotide deviation.
Table 1. Summary statistics of 5′UTRs, CDSs and 3′UTRs in A. buxifolia.

| Region | No. of Sequences | Sequences with Introns | Total Bases (Genomic) | Introns/Sequence | No. of Introns/Nucleotide (mRNA) |
|---|---|---|---|---|---|
| 5′UTR | 16,218 | 597 | 1.28 × 10^7 | 0.07 | 8.42 × 10^−5 |
| CDS | 28,412 | 21,005 | 3.64 × 10^7 | 3.99 | 3.11 × 10^−3 |
| 3′UTR | 16,337 | 452 | 2.81 × 10^7 | 0.05 | 3.08 × 10^−5 |
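
The density column of Table 1 can be approximated from the other values. The following is a minimal sketch, not the authors' script. It takes the UI counts from the abstract (1077 5UIs, 866 3UIs) and assumes the CDS intron count equals Introns/Sequence times the number of CDS sequences. Because the total-base values are rounded, the results only agree with the table to about two significant figures.

```python
# Minimal sketch: introns per nucleotide from the Table 1 values.
# Assumption: the CDS intron count is not stated directly, so it is
# reconstructed here as (introns per sequence) x (number of sequences).

def introns_per_nucleotide(n_introns, total_bases):
    """Return the intron density (introns per nucleotide)."""
    return n_introns / total_bases

density_5utr = introns_per_nucleotide(1077, 1.28e7)         # 5UIs from the abstract
density_3utr = introns_per_nucleotide(866, 2.81e7)          # 3UIs from the abstract
density_cds = introns_per_nucleotide(3.99 * 28412, 3.64e7)  # reconstructed count

for name, value in [("5'UTR", density_5utr), ("CDS", density_cds), ("3'UTR", density_3utr)]:
    print(f"{name}: {value:.2e} introns per nucleotide")
```

Under these assumptions the UTR densities fall in the 10^−5 range and the CDS density in the 10^−3 range, in line with the statement that UI density is lower than CDS intron density.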
Table 2. Numbers of UIs per transcript in A. buxifolia gene transcripts containing 5UIs (5UI-Ts) and 3UIs (3UI-Ts). There are 897 and 670 gene transcripts containing 5UIs and 3UIs, respectively. NS: not shown.

| UI Number | 5UI-Ts Number/Percentage (Gene ID) | 3UI-Ts Number/Percentage (Gene ID) |
|---|---|---|
| 1 UI | 765/85.28% (NS) | 527/78.66% (NS) |
| 2 UIs | 106/11.82% (NS) | 90/13.43% (NS) |
| 3 UIs | 16/1.78% (sb11639.1, sb32006.1, sb10151.1, sb18930.1, sb18698.1, sb31744.1, sb37205.1, sb20998.1, sb14514.1, sb29193.1, sb21675.1, sb20177.1, sb34312.1, sb34492.1, sb36282.1, sb12383.1) | 25/3.73% (sb13064.1, sb32006.1, sb37841.1, sb18145.1, sb18385.1, sb34575.1, sb33241.1, sb37027.1, sb32422.1, sb12195.1, sb24162.1, sb10905.1, sb20663.1, sb32886.1, sb22221.1, sb35222.1, sb26419.1, sb26292.1, sb26045.1, sb22509.1, sb37886.1, sb18573.1, sb20840.1, sb15172.1, sb28503.1) |
| 4 UIs | 4/0.45% (sb19593.1, sb20478.1, sb12332.1, sb28722.1) | 4/0.60% (sb17114.1, sb31386.1, sb26143.1, sb25688.1) |
| 5 UIs | 3/0.33% (sb27690.1, sb31747.1, sb28046.1) | 4/0.60% (sb13718.1, sb31879.1, sb10627.1, sb31747.1) |
| 6 UIs | 1/0.11% (sb26751.1) | 2/0.30% (sb25651.1, sb31448.1) |
| 7 UIs | 1/0.11% (sb14804.1) | 0 |
| 8 UIs | 1/0.11% (sb36468.1) | 0 |
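
The 5UI-T column of Table 2 can be cross-checked against the totals reported in the abstract (897 5UI-Ts and 1077 5UIs). The sketch below is illustrative only. The variable names are hypothetical, and the percentages are recomputed as each count divided by the column total.

```python
# Minimal sketch: consistency check of the 5UI-T column of Table 2.
# Keys are the number of 5UIs per transcript, values are transcript counts.
counts_5ui = {1: 765, 2: 106, 3: 16, 4: 4, 5: 3, 6: 1, 7: 1, 8: 1}

# Total number of 5UI-containing transcripts.
total_transcripts = sum(counts_5ui.values())

# Total number of 5UIs: each transcript contributes as many introns as its class.
total_introns = sum(n_ui * n_tx for n_ui, n_tx in counts_5ui.items())

# Percentage of transcripts in each class, rounded to two decimals as in the table.
percentages = {n_ui: round(100 * n_tx / total_transcripts, 2)
               for n_ui, n_tx in counts_5ui.items()}

print(total_transcripts, total_introns)
print(percentages)
```

The transcript and intron totals recovered this way match the 897 5UI-Ts and 1077 5UIs given in the abstract.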
Table 3. Predicted cis-acting elements in 5UI sequences. Only elements with numbers higher than 300 are shown.

| Function | Elements Number | 5UI Number (Ratio) |
|---|---|---|
| core promoter element around −30 of transcription start | 12,106 | 893 (82.92%) |
| common cis-acting element in promoter and enhancer regions | 5538 | 948 (88.02%) |
| light responsive element | 1297 | 271 (25.16%) |
| part of a conserved DNA module involved in light responsiveness | 1003 | 489 (45.40%) |
| part of a light responsive element | 904 | 491 (45.59%) |
| cis-acting regulatory element essential for the anaerobic induction | 768 | 426 (39.55%) |
| cis-acting regulatory element involved in the MeJA-responsiveness | 678 | 252 (23.40%) |
| cis-acting regulatory element involved in light responsiveness | 486 | 270 (25.07%) |
| cis-acting element involved in the abscisic acid responsiveness | 430 | 246 (22.84%) |
| light responsive element | 392 | 271 (25.16%) |
Table 4. Predicted transcription factor binding sites (TFBSs) in 5UI sequences. Only the TFBSs with numbers higher than 10 are shown.

| Family | Number (Ratio) |
|---|---|
| BARLEY B RECOMBINANT/BASIC PENTACYSTEINE (BBR-BPC) | 268 (24.54%) |
| MADS intervening keratin-like and C-terminal-type MADS (MIKC_MADS) | 266 (24.35%) |
| APETALA2/Ethylene-Responsive factor (AP2) | 176 (16.12%) |
| DNA binding with one finger (Dof) | 149 (13.64%) |
| The three-amino-acid-loop-extension (TALE) | 53 (4.85%) |
| GIBBERELLIC-ACID INSENSITIVE, REPRESSOR of GAI and SCARECROW (GRAS) | 50 (4.58%) |
| GATA | 25 (2.29%) |
| The Cys 2 His 2-like fold group (C2H2) | 24 (2.20%) |
| NODULE-INCEPTION-like (Nin-like) | 22 (2.01%) |
| Leafy (LFY) | 10 (0.92%) |
Table 5. Six significantly enriched pathways of UI-Ts in A. buxifolia. BIN is the unit used in MapMan graphs to denote a pathway, and each BIN has been assigned a specific ID in MapMan. UI-Ts: UI-containing gene transcripts.

| BIN | Pathway Name | Number | p-Value |
|---|---|---|---|
| 35.1.5 | not assigned. no ontology. pentatricopeptide (PPR) repeat-containing protein | 139 | 2.45 × 10^−22 |
| 35.1 | not assigned. no ontology | 217 | 3.40 × 10^−16 |
| 35 | not assigned | 733 | 2.57 × 10^−4 |
| 27.3 | RNA. regulation of transcription | 199 | 2.77 × 10^−4 |
| 27.4 | RNA. RNA binding | 11 | 4.14 × 10^−3 |
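
The p-values in Table 5 come from PageMan. To illustrate how over-representation of a BIN among UI-Ts can be tested in principle, the sketch below uses a one-sided hypergeometric test. This is an assumed stand-in and not necessarily the test or settings used in the study. The sample size of 1473 UI-Ts follows from the abstract (897 + 670 − 94), and 139 is the PPR count in Table 5. The background size (28,412, borrowed from the CDS sequence count in Table 1) and the number of background transcripts in the BIN (500) are hypothetical placeholders.

```python
from math import comb

def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when n transcripts are drawn without replacement from a
    background of N transcripts, K of which belong to the BIN of interest."""
    denom = comb(N, n)
    numer = sum(comb(K, i) * comb(N - K, n - i)
                for i in range(k, min(K, n) + 1))
    return numer / denom

# Small hand-checkable case:
# (C(5,4)*C(5,1) + C(5,5)*C(5,0)) / C(10,5) = 26/252
small_p = hypergeom_upper_tail(N=10, K=5, n=5, k=4)

# Illustrative call for the PPR BIN. N and K are hypothetical placeholders,
# so the resulting value is not comparable to the p-value in Table 5.
p_ppr = hypergeom_upper_tail(N=28412, K=500, n=1473, k=139)
print(small_p, p_ppr)
```

Exact integer binomial coefficients are used so that the tail probability is not affected by floating-point error until the final division.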
Table 6. The top 10 enriched pathways for 5UI-Ts and 3UI-Ts. Pathways ranked in the top 10 for both 5UI-Ts and 3UI-Ts (underlined in the original table) are marked with an asterisk. The pathway ‘not assigned’ is not shown.

| BIN | Pathway Name | 5UI Number | BIN | Pathway Name | 3UI Number |
|---|---|---|---|---|---|
| 35.2 | not assigned. unknown * | 328 | 35.2 | not assigned. unknown * | 221 |
| 27 | RNA * | 160 | 35.1 | not assigned. no ontology * | 157 |
| 27.3 | RNA. regulation of transcription * | 151 | 35.1.5 | not assigned. no ontology. pentatricopeptide (PPR) repeat-containing protein | 119 |
| 29 | protein * | 120 | 27 | RNA * | 85 |
| 35.1 | not assigned. no ontology * | 77 | 27.3 | RNA. regulation of transcription * | 62 |
| 29.5 | protein. degradation * | 60 | 29 | protein * | 57 |
| 29.5.11 | protein. degradation. ubiquitin * | 55 | 30 | signaling * | 32 |
| 30 | signaling * | 48 | 20 | stress * | 31 |
| 29.5.11.4 | protein. degradation. ubiquitin. E3 | 45 | 29.5 | protein. degradation * | 30 |
| 20 | stress * | 43 | 29.5.11 | protein. degradation. ubiquitin * | 25 |
Table 7. ‘RNA’ pathway-related UI-Ts identified in this study. 5UI-Ts represent transcripts containing only 5UIs; 3UI-Ts represent transcripts containing only 3UIs; and 5+3UI-Ts represent transcripts containing both 5UIs and 3UIs.

| BIN | Pathway Name | 5UI-Ts | 3UI-Ts | 5+3UI-Ts |
|---|---|---|---|---|
| 27.3.99 | RNA. regulation of transcription. unclassified | 30 | 12 | 2 |
| 27.3.6 | RNA. regulation of transcription. bHLH, basic Helix-Loop-Helix family | 15 | 4 | 1 |
| 27.3.69 | RNA. regulation of transcription. SET-domain transcriptional regulator family | 12 | 8 | 3 |
| 27.3.29 | RNA. regulation of transcription. TCP transcription factor family | 11 | 6 | 2 |
| 27.3.11 | RNA. regulation of transcription. C2H2 zinc finger family | 11 | 3 | 1 |
| 27.3.21 | RNA. regulation of transcription. GRAS transcription factor family | 10 | 3 | 1 |
| 27.4 | RNA. RNA binding | 0 | 11 | 0 |
| 27.3.67 | RNA. regulation of transcription. putative transcription regulator | 9 | 3 | 2 |
| 27.3.30 | RNA. regulation of transcription. Trihelix, Triple-Helix transcription factor family | 4 | 6 | 0 |
| 27.3.3 | RNA. regulation of transcription. AP2/EREBP, APETALA2/Ethylene-responsive element binding protein family | 5 | 3 | 0 |
| 27.3.12 | RNA. regulation of transcription. C3H zinc finger family | 6 | 1 | 1 |
| 27.1 | RNA. processing | 4 | 3 | 0 |
| 27.1.1 | RNA. processing. splicing | 0 | 5 | 0 |
| 27.3.25 | RNA. regulation of transcription. MYB domain transcription factor family | 5 | 0 | 0 |
| 27.3.37 | RNA. regulation of transcription. AS2, Lateral Organ Boundaries Gene Family | 5 | 0 | 0 |
| 27.3.16 | RNA. regulation of transcription. CCAAT box binding factor family, HAP5 | 0 | 4 | 0 |
| 27.3.19 | RNA. regulation of transcription. EIN3-like (EIL) transcription factor family | 4 | 0 | 0 |
| 27.3.68 | RNA. regulation of transcription. PWWP domain protein | 4 | 1 | 1 |
| 27.3.8 | RNA. regulation of transcription. C2C2 (Zn) DOF zinc finger family | 3 | 1 | 0 |
| 27.2 | RNA. transcription | 3 | 0 | 0 |
| 27.1.2 | RNA. processing. RNA helicase | 0 | 3 | 0 |
| 27.3.1 | RNA. regulation of transcription. ABI3/VP1-related B3-domain-containing transcription factor family | 3 | 0 | 0 |
| 27.3.20 | RNA. regulation of transcription. G2-like transcription factor family, GARP | 0 | 3 | 0 |
| 27.3.24 | RNA. regulation of transcription. MADS box transcription factor family | 3 | 0 | 0 |
| 27.3.52 | RNA. regulation of transcription. Global transcription factor group | 3 | 0 | 0 |
| 27.3.80 | RNA. regulation of transcription. zf-HD | 3 | 0 | 0 |
| 27.1.19 | RNA. processing. ribonucleases | 1 | 1 | 1 |
| 27.3.35 | RNA. regulation of transcription. bZIP transcription factor family | 2 | 0 | 0 |
| 27.3.15 | RNA. regulation of transcription. CCAAT box binding factor family, HAP3 | 0 | 1 | 0 |
| 27.3.26 | RNA. regulation of transcription. MYB-related transcription factor family | 0 | 1 | 0 |
| 27.3.49 | RNA. regulation of transcription. GeBP like | 0 | 1 | 0 |
| 27.3.73 | RNA. regulation of transcription. Zn-finger (CCHC) | 1 | 0 | 0 |
| 27.3.75 | RNA. regulation of transcription. GRP | 0 | 1 | 0 |
| 27.3.84 | RNA. regulation of transcription. BBR/BPC | 1 | 0 | 0 |
| 27.3.89 | RNA. regulation of transcription. ovate family OFP | 1 | 0 | 0 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Cheng, C.; Shi, X.; Wu, J.; Zhang, Y.; Lü, P. Genome-Scale Computational Identification and Characterization of UTR Introns in Atalantia buxifolia. Horticulturae 2021, 7, 556. https://doi.org/10.3390/horticulturae7120556
