Intraspecific Variation within the Utricularia amethystina Species Morphotypes Based on Chloroplast Genomes

Utricularia amethystina Salzm. ex A.St.-Hil. & Girard (Lentibulariaceae) is a highly polymorphic carnivorous plant taxonomically rearranged many times throughout history. Herein, the complete chloroplast genomes (cpDNA) of three U. amethystina morphotypes: purple-, white-, and yellow-flowered, were sequenced, compared, and putative markers for systematic, populations, and evolutionary studies were uncovered. In addition, RNA-Seq and RNA-editing analysis were employed for functional cpDNA evaluation. The cpDNA of three U. amethystina morphotypes exhibits typical quadripartite structure. Fine-grained sequence comparison revealed a high degree of intraspecific genetic variability in all morphotypes, including an exclusive inversion in the psbM and petN genes in U. amethystina yellow. Phylogenetic analyses indicate that U. amethystina morphotypes are monophyletic. Furthermore, in contrast to the terrestrial Utricularia reniformis cpDNA, the U. amethystina morphotypes retain all the plastid NAD(P)H-dehydrogenase (ndh) complex genes. This observation supports the hypothesis that the ndhs in terrestrial Utricularia were independently lost and regained, also suggesting that different habitats (aquatic and terrestrial) are not related to the absence of Utricularia ndhs gene repertoire as previously assumed. Moreover, RNA-Seq analyses recovered similar patterns, including nonsynonymous RNA-editing sites (e.g., rps14 and petB). Collectively, our results bring new insights into the chloroplast genome architecture and evolution of the photosynthesis machinery in the Lentibulariaceae.


Introduction
The species of the carnivorous plant family Lentibulariaceae are grouped in three genera: Pinguicula L., Genlisea A.St-Hil., and Utricularia L. [1,2], and are increasingly becoming important plant models mainly due to their alternative nutrient uptake system, their morphological non-orthodox body The chloroplast genomes of most angiosperms have conserved quadripartite structure separated in Large and Small Single Copy regions (LSC and SSC, respectively) and two inverted repetitive regions (IRs) [16]. However, comparative analyses indicate that some plants, such as parasitic [17], This intraspecific morphological variation resulted in several taxonomic rearrangements since the earliest descriptions at nineteenth century [10,11] and even now there is much controversy about if the species is one or more [12,13]. Taylor (1989) [7] struggled to separate the species based on reproductive characters, such as corolla shape, pedicel sizes, palynological characters, and calyx indumentum, but he couldn't find traits for enough taxonomical circumscriptions to split the different morphotypes. Indeed, in his Utricularia taxonomic monograph, he synonymized 31 taxa under the binomial "Utricularia amethystina" and he wrote "U. amethystina is a most 'difficult' and excessively polymorphic species..." (Taylor, 1989 [7], p. 291). Therefore, he assumed one name for the species, as he was unable to find discontinuities to support taxa separation due to the high degree of polymorphism between populations. However, to date, there is no proper taxonomic treatment to solve this question. In addition, only a few genetic differences have been explored [13], such as the chloroplast regions rps16, trnL-F, trnD-T, and nuclear ITS, but these markers were not able to give enough resolution to distinguish them all. In this context, chloroplast genomes are a valuable resource for phylogenies, and the study of their structure and content can provide clues for improving inter-and intraspecific studies, such as population biology [14], and even the discovery of new species [15].
The chloroplast genomes of most angiosperms have conserved quadripartite structure separated in Large and Small Single Copy regions (LSC and SSC, respectively) and two inverted repetitive regions (IRs) [16]. However, comparative analyses indicate that some plants, such as parasitic [17], mycoheterotrophic (e.g., in [18,19]), and species of carnivorous plants from the order Caryophyllales [20,21], have suffered substantial rearrangement and gene losses throughout plant evolution. For example, across diverse lineages of plants, chloroplast genomes lack NAD(P)H-dehydrogenase (ndh) complex genes, genes that could have been involved in the transition from aquatic to terrestrial habit thought plant evolutionary history [22,23].
All published Utricularia cpDNAs have the typical quadripartite structure and the same genes as most angiosperms. However, some species exhibit variation, such as two complete copies of ycf 1 and ndhF in U. gibba, and U. reniformis, which has reduced chloroplast size due to several losses in all ndhs genes repertoire that could not be integrally found either in the mitochondrial genome [29] nor in the nuclear DNA (unpublished data). Indeed, U. reniformis is a terrestrial species and other assessed Utricularia are aquatic, and based on this observation, Silva et al. (2016) proposed that the loss of ndhs could be related to terrestrial habit [27]. Nonetheless, other studies are still needed for a better understanding of genes especially the evolution of the ndh genes in the genus.
Herein, we present the chloroplast genomes of three Utricularia amethystina morphotypes to assess inter-and intraspecific sequence variability and polymorphic regions that could be used for further phylogenetic studies. In addition, we employed chloroplast transcriptome to assess gene expression and identify the RNA editing sites in each chloroplast. We also compare ndh gene gains and losses across the sequenced Utricularia cpDNA species, to examine the variation of structural changes across the genus.
The SSC regions of Utricularia amethystina morphotypes are similar to most Utricularia, except for U. reniformis, due to the deletion of ndhs genes ( Figure 5), and U. gibba, which have an extra copy of ycf1 and ndhF.  Supplementary Table S5).
The SSC regions of Utricularia amethystina morphotypes are similar to most Utricularia, except for U. reniformis, due to the deletion of ndhs genes ( Figure 5), and U. gibba, which have an extra copy of ycf 1 and ndhF.

Chloroplast Expression
RNAseq clustering analyses indicated distinct expression profiles for each Utricularia amethystina morphotype. The psbA and rbcL genes showed much higher levels of expression, followed by psaA, psaB, psbB, psbC, and psbD, in comparison to other genes in all samples (Figure 7; Supplementary  Table S7).

Chloroplast Expression
RNAseq clustering analyses indicated distinct expression profiles for each Utricularia amethystina morphotype. The psbA and rbcL genes showed much higher levels of expression, followed by psaA, psaB, psbB, psbC, and psbD, in comparison to other genes in all samples (Figure 7; Supplementary  Table S7). The underscore "_1" and "_2" denotes each gene duplicate. The rps12 is a duplicated trans-spliced gene, therefore it was analyzed in three parts.

RNA Edit
The RNA editing analyses were carried out using the PREPACT3 tool, and in-house script to search for validation of editing sites using RNA-Seq mapped reads. The PREPACT3 predicted 154 sites for U. amethystina purple, 140 for U. amethystina white, and 146 for U. amethystina yellow. Most amino acid changes are shared between populations, and comparison between Utricularia amethystina morphotypes showed 22 genes with the same editing sites and 15 genes with differences in the Figure 7. Heat map representation of Utricularia amethystina cpDNA transcript level. The underscore "_1" and "_2" denotes each gene duplicate. The rps12 is a duplicated trans-spliced gene, therefore it was analyzed in three parts.

RNA Edit
The RNA editing analyses were carried out using the PREPACT3 tool, and in-house script to search for validation of editing sites using RNA-Seq mapped reads. The PREPACT3 predicted 154 sites for U. amethystina purple, 140 for U. amethystina white, and 146 for U. amethystina yellow. Most amino acid changes are shared between populations, and comparison between Utricularia amethystina morphotypes showed 22 genes with the same editing sites and 15 genes with differences in the quantity and amino acid composition changes (for more information see Supplementary Tables S8-S10). According to the results, there are 13 types of amino acid transitions in the three U. amethystina. The most prevalent nonsynonymous substitutional changes occurred between Alanine to Valine and Serine to Leucine, followed by Leucine to Phenylalanine, Proline to Serine, Threonine to Isoleucine, Serine to Phenylalanine, Proline to Leucine, Histidine to Tyrosine, Threonine to Methionine, Proline to Phenylalanine, Arginine to Cysteine, Glutamine to Stop codon, and Arginine to Tryptophan (Figure 8). For RNA-Seq-based results, only three nonsynonymous amino acid substitutions were found ( Figure 8; Table 2). Serine to Leucine, followed by Leucine to Phenylalanine, Proline to Serine, Threonine to Isoleucine, Serine to Phenylalanine, Proline to Leucine, Histidine to Tyrosine, Threonine to Methionine, Proline to Phenylalanine, Arginine to Cysteine, Glutamine to Stop codon, and Arginine to Tryptophan ( Figure 8). For RNA-Seq-based results, only three nonsynonymous amino acid substitutions were found ( Figure 8; Table 2). The results from RNA-Seq data showed eight nonsynonymous and eight synonymous sites for the three U. amethystina morphotypes. Regarding nonsynonymous sites, U. amethystina purple and yellow have rps14 and petB genes edition, and for U. amethystina white, edition sites were two in rps14, petB, and ndhB ( Table 2). The same edited sites in the same gene position were found in petB for all U. amethystina samples. In addition, the rps14 gene from purple is the same as rps14 in position 36,572 for white, and rps14 gene from yellow is the same as the rps14 in position 36,497 for white. The results from RNA-Seq data showed eight nonsynonymous and eight synonymous sites for the three U. amethystina morphotypes. Regarding nonsynonymous sites, U. amethystina purple and yellow have rps14 and petB genes edition, and for U. amethystina white, edition sites were two in rps14, petB, and ndhB ( Table 2). The same edited sites in the same gene position were found in petB for all U. amethystina samples. In addition, the rps14 gene from purple is the same as rps14 in position 36,572 for white, and rps14 gene from yellow is the same as the rps14 in position 36,497 for white.

Phylogeny
The phylogenetic tree based on available chloroplast genomes of 15 Lentibulariaceae specimens is shown in Figure 9. Both maximum likelihood (ML) and Bayesian inference (BI) trees exhibited identical phylogenetic topologies, and support values (bootstrap for ML and posterior probabilities for BI) are 100% for all clades. The Lentibulariaceae is known to be monophyletic, and Pinguicula is a sister clade to Utricularia-Genlisea. The results show that the Utricularia genus is monophyletic and the three U. amethystina are closely related to U. reniformis. Also, the U. amethystina yellow is sister to purple, with white as having the same common ancestor (Figure 9).

Phylogeny
The phylogenetic tree based on available chloroplast genomes of 15 Lentibulariaceae specimens is shown in Figure 9. Both maximum likelihood (ML) and Bayesian inference (BI) trees exhibited identical phylogenetic topologies, and support values (bootstrap for ML and posterior probabilities for BI) are 100% for all clades. The Lentibulariaceae is known to be monophyletic, and Pinguicula is a sister clade to Utricularia-Genlisea. The results show that the Utricularia genus is monophyletic and the three U. amethystina are closely related to U. reniformis. Also, the U. amethystina yellow is sister to purple, with white as having the same common ancestor (Figure 9).

Discussion
Due to the plasticity in corolla shape and color, Utricularia amethystina is one of the most polymorphic species within the Utricularia genus. This polymorphism resulted in a historically

Discussion
Due to the plasticity in corolla shape and color, Utricularia amethystina is one of the most polymorphic species within the Utricularia genus. This polymorphism resulted in a historically taxonomic complicated group with its systematics only partially resolved to date. During the last decades, several efforts attempted to separate the different U. amethystina morphotypes into different species, yet without much success [7, 12,13].
In this study, we analyzed the cpDNAs of three morphologically distinct Utricularia amethystina from different populations: the purple, white, and yellow morphotypes, aiming to detect intra-and interspecific variations and phylogenetic signals and provide new cpDNA regions for evolutionary studies. In addition, we evaluated the transcription and RNA editing sites for U. amethystina populations.
In an attempt to diminish the environmental conditions bias, we have collected the specimens from close populations (~2.8 km between purple and white, 0.2 km between white and yellow, and 2.82 km between purple and yellow. The specimens of U. amethystina cpDNA have a typical quadripartite structure present in most land plants and have a similar organization and GC content to other Utricularia [24,27]. Among the three Utricularia amethystina morphotypes, we found an inversion between the petN and psbM genes in U. amethystina yellow, representing the first known gene inversion in LSC region identified in Lentibulariaceae chloroplast genomes. Indeed, the same inversion was detected in the chloroplast genome of species of Cannabaceae [30], and microstructural short inversions of 10 bp were also found in the petN-psbM region in Solanus species [31]. Some comparative cpDNA studies have also identified structural mutations in monilophyte chloroplast genomes, including as many as six inversions and some gene losses (e.g., in [32][33][34]).
In general, chloroplast deletions/losses are observed among Lentibulariaceae. Indeed, Utricularia reniformis suffered a major SSC region retraction due to the losses of NAD(P)H-dehydrogenase (ndh) complex genes [27]. In contrast, all other sequenced Utricularia cpDNAs have complete ndhs gene complexes. These chlororespiratory genes are ndhA, B, C, D, E, F, G, H, I, J, and K, and encode subunits of the NADH dehydrogenase complex in plant chloroplast genomes that play a role in plant signaling in the photosynthesis reaction [35] and the reduction and oxidation of plastoquinones [36]. As U. reniformis is a terrestrial species, it has been proposed that possibly all terrestrial species of Utricularia may lack members from the ndh genes complex [27,28]. However, U. amethystina is terrestrial, and all three morphotypes retain all plastid ndhs complex genes. Therefore, our results now suggest that the ndhs in terrestrial Utricularia were independently lost and regained, thus refuting the hypothesis (at least for Utricularia) that terrestrial species have experienced the loss of ndhs genes.
Chloroplast repeats are important regions for replication and DNA stability [37]. Microsatellites or SSRs are tandem repeats of 1-6 base pairs units long that can be used as genetic markers [38]. They are most commonly found in plants and due to genetic variation in the number of tandem repeats units. Therefore, as they produce polymorphism detectable with PCR-based methods banding pattern and genotyping, the SSRs are widely used in population genetics and evolutionary studies [39]. Utricularia amethystina has high amounts of mononucleotide repeats in the cpSSR, which is similar to other angiosperms, such as Arabidopsis thaliana [40], and other Lentibulariaceae [24,25,27]. Previous results for Utricularia indicated that most of SSR were found in coding regions for U. gibba, U. macrorhiza, Genlisea margaretae, and Pinguicula ehlersiae. However, for U. reniformis, more cpSSR were found in non-coding regions. In U. amethystina, long repeats have similar quantities between populations, and as seen in other Utricularia, most of them are in coding regions [24,27], an uncommon fact for other angiosperms chloroplast genomes (e.g., see [41]), which could indicate high rates of recombination and rearrangement, as discussed in Silva et al. (2016) [27]. Although long repeats could be the cause for gene rearrangements, we could not find repeats in flanking regions of the genes psbM and petN in U. amethystina yellow. Therefore, this indicates that other evolutionary forces were involved with the observed inversion of these genes in this species.
For some Utricularia, DNA barcoding approaches have been considered a difficult task to perform. For instance, the DNA-barcoding markers, such as ITS, rbcL, and matK, could not discriminate all Utricularia accessions at the species and population level due to their low level of polymorphism (e.g., Utricularia sect. Utricularia in Astuti et al., 2019 [42]). Furthermore, rps16, trnL-F, and trnD-T markers cannot discriminate U. amethystina populations [13]. Therefore, it is important to explore regions with high variability at inter-and intraspecies levels that represent potentially useful markers for future studies. Using mVISTA results for the interspecific divergence analysis, it is noticeable that the LSC and SSC regions are more variable than IR regions, corroborating with the results found for identity analyses with Genlisea species [25] and other angiosperms [41]. The results showed highly variable regions between the different species, mostly represented by intergenic spacers, such as trnH-psbA, trnK-rps16, and rps16-trnQ, which could be used for interspecies identification.
It is previously proposed that populations from closely related environments should be less divergent if they are of the same species. However, we observed high intraspecific chloroplast sequence variability, although geographical sampling covered a restricted area. Among the regions with high nucleotide diversity and intraspecific variations, there is the intergenic spacer, trnH-psbA, which is already being used as DNA barcoding in many studies [43]. This study also revealed spots that can be used for populations and phylogenetic analyses due to high variability, such as the genes trnH, psbA, and intergenic regions, such as trnH-psbA, ycf 3-trnS, and rps16-trnQ (see more in the Results section and Supplementary Material S6). Nevertheless, the spots of diversity near the genes petN and psbM should be avoided due to low primer annealing considering that the region could be inverted, as seen in U. amethystina yellow.
The preparation of paired-end libraries was enriched for polyadenylated transcripts which causes the instability of organelle transcripts, therefore there is probably underrepresentation of transcripts [44]. However, we were able to observe that almost all chloroplast protein-coding genes are expressed in all sampled flower tissue of U. amethystina, except for ycf 15, both rpl23 duplicated genes in U. amethystina purple and yellow, and the atpF gene in U. amethystina purple (Supplementary Table S7).
The expression profile is similar between samples of the same morphotypes and even expression profile clustering corresponds to the phylogenetic hypothesis proposed in this research. The rbcL gene was one of the most highly expressed genes and encodes for one of the most abundant enzymes in nature, the large subunit of ribulose-1-5-biphosphate carboxylase [45]. This protein is involved in fixing CO 2 and photorespiration [46] . Moreover, high levels of gene expression were found in Photosystem I (PSI) and II genes (PSII), such as psaA and psaB, and psbA, psbB, psbC, and psbD, these proteins are involved in photosynthesis [47]. Studies of barley leaf activities showed that dark-grown plants were deficient in PSI and PSII proteins [48]. Moreover, Klein et al. (1988) showed that the elongation of translation in psaA, psaB, psbA, and rbcL are regulated by light [49,50]. Therefore, considering that the corollas were collected during the day, our results are congruent with the hypothesis of a protein exhibiting light-induced translation.
Interestingly, the petN and psbM genes are expressed in all Utricularia amethystina biological samples, indicating that, the inversion observed in U. amethystina yellow did not affect the expression of these genes.
RNA editing sites are common features of a plant chloroplast. These mutations usually occur from C-to-U in mRNA molecules, and thus have an important role in the differential amino acid generation that can lead to different proteins originated from the same gene [51]. RNA-Seq-based results showed that there is a sum of eight editing sites for all U. amethystina morphotypes ( Table 2).
The PREPACT3 s prediction showed that most nonsynonymous substitutions were characterized as Alanine to Valine and Serine to Leucine. Both lead to protein variations ( Table 2, Tables S8-S10), whereas amino acid changes from Alanine to Valine, Histidine to Tyrosine, Leucine to Phenylalanine, Proline to Phenylalanine, Proline to Leucine, Proline to Serine, Arginine to Tryptophan, Threonine to Isoleucine, and Threonine to Methionine result in no physicochemical properties changes in protein. In addition, the Arginine to Cysteine, Serine to Leucine, and Serine to Phenylalanine mutations can modify protein formation due to hydrophilic (Serine and Arginine) to hydrophobic (Leucine, Phenylalanine, and Cysteine) molecule changes [52,53]. Moreover, PREPACT3 has predicted that the genes rps2 and rpl32 can be edited from Glutamine into a Stop codon, and despite they could be polycistronic genes as in other plants [54], these genes are still transcribed according to RNA-Seq data.
The presented evolutionary history, based on whole chloroplast DNA genomes, and reconstructed by ML and BI approaches supported the same relationship within the Lentibulariaceae when compared with one or few loci-loci analyses ( Figure 8) [2,55]. These analyses and many other studies indicated that Utricularia amethystina can be paraphyletic [13]. However, in this study, despite the differences in the specimen, they are still a monophyletic taxon. This indicated that U. amethystina morphotypes have a common ancestry, but the sampling of other species from sect. Foliosa (U. tricolor and U. tridentata) and species from the close phylogenetically related sect. Psyllosperma would be necessary to shine this issue.
Our results support that the sampling based on three different morphotypes proved to be insufficient to allow firm conclusions on the U. amethystina species separation, considering we sampled one individual per morphotype. However, the scenario presented here based on chloroplast genomes suggests that U. amethystina morphotypes may be different species as previous studies based on morphometric approach [12] and phylogeny with few loci [13], but with more populations, have suggested.
Moreover, the comparative and functional analyses provided by this study bring new insights into the Utricularia chloroplast genome architecture, in particular, the evolutionary history of ndh complex genes and other important photosynthesis-related genes. Taken together, these results prove that we are just in the beginning for the understanding of the evolution of chloroplast photosynthesis machinery in the Lentibulariaceae.

Sampling and DNA Extraction
The three Utricularia amethystina morphotypes were collected from natural populations geographically close to each other (~2.8 km between purple and white; 0.2 km between white and yellow; and 2.82 km between purple and yellow). The samples were preserved in silica gel and stored at room temperature. Vouchers were deposited at Universidade Estadual Paulista (UNESP) in the Herbarium JABU (Table S1). The total genomic DNA was extracted from approximately 0.1 µg of flowers with Qiagen DNeasy Plant Mini extraction kit (Qiagen, Hilden, Germany) following manufacturer's protocol. The quality and quantity of DNA were estimated with Nanodrop 2000 (Thermo Scientific, MA, USA) and Qubit fluorometer (Life Technologies, CA, USA), respectively.

Organellar Genome Sequencing, Assembly and Annotation
Sequence libraries were quantified using Bioanalyzer 2100 (Agilent, CA, USA). The paired-end libraries were prepared using Illumina library preparation manufacturer's protocol, and genomic DNA of 2 × 100 bp and insert size of~200 bp was sequenced using Illumina MiSeq platform (Illumina, San Diego, CA, USA) The produced paired-end reads filtered for adapters, low-quality bases (Phred score Q > 24) and size (length cutoff for 50 bp), and possible contaminants using Trimmomatic v. 0.38 [56]. The resulting paired-end reads were mapped in the search for discarding mitochondria and nuclear reads with Bowtie2 v.2.2.3 [57] using very sensitive local and -N 1 parameters and Utricularia spp. (NC_021449, KY025562, and KT336489) chloroplasts as reference genomes. The resulting reads were assembled using SPAdes v. 3.7.1 [58] and regions with assembly uncertainties were extended using iterative read mapping performed using MITObim v.1.8 [59].
The organelles genomes were primarily annotated using DOGMA [60], cross checked with GeSeq [61], and start and stop codons were adjusted manually for annotation refinements. The tRNAs were annotated using tRNA-scan [62], implemented in DOGMA and Aragorn [63]. The rRNAs were annotated using RNAmmer and BLASTn searches with available Utricularia cpDNA genomes. The cp genome map was constructed using the Organellar Genome Draw program [64].

Repeats and SSR Analyses
To avoid redundant results, only one IR of each Utricularia amethystina cpDNA was used and direct, forward, reverse, and palindromic repeats were identified using REPuter [65] with a minimal size of 30 pb and Hamming distance of 3. Simple sequence repeats (SSR) were detected using MISA-web [66] by setting the minimum number of repeats to 7, 4, 4, 3, 3, and 3, for mono-, di-, tri-, tetra-, penta-, and hexanucleotides, respectively.

Intraspecific Polymorphism Analyses
For polymorphism analyses, the Utricularia amethystina chloroplasts were aligned using MAFFT v.7, with default parameters. Based on the cpDNA multiple alignments, polymorphism analysis was conducted for coding genes, introns, and intergenic spacers. The nucleotide diversity was calculated using Tassel v.5.2.54 [71] with a sliding window of 500 bp length.

RNA Extraction, Sequencing and RNA Editing Site Analyses
The corollas of Utricularia amethystina were stored in RNAlater ® (Thermo Fisher Scientific, MA, USA) from each analyzed population and were used as plant tissues for RNA-Seq. The corollas (~5 per specimen) were pooled in three replicates for U. amethystina white and purple and two for U. amethystina yellow, and total RNA was extracted using PureLink RNA MiniKit (Thermo Fisher Scientific, MA, USA), according to manufacturer's protocol. The extracted RNA was analyzed with Agilent 2100 Bioanalyzer and Qubit 2.0 Fluorometer for quality and quantity assessment, and only samples with RNA Integrity Number (RIN) > 7.0 were used for the sequencing.
The eight libraries (3 libraries for each U. amethystina purple and white and 2 for U. amethystina yellow) were constructed following the TruSeq Stranded mRNA LS Protocol sample preparation protocol. The paired-end (2 × 100 pb) sequencing was performed in one lane on an Illumina platform (HiSeq 2500) following supplier-provided protocols (Illumina, San Diego, CA, USA).
The raw sequencing data, was preprocessing with high stringency using the following steps. (1) For the 3 end, the adapter and low-quality reads were removed using Scythe (https://github.com/ vsbuffalo/scythe; default parameters, except for -n 5 and -M 15); (2) for the 5 end, the removal of adapter and low quality reads were performed with Cutadapt [72]; default parameters, except for -overlap 5; -minimum-length = 15; -times = 2); (3) to filter reads with more than 30% of unknown base (Ns), polyA/T tails we used the software Prinseq [73].
Filtered RNA-seq reads were mapped against the assembled chloroplast genome using STAR version 2.7.2a [74], using default parameters except for adjusted parameters to perform an end-to-end mapping, diminish multiple mapping of the same reads, minimum and maximum size of introns and the number of allowed mismatches (-outFilterMultiMapMax = 3; -outFilterMismatchNmax = 2; -outFilterMismatchNoverLmax = 0.1; -outSJfilterReads = Unique; -alignEndsType = EndtoEnd; -alignIntronMin = 70; -alignIntronMax = 2500). To estimate differential transcripts abundance between biological replicates, normalized count data was obtained using relative log expression (RLE) method in DESeq2 version 3.9 [75] and results were showed following with log2(norm. counts+1). The rps12 is a duplicated trans-spliced gene, therefore it was analyzed in three parts and "_2" and "_3" represent the duplicated regions. The putative RNA edit sites were predicted following PREPACT3 software [76] with BlastX searches (using default parameters) against the Nicotiana tabacum, as reference. The prediction results were compared with the results obtained with an in-house script that counts the number of editing sites according to the previous STAR mapped reads, except for the number of mapped reads, which was set to 1 (only uniquely mapped reads). All of the sites were inspected for C to U nucleotide substitutions by a custom Perl script, with the use of the following parameters; presence in at least two of the biological replicates, editing set with a minimum coverage of 10×.