Differentially Expressed Genes Shared by Two Distinct Cytoplasmic Male Sterility (CMS) Types of Silene vulgaris Suggest the Importance of Oxidative Stress in Pollen Abortion

Cytoplasmic male sterility (CMS), encoded by the interacting mitochondrial and nuclear genes, causes pollen abortion or non-viability. CMS is widely used in agriculture and extensively studied in crops. Much less is known about CMS in wild species. We performed a comparative transcriptomic analysis of male sterile and fertile individuals of Silene vulgaris, a model plant for the study of gynodioecy, to reveal the genes responsible for pollen abortion in this species. We used RNA-seq datasets previously employed for the analysis of mitochondrial and plastid transcriptomes of female and hermaphrodite flower buds, making it possible to compare the transcriptomes derived from three genomes in the same RNA specimen. We assembled de novo transcriptomes for two haplotypes of S. vulgaris and identified differentially expressed genes between the females and hermaphrodites, associated with stress response or pollen development. The gene for alternative oxidase was downregulated in females. The genetic pathways controlling CMS in S. vulgaris are similar to those in crops. The high number of the differentially expressed nuclear genes contrasts with the uniformity of organellar transcriptomes across genders, which suggests these pathways are evolutionarily conserved and that selective mechanisms may shield organellar transcription against changes in the cytoplasmic transcriptome.


Introduction
Cytoplasmic male sterility (CMS) is a special case of mitochondrial-nuclear interaction in flowering plants, leading to the inability to produce viable pollen [1,2]. CMS is caused by mitochondrial genes, which encode proteins that are harmful for mitochondrial function in the course of pollen development. CMS genes are often chimeric, composed of several pieces of other mitochondrial genes. Their expression is influenced by nuclear factors (Rf), capable of restoring male fertility [3]. Plant individuals not producing pollen develop a female (F) gender, and those with restored male fertility become hermaphrodites (H). These two genders-females and hermaphrodites-constitute the basis of the plant breeding system gynodioecy [4], the second most widespread angiosperm reproduction system after hermaphroditism [5,6].

Trimming and Quality Control of Paired-End Reads
We used RNA-seq data from total RNA extracted from flower buds of three F and three H plants of S. vulgaris KRA [12] and KOV [24]. The data were deposited within the NCBI Sequence Read Archive BioProject PRJNA321915. Reads from H and F plants of the KRA haplotype are saved under the accessions SRS2438489 and SRS2438490, respectively; reads from H and F plants of the KOV haplotype are saved under the accessions SRS2113743 and SRS2113744, respectively.
To obtain the reads derived from the cytoplasmic part of S. vulgaris, custom shell and Python scripts were used to remove all mitochondrial and plastid reads from the total paired RNA reads. In detail, all reads aligned to the mitochondrial and chloroplast/plastid genome were obtained from the corresponding BAM files, converted to FASTQ format and then, the names of these reads were saved in a list file and sorted. The Python program (filter_reads.py) was then used to compare names in the list and in the total reads file. All reads in the list were subsequently removed from the total reads files, as the resulting read files may have single (unpaired) reads, which will break pairing, these singletons were removed, too. This part was done with Python program fastqCombinePairedEnd.py and resulting paired reads saved in FASTQ format.
In the next step Rcorrector [26], which is an error correction method based on a De Bruijn graph to compactly represent all trusted k-mers in the input reads, was used, as sequencing errors in short RNA-seq reads may complicate bioinformatics analyses, e.g., alignment and assembly. Rcorrector has a higher accuracy than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient.
After reads were error-corrected, TrimGalore, a wrapper script for quality and adapter trimming as well as quality control, was applied. Here we used TrimGalore instead of FastQC as FastQC is registering adapter contamination only if the entire sequence length (some 12 or 13 bp) is detected. The adapter trimming within Trim Galore however also removes trailing bases from reads that look like adapter sequences.
A ribosomal RNA filtering step was performed with the quality and adapter trimmed reads using SortMeRNA [27], a program tool for filtering, mapping and OTU-picking in next generation sequencing (NGS) reads. SortMeRNA takes as input a file of reads (fasta or fastq format) and one or multiple rRNA database file(s), and sorts separate rRNA from total RNA into two files specified by the user. It can provide high quality local alignments of rRNA reads against an rRNA database. We used the eukaryotic and rfam databases.

De Novo Assembly
The resulting reads were used for de novo assembly with Trinity v2.9.0 [28] with the in silico normalization. To remove redundancies in the assembly CD-Hit [29] was employed with an identity cutoff of 99.9%.
Finally, we used the EvidentialGene [30] tr2aacds.pl pipeline script to classify not-clustered transcripts from the Trinity assembly into valid, non-redundant coding sequences (the 'okay' set), and redundant, fragmented or non-coding transcripts, using the parameters-m -MINCDS = 50. We used this okay set for further analyses as this best represents the biologically real set of transcripts captured by our data.

Evaluation of Assemblies
For the evaluation of the different assemblies several methods were applied. First, assembly stats of the resulting assemblies were examined with the TrinityStats.pl script and are summarized in Table 1. Second, BUSCO [31], a tool to assess transcriptome completeness, based on evolutionarily-informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB v9, was used with the embryophyta_odb9 set. The results were visualized with the provided script generate_plot.py. Third, Detonate (DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation) RSEM-EVAL [32] was employed to further evaluate de novo transcriptome assemblies. The last method performed, was completeness and contiguity evaluation according to [33]. Completeness measures the degree to which the transcriptome is covered by the assembled contigs and is estimated by calculating the percentage of genes in the annotated gene catalog that are covered at >80% of the gene length, using an established set of previously annotated genes of the same or a closely related organism. Here, we used the gene set of A. thaliana (APVO SSC genes).
Contiguity measures how likely a known gene is to be assembled into a single contig covering the full length of the gene and was estimated by calculating the percentage of complete genes covered by a single contig to >80% of the gene length.

Estimation of Transcript Quantification and Differential Gene Expression
Transcript quantification was done with Trinity and the included scripts for downstream analyses [34]. We used the alignment based estimation method RSEM [35] with bowtie2 [36] and the alignment-free method Salmon [37] with the strand specific option-SS_lib_type RF. Followed by differential gene expression using the Bioconductor packages DESeq2 [38] and edgeR [39]. Differentially expressed features were identified with biological replicates, three for hermaphrodites and females, each. Extraction of differentially expressed genes was done within the Trinity pipeline and parameters-C 5e-2 (cutoff for FDR)-P 0 (min abs (log2(a/b)) fold change). We decided to use the statistically more rigorous FDR, improving the integrity of the set of detected DEG and minimizing the proportion of false positives below the 0.05 (5%) threshold. This value is directly related to the probability that the DEG detected are true DEG or not, while an applied log-fold change threshold (0) is not directly connected to this and does not account for the variability of the expression values. The resulting differentially expressed genes from all four methods were compared using a custom R script and visualized in a Venn diagram.
Next, blastx based homology searches (BLAST + 2.9.0) for both the KRA and KOV S. vulgaris de novo transcriptome assemblies against the NCBI nr protein database were performed. The cutoff E-value was set to <10 −4 , and maximum number of allowed hits was set to 10. The OmicsBox program v. 1.3.3 (BioBam Bioinformatics S.L., Valencia, Spain) was then used to annotate unigenes on the basis of gene ontology (GO) terms, InterProScan and nr database annotation. The same program was used for the visualization of the GO functional classification and the distribution of the gene functions.
To identify significantly enriched GO terms Fisher's exact test and Gene Set Enrichment Test of OmicsBox based on an FDR-value cutoff of 0.05 were implemented. To summarize pathway information, the KEGG (Kyoto Encyclopedia of Genes and Genomes) in OmicsBox and Automatic Annotation Server (KAAS) were used to perform pathway annotation. Pathways with a p-value ≤ 0.05 were considered significantly enriched.

Reciprocal Blast Search
For the prediction of putative orthologs between S. vulgaris and A. thaliana, a reciprocal blast search was used. First, a local Blast protein database for the S. vulgaris translated genes was created. This database was searched for hits with the genes of the A. thaliana TAIR10 genome (www.arabidopsis. org). Then a list of the hits was created and these nucleotide sequences selected from the assembly of S. vulgaris using the script faSomeRecords (webpage is github.com/ENCODE-DCC/kentUtils/tree /master/src/utils/faSomeRecords). This reduced dataset was blasted in a second step against the database of A. thaliana TAIR10 proteins. Both blastx results were compared for the best hit and shared pairs picked with the python script recblast.py as best reciprocal pair and putative orthologs. The same script was applied for the search of orthologs between the S. vulgaris haplotypes KOV and KRA, except blastn was used instead of blastx.  [24].
Quantitative PCR was performed using the LightCycler 480 SYBR Green I Master Kit and the LightCycler 480 instrument (Roche Applied Science). The reaction mixture contained 5.0 µL 2× MasterMix, primers in specific concentrations (Table S1), and 2.5 µL of 20× diluted first-strand cDNA in a total volume of 10 µL. The LightCycler was programmed with an initial denaturation for 5 min at 95 • C followed by 50 cycles of 10 s at 95 • C, 10 s at 58-60 • C (based on primers), and 15 s at 72 • C. PCR efficiencies were estimated from calibration curves generated from serial dilution of cDNAs. Three technical replicates per each specimen were measured. A calibrator (the same cDNA sample present in all plates) was used to correct for run-to-run variation. Expression values were normalized against the reference gene encoding actin (SvACT), as its stable expression in S. vulgaris was reported by [40].

Statistical Evaluation
The relative expression values were plotted in graphs and the outliers (zero to two in each dataset) were identified, if they were higher than twice the interquartile range (the difference between the first and third quartile). A one-way ANOVA followed by Tukey's test as implemented in PAST v.3.14 (PAleontological STatistics) was applied on each gene expression data set.

Transcriptome Assembly, Reduction of Duplicates
We used Illumina raw reads from the RNA-seq data sets obtained from three female and three hermaphrodite flower buds of S. vulgaris haplotype KRA and haplotype KOV deposited under the Short Read Archive accession number PRJNA321915. As this data set was meant to analyze the organellar transcriptomes, which often do not possess polyA tails, polyA RNA selection was not performed [41]. Instead, rRNA was eliminated by the Ribo-Zero technique prior to cDNA library construction. To assemble the nuclear-encoded transcriptomes, we first removed all reads mapping to the organellar genomes. Then we eliminated reads derived from rRNAs, which survived rRNA elimination step before cDNA library preparation. The number of reads after the filtering steps dropped to 41% and 71% ( Table 1). The decrease was higher in the KRA dataset, which might have been caused Cells 2020, 9, 2700 6 of 24 by the different read length, distinct efficiency of rRNA elimination before the library construction, or different proportion of organellar reads.
To optimize trimming, we performed de novo assembly with Trinity under various minimum length of trimmed reads, and compared the resulting assemblies based on the Detonate, BUSCO, contiguity, and completeness scores. We selected the assembly constructed from the reads with a minimum length of 146 nt (k-mer 32) for the KRA cytoplasmic transcriptome ( Figure S1) and the reads with a minimum length of 65 nt (k-mer 26) for the KOV cytoplasmic transcriptome ( Figure S2).
To reduce the number of duplicates we used CD-Hit, but the resulting assemblies were still highly redundant as documented by BUSCO. We have therefore applied EvidentialGene which resulted in a dramatic decrease in the number of duplicates (Figure 1), as well as in the total number of assembled genes and transcripts. The reduction was accompanied by a slight loss of information documented by the lower values of completeness and contiguity ( Table 2). Cells 2020, 9, x FOR PEER REVIEW 6 of 23 of assembled genes and transcripts. The reduction was accompanied by a slight loss of information documented by the lower values of completeness and contiguity (Table 2). Genetic diversity and heterozygosity in S. vulgaris are high, as documented e.g., by [42]. This diversity may complicate both transcriptome assembly and read mapping. To ensure the maximum read mapping efficiency, we constructed a separate de novo transcriptome for each of the two S. vulgaris haplotypes KRA and KOV, and used it as a reference for read mapping. Whereas the KOV and KRA reads were mapped to their own assemblies from 81% and 74%, respectively, about 7% fewer reads were properly mapped against the opposite reference transcriptome. We have therefore continued the following analyses with the reference de novo transcriptomes constructed separately for each dataset.  Reduction of the duplicated transcripts in the transcriptome assemblies KRA (Trinity-146_32_cd-hit-plantsonly_KRA_evigene) and KOV (Trinity_CD-Hit_999_KOV_evigene) treated with EvidentialGene.
Genetic diversity and heterozygosity in S. vulgaris are high, as documented e.g., by [42]. This diversity may complicate both transcriptome assembly and read mapping. To ensure the maximum read mapping efficiency, we constructed a separate de novo transcriptome for each of the two S. vulgaris haplotypes KRA and KOV, and used it as a reference for read mapping. Whereas the KOV and KRA reads were mapped to their own assemblies from 81% and 74%, respectively, about 7% fewer reads were properly mapped against the opposite reference transcriptome. We have therefore continued the following analyses with the reference de novo transcriptomes constructed separately for each dataset.

Functional Annotation
The KRA and KOV transcriptomes were annotated by OmicsBox v.1.3.3 on the basis of blastx with E-value <10 −4 . 53% and 45% of Trinity 'genes' returned a blastx hit and 38% and 34% were annotated by B2GO from the KRA and KOV assemblies treated by EvidentialGene, respectively ( Figure 2). The total number of annotated Trinity 'genes' was 32,022 in the KRA assembly and 22,037 in the KOV assembly. Owing to a higher number of reliably annotated genes in the KRA transcriptome, we focused primarily on the KRA assembly in the gene expression analyses.

Functional Annotation
The KRA and KOV transcriptomes were annotated by OmicsBox v.1.3.3 on the basis of blastx with E-value <10 −4 . 53% and 45% of Trinity 'genes' returned a blastx hit and 38% and 34% were annotated by B2GO from the KRA and KOV assemblies treated by EvidentialGene, respectively ( Figure 2). The total number of annotated Trinity 'genes' was 32,022 in the KRA assembly and 22,037 in the KOV assembly. Owing to a higher number of reliably annotated genes in the KRA transcriptome, we focused primarily on the KRA assembly in the gene expression analyses.

Estimation of Differentially Expressed (DE) Genes
The estimation of DE genes depends on the specific method and on the selected parameter threshold. Particular caution shall be taken when the genetic material is diverse. The S. vulgaris plants under study were the descendants of the individuals collected in the field. These highly genetically variable wild plants contrast starkly with the genetically homogenous agricultural cultivars used for the transcriptomic studies of CMS in crops. Our original KOV and KRA datasets [12,24] were generated from the plants cultivated in the greenhouse at different time periods under slightly different conditions. The rate of morphological development of flower buds varied even among full siblings. Lower or higher expression might have been caused by slightly accelerated or delayed floral development.
To identify the most reliably DE genes we applied the combination of four different methods to find DE genes between F and H flower buds (Salmon plus DESeq2, Salmon plus edgeR, RSEM plus DESeq2, RSEM plus edgeR) with FDR <0.05. We estimated 14,660 DE genes by at least one pipeline, and 1823 DE genes by all four pipelines, when we used the KRA Trinity assembly treated by CD-Hit as the reference. However, when we analyzed DE genes with the reduced KRA Trinity assembly treated by EvidentialGene as the reference, we discovered 8811 DE genes identified by at least one pipeline, and 3022 genes concurrently by all four pipelines ( Figure 3b). The four methodical approaches provided much more consistent results, when the assembly was treated by EvidentialGene compared to the untreated assembly. From this reason, we used the EvidentialGene assembly for the subsequent differential expression analyses. The estimation of DE genes in the KOV transcriptomes generated similar results as the KRA transcriptomes (Figure 3a).

Estimation of Differentially Expressed (DE) Genes
The estimation of DE genes depends on the specific method and on the selected parameter threshold. Particular caution shall be taken when the genetic material is diverse. The S. vulgaris plants under study were the descendants of the individuals collected in the field. These highly genetically variable wild plants contrast starkly with the genetically homogenous agricultural cultivars used for the transcriptomic studies of CMS in crops. Our original KOV and KRA datasets [12,24] were generated from the plants cultivated in the greenhouse at different time periods under slightly different conditions. The rate of morphological development of flower buds varied even among full siblings. Lower or higher expression might have been caused by slightly accelerated or delayed floral development.
To identify the most reliably DE genes we applied the combination of four different methods to find DE genes between F and H flower buds (Salmon plus DESeq2, Salmon plus edgeR, RSEM plus DESeq2, RSEM plus edgeR) with FDR <0.05. We estimated 14,660 DE genes by at least one pipeline, and 1823 DE genes by all four pipelines, when we used the KRA Trinity assembly treated by CD-Hit as the reference. However, when we analyzed DE genes with the reduced KRA Trinity assembly treated by EvidentialGene as the reference, we discovered 8811 DE genes identified by at least one pipeline, and 3022 genes concurrently by all four pipelines (Figure 3b). The four methodical approaches provided much more consistent results, when the assembly was treated by EvidentialGene compared to the untreated assembly. From this reason, we used the EvidentialGene assembly for the subsequent differential expression analyses. The estimation of DE genes in the KOV transcriptomes generated similar results as the KRA transcriptomes (Figure 3a).

Enrichment of GO Categories
We applied Fisher's exact test implemented in OmicsBox v.1.3.3 (FDR < 0.05) to find significantly enriched GO terms. We processed all the annotated transcripts identified as significantly downregulated or upregulated in females by all four pipelines as described above. We found 315 enriched GO categories among downregulated DE genes and 174 enriched GO categories among upregulated DE genes in the KRA transcriptomes (Table S2). The most specific GO categories downregulated in females are depicted in Figure S3. They include cell wall modification, sporopolenin biosynthesis, tapetum morphogenesis, sugar transmembrane transporter, fatty acid elongation, triglyceride biosynthesis, and cellular iron ion homeostasis. The most specific GO terms upregulated in females are shown in Figure S4, including responses to salt and oxidative stresses or wounding, leaf senescence, peroxidase activity, abscisic acid signaling pathway, or nitrate transport.
The GO enrichment analysis of the KOV transcriptomes produced similar results as the KRA transcriptomes with the genes upregulated in females, but no enriched GO terms were found in the set of downregulated genes at FDR <0.05. This result may be related to the lower number of transcripts assembled (Table 2) and annotated ( Figure 2) in the KOV transcriptome, constructed from shorter reads with minimum length of 65 nt compared to its KRA counterpart, generated from reads with minimum length of 146 nt.

DE Genes Shared by KRA and KOV Plants
We aimed to find the genes which are activated or inhibited in female flower buds of S. vulgaris regardless of the mitochondrial haplotype. We applied reciprocal blast hit to identify the transcripts orthologous between KRA and KOV haplotypes. 31,081 orthologous pairs were found among 84,004 KRA (38%) and 64,549 KOV Trinity 'genes' (48%) in the final Evigene-treated assemblies. 21,629 (70%) orthologous pairs returned blastx hits, suggesting that the proportion of annotated genes was higher among orthologs between KRA and KOV than in the entire gene set.
We selected the orthologous pairs identified as DE genes by all four methods in both KRA and KOV transcriptomes to get a reliable set of DE genes shared between the KRA and KOV plants. There were 162 downregulated (104 with an informative blastx hit) ( Table 3) and 82 (76 with an informative blastx hit) upregulated orthologs (Table 4). In addition, we compiled a more comprehensive list of 468 annotated orthologous pairs, which were recognized as DE genes by at least one of the four methods in both KRA and KOV plants (Table S3).

Enrichment of GO Categories
We applied Fisher's exact test implemented in OmicsBox v.1.3.3 (FDR < 0.05) to find significantly enriched GO terms. We processed all the annotated transcripts identified as significantly downregulated or upregulated in females by all four pipelines as described above. We found 315 enriched GO categories among downregulated DE genes and 174 enriched GO categories among upregulated DE genes in the KRA transcriptomes (Table S2). The most specific GO categories downregulated in females are depicted in Figure S3. They include cell wall modification, sporopolenin biosynthesis, tapetum morphogenesis, sugar transmembrane transporter, fatty acid elongation, triglyceride biosynthesis, and cellular iron ion homeostasis. The most specific GO terms upregulated in females are shown in Figure S4, including responses to salt and oxidative stresses or wounding, leaf senescence, peroxidase activity, abscisic acid signaling pathway, or nitrate transport.
The GO enrichment analysis of the KOV transcriptomes produced similar results as the KRA transcriptomes with the genes upregulated in females, but no enriched GO terms were found in the set of downregulated genes at FDR <0.05. This result may be related to the lower number of transcripts assembled ( Table 2) and annotated ( Figure 2) in the KOV transcriptome, constructed from shorter reads with minimum length of 65 nt compared to its KRA counterpart, generated from reads with minimum length of 146 nt.

DE Genes Shared by KRA and KOV Plants
We aimed to find the genes which are activated or inhibited in female flower buds of S. vulgaris regardless of the mitochondrial haplotype. We applied reciprocal blast hit to identify the transcripts orthologous between KRA and KOV haplotypes. 31,081 orthologous pairs were found among 84,004 KRA (38%) and 64,549 KOV Trinity 'genes' (48%) in the final Evigene-treated assemblies. 21,629 (70%) orthologous pairs returned blastx hits, suggesting that the proportion of annotated genes was higher among orthologs between KRA and KOV than in the entire gene set.
We selected the orthologous pairs identified as DE genes by all four methods in both KRA and KOV transcriptomes to get a reliable set of DE genes shared between the KRA and KOV plants. There were 162 downregulated (104 with an informative blastx hit) ( Table 3) and 82 (76 with an informative blastx hit) upregulated orthologs (Table 4). In addition, we compiled a more comprehensive list of 468 annotated orthologous pairs, which were recognized as DE genes by at least one of the four methods in both KRA and KOV plants (Table S3).       We also searched for DE genes specific only to one haplotype, either KOV or KRA, with not being DE in the other. However, we found no ortholog identified as a DE gene by all four (or at least three) methods in one transcriptome, which was not at the same time recognized as DE by at least one method in the other transcriptome. This result does not mean that such genes do not exist. They may be present, but they did not meet our criteria. The KOV transcriptomic assembly contained less transcripts with shorter average length (Table 2), which might have complicated the correct ortholog identification, particularly in less expressed genes with a higher risk of transcript misassembly. We are now working on a comprehensive comparison of the KRA and KOV haplotypes with transcriptomes sampled and sequenced at the same time, which also includes backcrossed plants with exchanged mitochondria, harboring KOV mitochondria in the KRA nuclear background and vice versa. This study will reveal the DE genes specific for the particular CMS type, if they exist.
Our results document numerous genes differentially expressed between F and H flower buds, which were shared by the KRA and KOV haplotypes of S. vulgaris carrying distinct CMS types. We describe the putative functions of selected significantly DE genes in the following paragraphs.

Genes Involved in Male Functions Were Downregulated in S. vulgaris Female Flower Buds
Numerous DE genes downregulated in S. vulgaris females were associated with male structures-pollen or anthers (Table 3). For example, the SMALL AUXIN UP RNA (SAUR) 63 gene subfamily represented by DN45544_c4_g1 regulates gibberellin-dependent stamen filament elongation in A. thaliana [43].
Tapetum, the innermost layer of anther ensuring the nutrition of developing microspores and pollen grains [44], is particularly important in pollen development. The abnormal development of tapetal tissue may result in the degradation of microspores [45]. The GO terms 'sporopolenin biosynthesis' or 'tapetum morphogenesis' were enriched among the downregulated genes. The NODULIN INTRINSIC PROTEIN 7;1(NIP7;1) gene (DN49199_c0_g1) codes for a channel supplying boron necessary for pollen cell wall synthesis, which is expressed in the tapetum of A. thaliana [46]. Another downregulated transcript (DN57560_c0_g2) is homologous to BARELY ANY MERISTEM (BAM2) encoding putative leucine-rich repeat receptor-like kinase, which controls tapetal cells' development in A. thaliana [47]. The MYB transcription factor ABORTED MICROSPORES (AMS) (DN59645_c1_g1) regulates tapetal cell development and microspore formation and is associated with genic male sterility in watermelon [48].
In addition to the genes with known function, a homolog of serendip2, (DN62179_c1_g1) specific to the male gender in Silene latifolia [49] was also inhibited in S. vulgaris females. This gene has not been found in any other plant species.
The downregulation of the genes involved in lipid and carbohydrate metabolism may harm the biosynthesis of the pollen cell wall and its outer layer exine, produced by the tapetum [50,51]. We observed the inhibition of several genes related to fatty acid synthesis, wax and cutin formation in S. vulgaris females (two transcripts coding for pectinesterases (DN44106_c0_g1, DN69711_c0_g1), the transcript for UDP-arabinopyranose mutase (DN59301_c0_g4) involved in cell wall modifications [52]). Genes with similar functions were also inhibited in male sterile individuals e.g., in radish and other crops [14][15][16][17][18][19]53]. In contrast, the transcripts coding for catabolic enzymes lipases and esterase (DN57332_c1_g2, DN61032_c0_g1), or caleosin (DN54044_c1_g1), functioning in degradation of storage lipids, were upregulated in females (Table 4).
Besides the roles played in cell wall biosynthesis and modifications, carbohydrate metabolism supplies the energy necessary in the course of pollen development. The transcripts coding for the enzymes hydrolyzing polysaccharides such as alpha-amylase (DN35498_c0_g1) ( Table 4) or beta-amylase1 (DN67781_c0_g1) ( Table S2) were upregulated in S. vulgaris females.

Pollen Abortion Is Associated with Oxidative Stress
Many oxidative stress-induced genes (e.g., encoding ROS scavenging enzymes such as peroxidases) were strongly upregulated in females of S. vulgaris, which suggests the impact by oxidative stress. The example is the transcript DN56053_c0_g3 coding for peroxidase (PER). The RT qPCR estimation of the SvPER transcript levels confirmed its expression in flower buds of both genders, but not in leaves (Figure 4e). This finding may be explained by the elevated production of ROS with subsequent induction of peroxidases in the course of the development of floral organs, but not during the leaf development. ROS in non-green tissues are primarily produced due to excessive electrons generated in mitochondria [45,54]. The increased ROS production in flower buds can be associated with the elevated mitochondrial respiration needed to satisfy the high energy demands of pollen and ovule development.
The impairment of mitochondrial respiration in female plants caused by the activity of mitochondrial CMS genes may lead to accelerated ROS accumulation resulting in pollen abortion [1,2]. The higher ROS concentrations in male sterile than in fertile anthers were reported e.g., by [55] in their study of CMS-D8 in cotton. Similar results were obtained by [56], who described enhanced ROS production interfering with pollen development in CMS-WA in rice.
The signaling function of some ROS ensuring the proper timing of programmed cell death in the course of standard pollen development should also be considered [57]. Transcripts encoding cell death-related proteins were upregulated in females, for example the gene for metacaspase 9 (MC9), a protease involved in cell death regulation [58] (DN35718_c0_g2). SvMC9 upregulation in F plants was confirmed by RT qPCR (Figure 4).
genders, but not in leaves (Figure 4e). This finding may be explained by the elevated production of ROS with subsequent induction of peroxidases in the course of the development of floral organs, but not during the leaf development. ROS in non-green tissues are primarily produced due to excessive electrons generated in mitochondria [45,54]. The increased ROS production in flower buds can be associated with the elevated mitochondrial respiration needed to satisfy the high energy demands of pollen and ovule development.  The response to the oxidative stress is tightly associated with phytohormones. ABA and ethylene among the phytohormones are involved in plants stress responses. The genes responsible for ethylene production or signaling (the transcript DN48287_c2_g1 encoding 1-aminocyclopropane-1-carboxylate oxidase (ACO4), the rate-limiting enzyme in ethylene biosynthesis; the transcript DN62269_c0_g1 coding for ETHYLENE-RESPONSIVE TRANSCRIPTION FACTOR 2 (ERF2), inducible by abiotic stresses), or ABA signaling (the transcript DN48556_c0_g2 encoding C2-DOMAIN ABA-RELATED 4 (CAR4) facilitating ABA signaling [59]) were upregulated in female flower buds of S. vulgaris, most likely induced by oxidative stress. RT qPCR confirmed their increased expression in F flower buds, whereas no differences between the genders were measured in leaves (Figure 4a,c,d). Elevated concentrations of ABA in male sterile flowers of Brassica napus were previously reported by [60].
We also observed the high upregulation of several transcripts encoding germin-like proteins (GLPs) in female flower buds. These proteins are induced by a plethora of biotic and abiotic stresses [61], but their role in CMS or pollen abortion has not been studied yet. The SvGLP gene exhibited very low or zero expression in leaves, as detected by RT qPCR (Figure 4f), which may be associated with lower ROS concentrations in leaves than in flowers.

AOX1 Gene Was Downregulated in Female Flower Buds in S. vulgaris
Alternative oxidase 1 (AOX1) is the enzyme involved in the alternative route of transferring electrons to oxygen, which bypasses cytochrome c oxidases and ATP production. AOX1 prevents the accumulation of ROS [62] and is induced by blocking mitochondrial respiration or by environmental conditions [63,64]. AOX1 is considered to be a marker of retrograde signaling directed from mitochondria to the nucleus [65].
In contrast with most stress-related genes, the transcript DN56446_c0_g1 encoding AOX1 was higher in H than in F flower buds. The lowest AOX1 expression was detected by RT qPCR in the leaves of both genders (Figure 4b). The peroxidase gene SvPER or stress-response gene encoding germin-like protein SvGLP also exhibited very low or zero expression in leaves, which may be associated with lower ROS concentrations in leaves than in flowers. However, unlike SvPER or SvGLP genes, AOX1 transcript levels were lower in F flower buds compared with H flower buds. This result contradicts the study of CMS of [66] in cotton, that found higher level of AOX1 in F flowers. The authors of the recent study [67] imported RNA associated with CMS to mitochondria of A. thaliana and described increased or decreased levels of AOX1 depending on the time since RNA import, developmental stage, or light conditions. These reports suggest that the control of AOX1 expression during CMS is complex. It may depend on the developmental stage of flower, the CMS type, or the species. The decrease of AOX1 expression in F flower buds in two distinct CMS types in S. vulgaris may trigger ROS accumulation, which launches untimely degradation of the tapetum leading finally to pollen abortion and male sterility. As AOX1 participates in retrograde signaling, our results indicate a prominent role of this process in CMS, in line with [68,69].

Differences between Nuclear-Encoded Transcriptomes of F and H Flower Buds of S. vulgaris Contrast with the Uniformity of Their Organellar Transcriptomes
Transcripts coding for proteins targeted to plastids or mitochondria were frequently present among DE genes between F and H flower buds. They included several transcripts involved in lipid metabolism targeted to plastids, as well as the transcripts responsible for plastoquinone synthesis (DN50701_c1_g1, DN64967_c1_g1). DE transcripts involved e.g., in amino acid biosynthesis (DN70324_c1_g1), or oxidation-reduction processes (DN64912_c1_g4) were targeted to mitochondria. The oxidation resistance protein 1 (DN65942_c2_g2), a mitochondrial protein induced by oxidative stress [70], was downregulated in F plants, similarly to AOX1.
We analyzed the cytoplasmic portions of the RNA-seq data sets, from which organellar transcriptomes were mapped previously. We might have therefore directly compared the transcriptomes from three compartments-mitochondrial, plastid, and cytoplasmic-constructed from the same RNA samples. The mitochondrial transcriptomes exhibited only very few differences in coverage or editing between F and H [12,24], the plastid transcriptomes were completely identical between F and H [25]. Our observations contradict some studies, e.g., [19], which described numerous mitochondrial DE genes between male sterile and male fertile individuals. This discrepancy may be sometimes explained Cells 2020, 9, 2700 20 of 24 by experimental artifacts. Liu et al. [19] used polyA enrichment for the construction of cDNA library, but this method cannot detect organellar transcripts that lack poly A [41] and may lead to the underestimation of gene coverage.
Considering thousands of nuclear-encoded genes DE between F and H flower buds of S. vulgaris, including many genes coding for plastid and mitochondria targeted proteins, the uniformity of organellar transcriptomes of S. vulgaris is surprising. Niazi et al. [67] described the stability of the mitochondrial transcriptome of A. thaliana, which was capable of buffering the changes associated with knocking down the selected mitochondrial genes. Accordingly, our results support the existence of mechanisms balancing the steady transcription profiles of organellar genes of S. vulgaris even under conditions leading to pollen abortion.

Conclusions
We revealed numerous nuclear-encoded DE genes between F and H flower buds of the gynodioecious species S. vulgaris. Our findings contrast with near-zero differences in gene expression in mitochondria or plastids of S. vulgaris, when estimated in the same RNA samples, suggesting the existence of a mechanism shielding organellar transcription against the changes in the cytoplasmic transcriptome.
We observed a general similarity between the genetic pathways controlling CMS in crops and the wild gynodioecious species S. vulgaris. However, we also discovered some DE genes, not reported or discussed in previous CMS studies, e.g., serendip or the genes coding for germin-like proteins. The AOX1 transcripts were reduced in F flower buds, suggesting that AOX1 downregulation may contribute to ROS accumulation, which ultimately results in pollen abortion.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4409/9/12/2700/s1, Figure S1: The comparison of the assemblies made of various minimum length of trimmed reads of the KRA dataset; Figure S2: The comparison of the assemblies made of various minimum length of trimmed reads of the KOV dataset; Figure S3: The enriched GO categories downregulated in female flower buds of Silene vulgaris; Figure S4: The enriched GO categories upregulated in female flower buds of Silene vulgaris; Table S1: Primer sequences and PCR conditions used in this study; Table S2: GO categories downregulated and upregulated in female flower buds of Silene vulgaris; Table S3: The genes downregulated and upregulated in female flower buds of Silene vulgaris, identified by at least one of four bioinformatic pipelines. Funding: This project was funded by the grant of the Grant Agency of the Czech Republic 16-09220S to HŠ. Additional support was provided by European Regional Development Fund-Project "Centre for Experimental Plant Biology" (no. CZ.02.1.01/0.0/0.0/16_019/0000738). Funders provided financial support only, they had no role in the design of the study, analysis and interpretation of data, decision to publish, or the preparation of the manuscript.