The Alternative Splicing Landscape of Brassica napus Infected with Leptosphaeria maculans

Alternative splicing (AS) is a post-transcriptional regulatory process that enhances transcriptome diversity, thereby affecting plant growth, development, and stress responses. To identify the new transcripts and changes in the isoform-level AS landscape of rapeseed (Brassica napus) infected with the fungal pathogen Leptosphaeria maculans, we compared eight RNA-seq libraries prepared from mock-inoculated and inoculated B. napus cotyledons and stems. The AS events that occurred in stems were almost the same as those in cotyledons, with intron retention representing the most common AS pattern. We identified 1892 differentially spliced genes between inoculated and uninoculated plants. We performed a weighted gene co-expression network analysis (WGCNA) to identify eight co-expression modules and their Hub genes, which are the genes most connected with other genes within each module. There are nine Hub genes, encoding nine transcription factors, which represent key regulators of each module, including members of the NAC, WRKY, TRAF, AP2/ERF-ERF, C2H2, C2C2-GATA, HMG, bHLH, and C2C2-CO-like families. Finally, 52 and 117 alternatively spliced genes in cotyledons and stems were also differentially expressed between mock-infected and infected materials, such as HMG and C2C2-Dof; which have dual regulatory mechanisms in response to L. maculans. The splicing of the candidate genes identified in this study could be exploited to improve resistance to L. maculans.


Introduction
Alternative splicing (AS) was first discovered by Gilbert in 1978 [1]. During this post-transcriptional regulatory process, precursor-mRNA (pre-mRNA) produces differentially spliced RNA transcripts that may be translated into diverse protein isoforms [2]. AS events are quite common in many organisms, occurring in 92-94% of intron-containing genes in human (Homo sapiens) [3], 60% in Arabidopsis thaliana [4], 52% in soybean (Glycine max) [5], 40% in Gossypium raimondii [6], 40% in maize (Zea mays) [7], and 33% in rice (Oryza sativa) [8]. The various types of AS events include the formation of skipped exons, retained introns, alternative 5 splice sites, alternative 3 splice sites, mutually exclusive 3 UTRs, tandem UTRs, mutually exclusive 5 exons, and mutually exclusive exons, all of which increase the complexity of the transcriptome and the diversity of the proteome. AS is prevalent in eukaryotic organisms [9], but the types of AS events vary among species. For example, skipped exons are the most common type of AS event, and retained introns are the least common in animals and

Weighted Gene Co-expression Network Analysis of Overlapping Differentially Spliced Genes
To investigate the functions of overlapping differentially spliced genes, WGCNA [40] in R version 3.4.4 was performed, including sample clustering, outlier detection, soft threshold filtering, one-step network construction, module identification, and module relationship analysis. From the above analysis, we gained the weighted values, which represent the relationships between genes in pairs. The weighted values, which are >0.15, were used to perform network analysis and gain the genes "degree (i.e., the number of genes linked to the gene)" using Cytoscape_v3.5.1 [41]. Here, within each module, we selected 30 genes whose degree is the largest as Hub genes. In addition, according to the Cufflinks, we calculated the FPKM (fragments per kilobase of transcript sequence per million base pairs sequenced) including 2 biological replicates. Among the differentially spliced genes, we performed an independent-sample t-test based on the FPKM values of the inoculated and mock samples using the SPSS Statistics 22 [42] to identify the differentially expressed genes. Differences between independent samples with a p-value <0.05 were considered to be significant. We used the average FPKM values of two biological replicates to draw the isoform expression heatmap of the above differentially expressed genes using R version 3.4.4 program.

Functional Enrichment and Clustering
To identify enriched functional terms for the modules identified by WGCNA, the R package topGO was used to perform Gene Ontology (GO) analysis [43]. The enriched KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways of genes in each module were identified using the Omicshare platform (www.omicshare.com/tools).

Putative Transcript Assembly
To identify the transcripts and expression patterns of every isoform, we performed the processes described below to obtain the data required for AS analysis. Using the B. napus reference genome (http://www.genoscope.cns.fr/brassicanapus/data/), the raw data were quality trimmed, followed by transcript assembly and transcript merging using the RNA-seq data analysis pipeline. We identified 8641, 9614, 8829, 9071, 10,193, 10,035, 11,086, and 5698 new transcripts with the class code "u" in these samples: two replicates of non-inoculated cotyledons, inoculated cotyledons, non-inoculated stems, and inoculated stems (Table S1). The genes belonging to class "u" are the new transcripts that do not exist at the reference genome according to the Cufflinks. We also obtained the FPKM values of the genes and isoforms. The average FPKM values of the eight materials examined was 15.93, 15.35, 14.63, 14.68, 13.51, 13.86, 12.05, and 13.66, respectively (Table S2), whereas the average FPKM values of the new transcripts were 6.05, 5.09, 5.39, 5.43, 7.29, 6.98, 6.65, and 7.43, respectively. The gene expression levels were higher in cotyledons than in stems and higher in the mock-inoculated samples than in those inoculated with L. maculans.
We detected 1892 overlapping AS genes in the inoculated materials and identified their homologous genes in A. thaliana (Table S4) [44]. There are 1817 genes exhibiting homology in A. thaliana. Again, IR was the most common AS pattern for these genes, followed by other, AA, AD, and ES events. Therefore, AS patterns are highly conserved among genes from different samples.

WGCNA of Overlapping Differentially Spliced Genes
We used WGCNA to further explore the functions of the 1892 overlapping genes, as such an analysis could be used to uncover candidate genes that function in the pathogen response. This technique uses the topological overlap measure (TOM) to cluster similarly expressed genes into discrete modules based on pairwise correlations between genes [45]. Using a soft threshold of 9 ( Figure 2), we detected eight modules of genes with highly similar expression patterns. Using a weight value of >0.15, we performed network analysis with Cytoscape to display the relationships of each gene in a single module (Table S5). Figure 3A-H shows the red, black, blue, brown, yellow, turquoise, green, and pink modules, containing 220, 196, 263, 260, 257, 274, 236, and 181 genes, respectively. The most and least highly connected genes are shown in purple and yellow, respectively.  To explore the functions of the genes in the eight modules, we performed GO and KEGG analysis of the inoculated materials (Tables S6 and S7, Figures S1 and S2) to identify genes related to metabolism. The significant KEGG pathways (p < 0.05) and major GO terms (p < 0.01) of genes in the blue module include homologous recombination, citrate cycle (TCA cycle), fructose metabolism, cellular response to gamma and ionizing radiation, leaf formation, adventitious root development, and organic acid biosynthetic process. The pathways/terms of genes in the brown module include cyanoamino acid metabolism, nitrogen compound transport, intracellular protein transport, and cellular protein localization. For the green module, these pathways/terms include RNA degradation and lipoic acid metabolism, protein autoubiquitination, telomeric loop formation, modulation by immune response organism, and modulation by symbiont of host immune response ( Figure 4 and Table 1). For the pink module, these pathways/terms include insulin resistance, amino sugar and nucleotide sugar metabolism, regulation of auxin polar transport, glycerolipid metabolic process, phosphatidylinositol dephosphorylation, and response to osmotic stress. Therefore, the genes in these four modules mainly function in response to stress, whereas the functionally enriched categories of genes in the other modules are related to carbon metabolism, photosynthesis, and plant hormones.

The Hub Genes and TFs in the Modules Identified by WGCNA
Using the highest degree value, 120 Hub genes were identified for the eight modules, encoding proteins such as protein kinase, chitinase, reductase, and oxidase (Table S8). The Hub genes of each module most highly correlated with others in the network represent key factors that function in defense against pathogen attack. The overlapping Hub genes among the four defense-related modules include genes in the DUF family and genes in the protein kinase superfamily. The DUF family includes BnaCnng76620D, BnaA09g51150D, BnaAnng22520D, and BnaC09g38570D. These four genes, which have not previously been reported in B. napus, might play regulatory roles in response to pathogens.

Identification of Differentially Expressed Genes in Mock-Infected and Infected Samples among Differentially Spliced Genes
The regulatory relationship of gene expression and AS is unknown. Current research indicates that the overlap between differentially expressed genes and differentially alternatively spliced genes is small [46], which makes these overlapping genes particularly worthy of further research. Among the 1892 differentially alternatively spliced genes, we identified significant differentially expressed genes between mock-infected and infected materials. There were 52 and 117 such genes in cotyledons and stems, respectively. To explore their role in defending the pathogen, we constructed a heatmap of the isoforms (134 in cotyledons and 303 in stems) of these differentially expressed genes based on expression patterns ( Figure 5A,B). As shown in the heatmap, the different isoforms of a single gene almost always had the same expression pattern; however, there was always a predominant isoform.
We identified four overlapping differentially expressed genes (BnaA03g18030D, BnaA03g54830D, BnaCnng49050D, and BnaCnng52760D) in mock-infected versus infected materials in both cotyledons and stems, suggesting that these four genes play important roles in the pathogen response. Based on the results of WGCNA, these four genes belong to the blue, green, black, and green modules. The isoform 5686.1 of BnaA03g18030D (C2C2-Dof ), expressed at a significantly higher level than isoform 5686.2, has the same expression pattern in cotyledons and stems. Among the isoforms of BnaCnng49050D (HMG), 65,273.1 and 65,273.2 were expressed at high levels in stems and at low levels in cotyledons, suggesting that these isoforms function differently in different tissues. In addition, the isoforms of BnaA03g54830D and BnaCnng52760D share similar expression patterns. There were two TF genes among the four overlapping differentially expressed genes: BnaA03g54830D (C2C2-Dof ) and BnaCnng49050D (HMG).

Discussion
The defense response of B. napus to L. maculans can be induced by pathogen elicitors, which are involved in processes such as plant cell wall degradation [47] and toxin biosynthesis [48]. Based on many studies of the defense response of B. napus to L. maculans, at least 18 major R genes have been identified, and adult plant resistance mediated by the quantitative effects of R genes has been described [49]. Here, we explored new transcripts and changes in the isoform-level AS landscape of B. napus infected with L. maculans.
Among the 1892 differentially spliced genes identified in this study, we identified their homologous genes in A. thaliana to gain their AS events in Riken database (http://rarge.gsc.riken.jp/a_splicing/index. pl) [50]. There are 133 A. thaliana genes producing AS (Table S4), which can validate our study to some extent.

Enriched Pathways and Hub Genes Identified by WGCNA
WGCNA is a method used to classify genes with similar expression patterns. In this study, based on the 1892 differentially spliced genes between mock-inoculated and inoculated materials, we performed WGCNA to identify eight modules of genes and 30 Hub genes that connect all the genes together. In addition, we performed KEGG and GO analyses to predict the functions of genes in each module. In the four modules with genes related to defense to stress, the clustered pathways mainly included fructose metabolism, amino sugar metabolic process, citrate cycle, nitrogen compound transport, glycerolipid metabolic process, and others, which corresponds with previously reported results. The Hub genes identified in this study might play important roles in plant defense, such as genes in the categories DUF, protein kinase, chitinase, reductase, oxidase, and metal transport protein.
Chitin, a component of the fungal cell wall, is a pathogen elicitor that induces B. napus to produce nitric oxide and hydrogen peroxide in epidermal cells [51]. The identification of a Hub gene in the chitinase category suggests that plants might degrade pathogen elicitors as a defense response. Furthermore, conserved metabolic pathways such as the glyoxylate cycle, amino acid biosynthesis, and glycerolipid metabolism are essential for pathogenic processes [52]. Lipids play important roles in primary metabolism and represent the main storage form carbohydrates in fungal spores [53]. In addition, L. maculans produces a wide range of cell-wall-degrading enzymes [54]. In this study, we identified Hub genes related to the metabolism of fructose, a basic component of the cell wall, suggesting that the invasion of L. maculans might lead to the degradation of the cell wall, which may stimulate plants to synthesize new cell wall or alter the cell wall of healthy plant tissues to prevent further pathogen penetration.

The Roles of TFs in the Defense Response to Pathogens among the AS Genes
TFs play crucial roles in plant growth, development, and responses to stress [59], including biotic and abiotic stress. For example, the TF C2C2-GATA, which contains a highly conserved zinc finger DNA-binding domain, is a key regulator of the response of sweet orange (Citrus sinensis) to Xanthomonas campestris pv. Vesicatoria infection [60]. TFs not only initiate or repress gene transcription, they also undergo constitutive and inducible AS. For example, the different isoforms of OsWRKY62 and OsWRKY76 play different roles and perform AS-mediated feedback regulation in the defense response to the blast fungus and the leaf blight bacterium [26]. In this study, we focused on TF genes among the differentially spliced genes, including differentially expressed TF genes and the Hub TFs identified by WGCNA.
We identified 9 additional TF genes identified as Hub genes by WGCNA, which might function by regulating the lower hierarchical genes to defend the plant against pathogen attack. We focused on the modules related to pathogen defense, especially the Hub TFs BnaA06g34140D, BnaC07g09430D, BnaC07g02890D, BnaC03g14960D, and BnaA09g18800D. Their homologs in A. thaliana are AT2G02450 (AtNAC035), AT1G29860 (AtWRKY71), AT5G45110 (AtNPR3), AT5G53290 (AtCRF3), and AT2G01940 (AtSGR5), respectively. The overexpression of AtWRKY71 can affect the defense response to Pseudomonas syringae [62], and AtNPR3 functions as an activator to stimulate disease resistance in plants [63]. The Hub genes and expression modules identified in this study provide insight into an agronomically important plant-pathogen interaction and could be modulated through AS to improve resistance of B. napus to L. maculans.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/10/4/296/s1. Table S1: FPKM values of the new transcripts of non-inoculated and inoculated cotyledons and stems, with two replicates. Table S2: FPKM values of genes and isoforms in non-inoculated and inoculated cotyledons and stems, with two replicates.  Figure S1: The enriched GO biological process categories of the eight modules. Figure S2: Enriched KEGG pathways categories of the