Transcriptome Profiling Identifies Candidate Genes Contributing to Male and Female Gamete Development in Synthetic Brassica Allohexaploids

Polyploidy plays a crucial role in plant evolution and speciation. The development of male and female gametes is essential to the reproductive capacity of polyploids, but their gene expression pattern has not been fully explored in newly established polyploids. The present study aimed to reveal a detailed atlas of gene expression for gamete development in newly synthetic Brassica allohexaploids that are not naturally existing species. Comparative transcriptome profiling between developing anthers (staged from meiosis to mature pollen) and ovules (staged from meiosis to mature embryo sac) was performed using RNA-Seq analysis. A total of 8676, 9775 and 4553 upregulated differentially expressed genes (DEGs) were identified for the development of both gametes, for male-only, and for female-only gamete development, respectively, in the synthetic Brassica allohexaploids. By combining gene ontology (GO) biological process analysis and data from the published literature, we identified 37 candidate genes for DNA double-strand break formation, synapsis and the crossover of homologous recombination during male and female meiosis and 51 candidate genes for tapetum development, sporopollenin biosynthesis and pollen wall development in male gamete development. Furthermore, 23 candidate genes for mitotic progression, nuclear positioning and cell specification and development were enriched in female gamete development. This study lays a good foundation for revealing the molecular regulation of genes related to male and female gamete development in Brassica allohexaploids and provides more resourceful genetic information on the reproductive biology of Brassica polyploid breeding.


Introduction
Polyploidy has long been recognized as an important evolutionary force in plants [1][2][3]. The formation of polyploids is attributed to genomic plasticity, and polyploid-induced changes can result in new genetic diversity and advantageous adaptations to the environment [4,5]. Polyploids generally have multiple phenotypes and greater growth vigor compared with their parents [6]. Polyploidy also leads to an increase in plant organs to improve outputs and adapt to various biological and abiotic stresses in plant breeding [7,8]. Allopolyploidy results concomitantly from the genome double after the hybridization of two or more species [9]. Brassica has long been considered as a model to explore polyploidization. Three basic diploid species in Brassica, Brassica rapa (AA, 2n = 20), Brassica nigra (BB, 2n = 16) and Brassica oleracea (CC, 2n = 18), have hybridized to give rise to three allotetraploid species, Brassica juncea (AABB, 2n = 36), Brassica napus (AACC, 2n = 38) and Brassica carinata (BBCC, 2n = 34). The combination of genetic variation from six species in Brassica could result in crops with increased adaptation and agronomic potential as well as improved heterosis from the contribution of alleles [10][11][12]. Brassica allohexaploids (2n = AABBCC) do not exist in nature but can be synthesized by hybridization between diploid and/or allotetraploid species of Brassica. The cross between B. carinata and B. rapa is the most commonly attempted and successful method in the five possible species combinations that can produce Brassica allohexaploids [13]. Hybridization and genomic doubling may lead to extensive transcriptomic changes in the synthesized trigenomic Brassica allohexaploids relative to their parents [14]. Brassica allohexaploids not only provide an effective way to improve the genetic diversity of Brassica and the intergenic hybridization of oilseed crops in the future, but they also offer excellent material for the study of polyploid plants.
So far, studies on Brassica allohexaploids have mainly focused on agronomic traits, meiotic behaviors and subgenome stabilities [15,16]. The development of male and female gametes is crucial to the reproductive capacity of sexually reproducing organisms, especially polyploids, and is strictly regulated by complex processes. However, the genes involved in male and female gamete development are not fully understood in Brassica allohexaploids. Therefore, it is very important to establish a reproductive transcriptome profiling of Brassica allohexaploids.
The male germline matures within the anther, whereas the female germline develops within the ovule [17]. The male and female gametophyte are the important reproductive units of angiosperms and are essential for sexual reproduction [18,19]. Both male and female germline development of angiosperms consist of two main stages: microsporogenesis and microgametogenesis, giving rise to male gametes, and megasporogenesis and megagametogenesis, leading to the formation of female gametes. Microsporogenesis and megasporogenesis are key to male and female reproduction and require the completion of meiosis to form microspores and functional megaspores (FM). Microgametogenesis includes an asymmetric division to form vegetative and generative cells, and the generative cells produce two male gametes after a mitotic division [20]. FM undergoes three rounds of mitosis, nuclear migration, cellularization and differentiation to form a mature seven-celled embryo sac, containing three antipodal cells, two synergid cells, one egg cell and one central cell that contains two polar nuclei, which complete megagametogenesis [19,21]. During gamete generation, male and female gametes have similar ploidy transition and cell cycle progressions, while they show many differences in gamete development and specialization.
RNA-Seq technology is used to analyze the structure and function of genes at the organismal level and to explore a range of biological pathways [22,23]. The establishment of Brassica transcriptome databases, such as the B. napus transcriptome database BnTIR, provide a lot of useful resources for the study of Brassica polyploidy at the transcriptional level [24]. In recent years, RNA-Seq has been successfully used in anther and ovule development in many species. In Brassica napus L., lipid metabolism genes involved in pollen extine formation, elaioplast and tapetosome biosynthesis were preferentially expressed in early anthers, and carbohydrate metabolism genes to form pollen intine and to accumulate starch in mature pollen grains were preferentially expressed in late anthers [25]. Comparative analysis of differential gene expression revealed multiple signal pathways during flowering of autotetraploid B. rapa [26]. The molecular processes involved in the development of female gametes in plants are much less understood than those involved in the development of male gametes due to the small number and difficult availability of female gametophytes. Using high-throughput sequencing analysis, female gamete development in Arabidopsis has been intensively studied, and a number of key genes regulating megasporogenesis and megagametogenesis have been found, offering fundamental knowledge of these developmental processes [21,27]. The aim of the study was to reveal a detailed atlas of gene expression for gamete development in synthetic Brassica allohexaploids. We performed differential expression analysis to identify common and preferential genes that regulate the developmental events of male and female gametes. This study provides rich genetic resources for the cloning and functional verification of genes related to male and female gamete development and lays a good foundation for revealing the molecular regulatory mechanism of reproductive development in Brassica allohexaploids.

Transcriptome Sequencing and Sequence Alignment
RNA-Seq of anthers from meiosis to the mature pollen stage (Anther), ovules from meiosis to mature embryo sac stage (Ovule) and young leaves (Leaf) was performed by Illumina in synthetic Brassica allohexaploids. Leaf was used as vegetative tissue (organ) control, and each Brassica allohexaploid tissue (organ) received three biological replicates. RNA-Seq results are presented in Tables 1 and 2. The number of clean reads from the nine RNA-Seq libraries ranged from 40,202,856 to 47,450,126 ( Table 1). The Clean Q30 Bases Rate was greater than 94.29%, and the higher the value is, the better the sequencing quality (Table 1). All clean reads were then aligned to B. rapa, B. nigra and B. oleracea genome sequences using HISAT2 software. Mapped genome reads ranged from 21,502,559 to 39,794,184, and genome mapping rates ranged from 53.49% to 85.19% (Table 2). These results suggested that their quality met the requirements for transcriptome sequencing. Fragments per kilobase per million reads (FPKM) were determined for all genes of Anther, Ovule and Leaf in Brassica allohexaploids (Table S1). Correlation heat maps of the transcriptome from Anther, Ovule and Leaf showed that the three biological replicates of each tissue were grouped together with high correlation ( Figure S1). These results indicated that the three biological replicates of per tissue had good repeatability in this study.

Anthers at Male Gamete Development Stage and Ovules at Female Gamete Development Stage Contained More Genes Than Leaves in Brassica Allohexaploids
A total of 100,804 genes were identified in the RNA-Seq data of Brassica allohexaploids, among which 96,286, 86,712 and 79,875 genes were expressed in Anther, Ovule and Leaf, respectively, providing sufficient data for studying male and female gamete development (Figure 1a). Among the 100,804 genes, 74,597 genes were expressed in all three tissues, and 9948, 1870 and 1514 genes were specifically expressed in Anther, Ovule and Leaf, respectively ( Figure 1a). Comparative analysis showed that Anther and Ovule overlapped most, with 9111 genes commonly expressed in these two tissues; 2630 genes were commonly expressed in Anther and Leaf; and 1134 genes were commonly expressed in Ovule and Leaf (Figure 1a). Anther and Ovule contained more genes and more specific genes than vegetative tissue Leaf. In all three tissues, transcript abundance analysis indicated 74.1% to 78.6% genes with low expression (FPKM >0 to ≤10) and 19.6% to 24.0% genes with moderate expression (FPKM >10 to ≤100) but only 1.4% to 1.9% genes with high expression (FPKM > 100) (Figure 1b).

Anthers at Male Gamete Development Stage and Ovules at Female Gamete Development Stage Contained More Genes Than Leaves in Brassica Allohexaploids
A total of 100,804 genes were identified in the RNA-Seq data of Brassica allohexaploids, among which 96,286, 86,712 and 79,875 genes were expressed in Anther, Ovule and Leaf, respectively, providing sufficient data for studying male and female gamete development ( Figure 1a). Among the 100,804 genes, 74,597 genes were expressed in all three tissues, and 9948, 1870 and 1514 genes were specifically expressed in Anther, Ovule and Leaf, respectively (Figure 1a). Comparative analysis showed that Anther and Ovule overlapped most, with 9111 genes commonly expressed in these two tissues; 2630 genes were commonly expressed in Anther and Leaf; and 1134 genes were commonly expressed in Ovule and Leaf (Figure 1a). Anther and Ovule contained more genes and more specific genes than vegetative tissue Leaf. In all three tissues, transcript abundance analysis indicated 74.1% to 78.6% genes with low expression (FPKM >0 to ≤10) and 19.6% to 24.0% genes with moderate expression (FPKM >10 to ≤100) but only 1.4% to 1.9% genes with high expression (FPKM > 100) (Figure 1b).  In Anther, Ovule and Leaf, 33,837, 30,501 and 28,078 genes, respectively, were expressed from A-genome; 35,126, 31,413 and 28,842 genes, respectively, were expressed from B-genome; and 27,323, 24,798 and 22,955 genes, respectively, were expressed from C-genome ( Figure 1c). The reference genomes of B. rapa, B. nigra and B. oleracea contained 41,020, 47,953 and 61,279 genes, respectively, thus the percentage of genes expressed in Anther, Ovule and Leaf from A-genome was 82.5%, 74.4% and 68.4%, respectively, while from B-genome the values were 73.3%, 65.5% and 60.1%, respectively, and from C-genome 44.6%, 40.5% and 37.5% (Figure 1c), respectively. These results indicated that the B-genome contained the largest number of genes in Anther, Ovule and Leaf in Brassica allohexaploids, while the A-genome was bias expressed. Three groups of differentially expressed genes (DEGs) were generated by pairing differential expression analysis of genes between all three tissues (Table S2). Volcano plots show DEGs of Anther vs. Ovule, Ovule vs. Leaf and Anther vs. Leaf ( Figure S2). Among all comparisons, the number of DEGs in Anther and Ovule was the smallest, and 22,305 genes (14,366 upregulated and 7939 downregulated) were differentially expressed in Anther compared with Ovule ( Figure 1d). Compared with Leaf, 30,893 genes (14,830 upregulated and 16,063 downregulated) were differentially expressed in Ovule ( Figure 1d). However, 35,849 genes (19,513 upregulated and 16,336 downregulated) were differentially expressed in Anther compared with Leaf, which was the largest number of DEGs ( Figure 1d). The findings showed that there were differences in gene expression between reproductive tissue and vegetative tissue and male gamete development had more genes expressed than female gamete development in Brassica allohexaploids.

Identification of Preferentially Expressed Genes in Male Gamete Development of Brassica Allohexaploids
To better understand the development of male gametes in Brassica allohexaploids, we analyzed the preferentially expressed genes in male gamete development. Venn diagrams showed that the overlap of 14,366 upregulated genes in Anther vs. Ovule and 19,513 upregulated genes in Anther vs. Leaf was 9775 preferentially expressed genes in male gamete development (Figure 4a). We used GO annotation to functionally classify these 9775 genes based on their biological process. We found that GO terms associated with the activation of protein kinase activity (GO: 0032147), protein ubiquitination (GO: 0016567), pollen wall assembly (GO: 0010208), anther wall tapetum development (GO: 0048658), pollen exine formation (GO: 0010584), sporopollenin biosynthetic process (GO: 0080110), pollen development (GO: 0009555), callose deposition in cell walls (GO: 0052543), anther development (GO: 0048653) and pollen sperm cell differentiation (GO: 0048235) were significantly overrepresented within the preferentially expressed genes in male gamete development (Figure 4b). These findings showed that preferentially expressed genes in male gamete development were concentrated in pollen wall formation and other related pathways in Brassica allohexaploids.

Identification of Preferentially Expressed Genes in Female Gamete Development of Brassica Allohexaploids
We next analyzed the preferentially expressed genes involved in female gamete development to explore their key events in Brassica allohexaploids. 4553 genes preferentially expressed in female gamete development were identified by overlapping 7939 and 14,830 upregulated genes of Ovule vs. Anther and Ovule vs. Leaf (Figure 6a). These genes were analyzed by GO biological process pathway analysis. These GO terms related to cell differentiation (

Identification of Preferentially Expressed Genes in Female Gamete Development of Brassica Allohexaploids
We next analyzed the preferentially expressed genes involved in female gamete development to explore their key events in Brassica allohexaploids. 4553 genes preferentially expressed in female gamete development were identified by overlapping 7939 and 14,830 upregulated genes of Ovule vs. Anther and Ovule vs. Leaf (Figure 6a). These genes were analyzed by GO biological process pathway analysis. These GO terms related to cell differentiation ( (Figure 6b). These results suggested that genes preferentially expressed in female gamete development may be related to cell division, nuclear positioning, cell differentiation and auxin-activated synthesis and signal transduction of Brassica allohexaploids.

Meiosis-Related Genes May Affect Homologous Recombination during Male and Female Gamete Development in Brassica Allohexaploids
The key events in both the male and female gamete development of Brassica allohexaploids mainly focused on meiosis-related genes and pathways. Polyploidy species may be used more as model organisms for meiosis in the future, and Brassica has attracted attention as model of allopolyploid meiotic regulation mechanisms [64]. Meiosis leads to the production of genetically unique haploid spores, which contribute to genome stability and genetic diversity. Crossover of homologous recombination ensures faithful segregation of homologous chromosomes in meiosis I [65]. Meiotic CO produces new combinations of alleles, increasing the genetic diversity of gametes [66]. Meiosis forms DNA DSB through programming, processes and repairs DSB by homologous recombination. In all eukaryotes, SPO11-1 is a strict requirement for meiosis DSB formation. In Arabidopsis and Brassica, meiotic DSB formation is required for synapsis and SC formation. Crossover or non-crossover recombination products can be isolated and genetically tested by recombination intermediates formed between homologous chromosomes. Consequently, we hypothesized that meiosis-related genes may affect homologous recombination during male and female gamete development in Brassica allohexaploids.

TDF1, AMS and MS188 May Influence Tapetum Development and Pollen Wall Formation in Brassica Allohexaploids
Male gamete development affects the effective pollination control system, which is the premise of utilizing heterosis of Brassica allohexaploids. The development of male gametes is accompanied by tapetal development, sporopollenin synthesis and pollen wall formation. Tapetum is the innermost layer of the four anther somatic cell layers, and provides material for pollen development. The main component of the exine is sporopollenin, which is produced by the tapetum. The tapetal development regulatory network during male gamete development has been studied in Arabidopsis, in which TDF1 may be involved in redox and cell degradation [67][68][69]. AMS is thought to control lipid transfer proteins in pollen wall building, repressing of upstream regulators and promoting of AMS protein degradation [44,67]. Most cell wall-related genes are regulated by transcription factor MS188, which is involved in both tapetum cell wall degradation and pollen wall synthesis [67]. TDF1 controls AMS directly through an AACCT cis-element, and TDF1 and AMS control downstream genes in a feed-forward loop [68]. In addition, MS188 can activate the expression of CYP704B1, ACOS5 and TKPR1, and form a feedforward loop with its direct upstream regulatory factor AMS to activate the sporopollenin biosynthesis pathway and rapidly form pollen wall in Arabidopsis [69]. Therefore, we speculated that these genes in Brassica allohexaploids may also potentially affect tapetum development and pollen wall formation.

AUX1 and PIN1 May Regulate the Formation of the Seven-Cell Embryo Sac in Brassica Allohexaploids
Female gamete development is rarely studied compared to male gamete development. To enable female gametophyte fertilization, hence plant reproduction, megagametogenesis comprises carefully controlled mitotic divisions, repositioning of nuclei along a polar axis and the acquisition of different identities by individual cells [70]. Auxin levels are known to be controlled by biosynthesis and transport, and it is critical for sporophytic developmental processes. During embryo sac development, localized auxin biosynthesis and import are essential for mitotic divisions, cell growth and patterning [55]. AUX1 and PIN1 influence mitosis progression at one-, two-and four-nucleate stages [55]. The final position of the nuclei foreshadows the cellularization pattern that divides the female gametophyte into seven cells. Furthermore, cellular identity is most likely determined by the location of the nuclei and related cells along the micropylar-chalazal axis. Therefore, these genes involved in mitosis, nuclear localization, cell differentiation and development may be related to the formation of mature seven-cell embryo sac in Brassica allohexaploids.

Plant Materials, Tissue Collection and RNA Extraction
A trigenomic Brassica allohexaploid (BBCCAA, 2n = 54) was generated by interspecific hybridization and chromosome doubling between maternal B. carinata ("VI047487", BBCC, 2n = 34) and paternal B. rapa ("JK66-83", AA, 2n = 20) in this study. These Brassica allohexaploid materials were grown in greenhouse with a 16-h light/8-h dark cycle at 22 • C. These plants were later chromosomally identified as allohexaploid (2n = 54). Anthers and ovules of these Brassica allohexaploids were collected according to the previous methods reported [25,27]. After flowering, anthers from meiosis to mature pollen stage (Anther) and ovules from meiosis to mature embryo sac stage (Ovule) were collected, and young leaves (Leaf) of main inflorescences were also collected as vegetative tissue (organ) control. Three biological replicates were taken from each tissue of Brassica allohexaploids. These samples were rapidly frozen in liquid nitrogen and stored at −80 • C to extract RNA. The purity of the samples was determined by NanoPhotometer ® (IMPLEN, Westlake Village, CA, USA). The concentration and integrity of RNA samples were detected using Agilent 2100 RNA nano 6000 detection kit (Agilent Technology, Santa Clara, CA, USA).

cDNA Library Construction, Filter and Alignment
A total amount of 1-3 µg RNA per sample was used as input material for the RNA sample preparations. Sequencing libraries were generated using VAHTS Universal V6 RNA-Seq Library Prep Kit for Illumina ® . In order to guarantee the data quality used to analysis, the useful Perl script was used to filter the original data (Raw Data). The reference genomes and the annotation file were downloaded from B. rapa genome v1.5 sequence, B. nigra genome v1.1 sequence and B. oleracea genome v1.0 sequence (http://Brassicadb.cn, accessed on 1 May 2021). Bowtie2 v2.2.3 was used for building the genome index, and Clean Data was then aligned to the reference genomes using HISAT2 v2.1.0 [71]. The RNA-Seq data was uploaded to the NCBI Gene Expression Omnibus (GEO), and its accession number, GSE201456, may be used to retrieve it.

FPKM and DEGs
Reads count for each gene in each sample was counted by HTSeq v0.6.0 (Simon Anders, Heidelberg, Germany), and FPKM was then calculated to estimate the expression level of genes in each sample. DESeq2 estimated the expression level of each gene in per sample by the linear regression, then calculated the p-value with Wald test [72]. The p-value was corrected by the BH method. Genes with fold change ≥ 2 and q-value (adjusted p-value) ≤ 0.05 were identified as DEGs.

Function Enrichment Analysis
The DEGs aligned to COG were classified according to functions of genes (http: //www.ncbi.nlm.nih.gov/COG/, accessed on 7 August 2021). The GO enrichment of DEGs was implemented by the hypergeometric test, in which p-value was calculated and adjusted as q-value, and data background was genes in the whole genome (http: //geneontology.org/, accessed on 27 August 2021). GO terms with q-value < 0.05 were considered to be significantly enriched. GO enrichment analysis could exhibit the biological functions of the DEGs. The information about Arabidopsis ortholog of Brassica was obtained from Brassicaceae Database (BRAD) (http://Brassicadb.cn, accessed on 1 May 2021).

RT-qPCR
In order to verify the accuracy of RNA-Seq, nine candidate genes were randomly selected for RT-qPCR. The cDNA used the same RNA samples of the RNA-Seq. The RT-qPCR analysis was performed using the SYBR Green I. To standardize the results, BrGAPDH was employed as an internal reference control. Three biological replicates were used for each sample. Primer Premier 5.0 was used to design gene-specific primers for nine genes and these primer sequences were listed in Table S1. The CT values (the fractional cycle number at which the fluorescence crosses the specified threshold) were generated using CFX manager software and evaluated using the 2 −∆∆Ct method [73].

Conclusions
The transcriptome profiling of anthers and ovules revealed the candidate genes of homologous recombination in male and female meiosis, the genes of tapetum development, sporopollenin biosynthesis and pollen wall formation in male gamete development and the genes of mitosis, nuclear localization, cell differentiation and development in female gamete development of Brassica allohexaploids. Our findings enhance the understanding of the genes involved in the male and female gamete development of Brassica allohexaploids and lay a foundation for the reproductive study of polyploids.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/plants11121556/s1, Figure S1: Correlation heat map of the nine samples of Brassica allohexaploids using Pearson correlation coefficients; Figure S2: Volcano plots of differentially expressed genes (DEGs) for three comparisons in Brassica allohexaploids; Table S1: FPKM of all genes in Anther, Ovule and Leaf of Brassica allohexaploids; Table S2: Statistics of differentially expressed genes (DEGs) in Anther, Ovule and Leaf by pairwise comparison in Brassica allohexaploids; Table S3: List of primer sequences for quantitative fluorescence verification involving selected genes and internal reference.

Data Availability Statement:
The RNA-Seq data were deposited into NCBI GEO and can be accessed with the accession number GSE201456.