Transcriptome Profile Analysis of Ovarian Tissues from Diploid and Tetraploid Loaches Misgurnus anguillicaudatus

RNA sequencing and short-read assembly was utilized to produce a transcriptome of ovarian tissues from three-year-old diploid and tetraploid loaches (Misgurnus anguillicaudatus). A total of 28,369 unigenes were obtained, comprising 10,546 unigenes with length longer than 1000 bp. More than 73% of the unigenes were annotated through sequence comparison with databases. The RNA-seq data revealed that 2253 genes were differentially expressed between diploid and tetraploid loaches, including 1263 up-regulated and 990 down-regulated genes in tetraploid loach. Some differentially expressed genes, such as vitellogenin (Vtg), gonadotropin releasing hormone receptor type A (GnRHRA), steroidogenic acute regulatory protein (StAR), mitogen-activated protein kinase 14a (MAPK14a), ATP synthase subunit alpha (atp5a), and synaptonemal complex protein 1 (Scp1), were involved in regulation of cell proliferation, division, gene transcription, ovarian development and energy metabolism, suggesting that these genes were related to egg diameter of the loach. Results of transcriptome profiling here were validated using real time quantitative PCR in ten selected genes. The present study provided insights into the transcriptome profile of ovarian tissues from diploid and tetraploid loaches Misgurnus anguillicaudatus, which was made available to the research community for functional genomics, comparative genomics, polyploidy evolution and molecular breeding of this loach and other related species.


Introduction
Loach (Misgurnus anguillicaudatus; Cypriniformes; Cobitidae) is a demersal freshwater teleost, widely distributed in China, Korea, Japan, and other countries in southeastern Asia [1]. The loach can be used as a traditional Chinese medicine or folk remedy for treatments of hepatitis, osteomyeitis, carbuncles, inflammations and cancers, as well as for patient's recovery from debilities caused by various pathogens and aging [2]. It is also a commercially important species, with a gradually increasing market demand in recent years [3]. Apart from its economic value in East Asia, loach is also a promising model animal to study evolutionary biology for polyploidy. Natural tetraploid loaches with 4n = 100 chromosomes appear with sympatric diploid loaches with 2n = 50 chromosomes in central China and low frequencies of natural triploid, pentaploid, and hexaploid loach are also found in China [4,5].
Recently, good progress has been made in genome-wide gene expression profiling by the development and application of large scale sequencing techniques, which have served as a powerful and cost-effective tool to obtain gene sequences and develop molecular markers [6]. Combined efforts of high-throughput sequencing and complicated bioinformatics analysis allow access to transcriptome profiling of gonads of several fish, including nile tilapia Oreochromis niloticus [7], platyfish Xiphophorus maculates [8], olive flounder Paralichthys olivaceus [9], rockfish Sebastiscus marmoratus [10], rainbow trout Oncorhynchus mykiss [11], catfish Ictalurus punctatus [12], turbot Scophthalmus maximus [13], and striped bass Morone saxatilis [14]. Meanwhile, there is almost no report on transcriptome profiling of loach gonad.
Herein, we generated a comparative transcriptome of ovarian tissues from diploid and tetraploid loaches M. anguillicaudatus by the application of high-throughput sequencing. The objectives of this study were to enrich genetic resources for M. anguillicaudatus, identify and analyze differently expressed genes (DEGs) between diploid and tetraploid M. anguillicaudatus, identify some genes related to egg diameters, and obtain information on biological pathways involved in ovarian development, nutrition metabolism and other biological activities as well. Our results will not only provide valuable genomic information for understanding the molecular mechanism of ovarian development of M. anguillicaudatus, but also establish a sound foundation for functional genomics, comparative genomics analysis, polyploidy evolution and molecular breeding for this loach and other closely related species.

Egg Diameters of the Three-Year-Old Mature Loaches of Different Ploidy
Three-year-old diploid and tetraploid loaches M. anguillicaudatus were used in this study. Many mature oocytes were found in ovarian tissues, indicating that these two loaches were of sexual maturity ( Figure 1). Diameters of 340 mature eggs from each kind of loach (i.e., diploid and tetraploid) were measured here ( Table 1). The average egg diameters for diploid and tetraploid loaches were respectively 1.17 and 1.37 mm. Eggs from tetraploid loaches were significantly larger than those from diploid loaches (p < 0.01).

Sequencing, De Novo Assembly and Functional Annotation
Illumina-based RNA-Sequencing (RNA-Seq) was conducted with ovarian tissue samples from diploid and tetraploid loaches. A total of 25.42 million of 101 bp paired end reads were generated. After trimming of low-quality reads and short reads sequences, a total of 23.83 million clean reads (93.72%) were obtained (Table 2), and these reads were used for the following analysis. Ovarian tissues of diploid and tetraploid loaches were used to generate 88,454 transcripts and 45,686 unigenes. The N50 values of transcripts and unigenes were 2230 and 1939, respectively. A summary of the assembly data is given in Table 3. The size distributions of contigs, transcripts and unigenes are shown in Figure 2.

Validation of Differentially Expressed Genes by Quantitative Real-Time PCR (qPCR)
qPCR was performed on 10 selected genes (GnRHRA, CDKL1, Creatine kinase B-type CKb, adenosylhomocysteinase AHCY, Rho guanine nucleotide exchange factor (GEF) 3 ARHGEF3, transforming growth factor beta-2 TGFβ2, growth/differentiation factor 7 GDF7, Scp1, Protein Wnt-11 Wnt11, and vitamin D3-25 hydroxylase Cyp27A) for validating the differentially expressed genes identified by RNA-Seq. Fold changes from qPCR were compared with the RNA-Seq expression profiles ( Table 5). The expression patterns revealed by qPCR analysis were similar to those revealed by RNA-Seq for the same genes. In addition, a significantly positive correlation (r = 0.996; p < 0.01; n = 10) was found between the results of RNA-Seq and qPCR for these genes ( Figure 8). Thus, RNA-seq could provide reliable data for mRNA differential expression analysis.

Discussion
Nowadays, environmental pollution problems are increasingly serious. Some environmental estrogenic substances such as natural and synthetic estrogens, some pesticides, and other manmade chemicals can cause gonadal growth retardation, testicular degeneration, and intersex at relatively low proportions [15][16][17][18]. Therefore, reproductive fitness is a key issue in biology [19]. The ovary is a very important reproductive organ, and ovarian development plays a very important role in reproduction of most animals, including fish. In this study, we used the RNA-Seq technology to present the transcriptome profiling of ovarian tissues from diploid and tetraploid loaches, which are of high economic and medicinal value, and of interest to science as well because of its extensive ploidy variability in nature. The identified DEGs between diploid and tetraploid loaches will not only help us to understand the molecular mechanism of ovarian development and maturation, but also will provide valuable information for studying the phenotypic and functional differences of fish with different types of ploidy. Molecular breeding of loach M. anguillicaudatus will also benefit from this study.
Ovarian tissues from diploid and tetraploid loaches used in the present study were both under mature post-vitellogenic conditions. Diameters of the mature eggs demonstrated significant differences between diploid and tetraploid loaches, suggesting that genetic factors most likely accounted for differences of their ovarian development and egg diameters. Within mature ovaries, the oocyte undergoes large increases in size mainly due to the incorporation of vitellogenin [20]. This might require a series of enzymes to provide both hormonal and energy support for synthesis and decomposition of vitellogenin.
We observed that egg diameters of tetraploid loaches were significantly larger than those of diploid loaches. The growing oocyte is considered to be largely transcriptionally inactive, acting as a storehouse of specific maternal RNAs, proteins, and other molecules required for competency for fertilization, initiation of zygotic development, and transition to embryonic gene expression [14].
Of the 2253 DEGs selected with rigorous criteria in the present study, a large proportion of key genes involved in protein processing, fat and energy metabolism, steroidogenic activity, cytoskeleton and cell division were identified. Compared to diploid loach, there were some genes showing marked up-regulation in tetraploid loach, namely Vtg, GnRHRA, StAR, MAPK14a, atp5a, Skp1, G2E3, drc1, CDKL1 and Scp1, which might account for the difference of egg diameters between diploid and tetraploid loaches. Vtg, the precursor protein for egg yolk, is a large serum phospholipoglycoprotein produced only in the liver of mature females and is transferred by the bloodstream to the ovary. Vtg uptake is stimulated by follicle-stimulating hormone (FSH) [21] and increasing FSH levels are thought to trigger vitellogenesis [22]. In vertebrates, GnRH regulates the synthesis and release of luteinizing hormone (LH) and follicle-stimulating hormone (FSH) from the pituitary gland, thereby regulating steroidogenesis and gametogenesis [23]. The steroidogenic acute regulatory protein (StAR), a member of the StAR-related lipid transfer domain (START) family, is critical to regulated steroidogenesis in vertebrates, controlling oocyte growth (vitellogenesis) and final maturation [24]. MAPK14a, part of the mitogen-activated protein kinase (MAPK) family, is found to act as a crucial mediator for cell differentiation and proliferation [25]. atp5a, one of the energy metabolism enzymes, play a crucial role in steroid hormone and vitellogenin synthesis during ovarian development [26]. Cell cycle phase-specific expression of genes is a principal mechanism controlling cell division, including Skp1 [27], G2E3 [28] and drc1 [29]. CDKL1 is a member of the cyclin-dependent protein kinase (CDK) protein family, a group of serine/threonine kinases [30]. Synaptonemal complex protein 1 (Scp1) is a member of the synaptonemal complex (SC), a meiosis-specific structure essential for synapsis of homologous chromosomes [31]. CDKL1 and Scp1 have also been demonstrated to be important regulators of cell division.

Ethics Statement
All experimental protocols were approved by the Ethics Committee of the College of Fisheries, Huazhong Agricultural University.

Sample Collection and Preparation
The natural loaches were randomly collected from Ezhou city, Hubei province, China. The ploidy status of loach was determined by flow cytometry described as Zhu et al. [32]. Then 8 three-year-old mature female loaches (including four diploid and four tetraploid) were chosen. Ages of these loaches were determined by using scale specimens according to Huang et al. [33].
For each ploidy (i.e., diploidy and tetraploidy), ovarian tissues were removed from the loaches after anesthetizing with 100 mg/L MS-222. In this experiment, tetraploid sample was used as a treatment group while diploid sample a control group. The ovarian samples of each loach were then divided into three parts. According to Zhao et al. [34], the first part was used to measure egg diameters to detect differences in egg-size between the two different ploidies. The second part was fixed in 4% paraformaldehyde solution for the histological observation described as in Cao and Wang [35]. The third part was immediately frozen in liquid nitrogen, and stored at −80 °C, which was used for RNA-Seq and qPCR analysis. Total RNA was extracted from ovarian tissues by using Trizol reagent (Invitrogen, Carlsbad, CA, USA) according to the instructions of the manufacturer. Total RNA concentration and quality were determined by an Agilent bioanalyzer 2100 (Palo Alto, CA, USA). For each ploidy, equal amounts of RNA from the four loaches were pooled to provide templates for the construction of the RNA-Seq library. Figure 9 shows the layout plan of the study design. Figure 9. The layout plan of the study design.

Library Construction and Illumina Sequencing
The cDNA library was constructed by using the extracted RNA samples. Poly (A) mRNA was segregated using oligo-dT beads (Qiagen, Dusseldorf, Germany). The fragmentation buffer was added to break all mRNA into short fragments. Random hexamer-primed reverse transcription was used for the first-strand cDNA synthesis. RNase H and DNA polymerase I were used for following generation of the second-strand cDNA. The QIAquick PCR extraction kit was performed to purify the cDNA fragments. These purified cDNA fragments were rinsed by EB buffer for end reparation poly (A) addition and then ligated to sequencing adapters. Thereafter, agarose gel electrophoresis was used for separating the short fragments. The fragments with a size suitable for sequencing criteria were isolated from the gels and enriched by PCR amplification to construct the final cDNA library. The cDNA library was sequenced on the Illumina sequencing platform (Illumina HiSeq™ 2000) using the single-end paired-end technology in a single run, by Biomarker Technologies CO., LTD, Beijing, China. The Illumina GA processing pipeline was used to analyze the image and for base calling.

De Novo Assembly and Functional Annotation
Good quality sequences were necessary for de novo assembly analysis. Before assembly, raw sequencing reads were clipped by abandoning adapter sequences and ambiguous nucleotides. Then all clean reads of the two libraries of different ploidy were jointly assembled into contigs implemented by Trinity software. After assembly, contigs longer than 200 bases were used for subsequent analysis. The contigs were connected to obtain the sequence that could not be extended on either end, and the sequence of the unigene was then generated. These unigenes were further spliced and then assembled to acquire maximum length non-redundant unigenes by TGICL clustering software (J. Craig Venter Institute, Rockville, MD, USA). Finally, Blastx with an E-value <10 −5 between the unigenes and the databases non-redundant proteins (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG) was conducted. GO annotation of these unigenes was produced using Blast2GO based on the results of the NCBI Nr database annotation. Blastn was used for aligning these unigenes to the Nr database, searching proteins with the highest sequence similarity to the given unigenes, accompanied by their protein functional annotations.

Identification of Differentially Expressed Genes (DEGs)
The mapped reads were normalized according to reads per kilobase of exon model per million mapped reads (RPKM) for each unigene between the two pooled samples (i.e., diploid sample and tetraploid sample), which benefited the comparison of unigene expression between samples, and differentially expressed genes (DEGs) identified by the DEGseq package applying the MA-plot-based method with Random Sampling model (MARS) method. DEGs between the two loaches (diploid loaches and tetraploid loaches) were selected with the following filter criteria: FDR (false discovery rate) <0.01 and the absolute value of log2 Ratio >1, meaning that the expression difference for each DEG between the two loaches should be at least two-fold. DEGs which might be related to egg diameter were selected with the following criteria: First, based on log2 Ratio >2, 810 DEGs were left. Then, we adopted three strategies orderly to excavate genes which might be related to egg diameter from them: (i) delete the DEGs (n = 278) whose annotations were not in detail; (ii) egg diameter-related DEGs (n = 95) were then obtained; (iii) finally, the top 10 DEGs that showed high-fold changes between the two loaches, related to egg diameter, were selected.

Validation of RNA-Seq Results by qPCR
To examine the reliability of the RNA-Seq results, ten differentially expressed genes (GnRHRA, CDKL1, CKb, AHCY, ARHGEF3, TGFβ2, GDF7, Scp1, Wnt11 and Cyp27A) involved in the development of ovarian tissues were selected randomly for validation using quantitative real-time PCR (qPCR). Total RNA was reverse transcribed to first-strand cDNA using reverse transcriptase (Invitrogen). The qPCR was carried out on an iQ5 system (Bio-Rad, Hercules, CA, USA) using SYBR Premix Ex Taq (TaKaRa, Dalian, China), according to the manufacturer's instructions. The primers for these genes were designed by using Primer Premier 5.0 ( Table 6). The reaction mixture (10 µL) comprised 2.5 µL cDNA (1:4 dilution), 5 µL SYBR Premix Ex TaqTM II (TaKaRa), 0.5 µL specific forward primer, 0.5 µL universal primer, and 1.5 µL water. The reactions were performed in an MJ Opticon™-2 machine (Bio-Rad) using the two-step method, with the following parameters: initiation at 95 °C for 30 s followed by 40 cycles of 95 °C for 10 s, 59 °C for 30 s, and 65 °C for 6 s for plate reading. A melting curve was performed from 65 to 95 °C to check the specificity of the amplified product. All experiments were conducted with three biological replicates for each sample. The relative expression levels were normalized to the endogenous control gene β-actin, which was expressed stably and equally in ovarian tissues of diploid and tetraploid loaches, and expression ratios were calculated by using the 2 −ΔΔCt method.

Conclusions
This is the first report of the ovarian transcriptome of diploid and tetraploid loach M. anguillicaudatus obtained by using RNA-seq. We generated a large number of ESTs and identified numerous differentially expressed genes of ovarian tissue between these two loaches. According to DEGs annotation information, we found some genes related to egg diameters of the loach. Gene products do not act independently but rather influence biological processes through networks of interactions. Further functional characterization of these genes by using transgenic, over expression, knockout and knockdown strategies may help elaborate the molecular mechanisms controlling egg sizes among different ploidy fish. Our results also provide the reference values for functional genomics, comparative genomics analysis, polyploidy evolution and molecular breeding for this loach and other closely related species.

Acknowledgments
This study was financially supported by National Natural Science Foundation of China

Author Contributions
Xiaojuan Cao and Weimin Wang conceived and designed the experiments; Weiwei Luo, Chuanshu Liu, and Yeke Wang performed the experiments; Songqian Huang and Weiwei Luo analyzed the data; Weiwei Luo wrote the paper; Xiaojuan Cao modified the paper. We would like to thank the two anonymous reviewers for their constructive, detailed and helpful comments to improve the paper.

Conflicts of Interest
The authors declare no conflict of interest.