Comparative Transcriptomics Reveal Key Sheep (Ovis aries) Hypothalamus LncRNAs that Affect Reproduction

Simple Summary The hypothalamus has an important role in sheep reproduction. In this study, the key long noncoding RNAs (lncRNAs) associated with sheep fecundity were detected and characterized using the RNA Sequencing technique in sheep hypothalami. The results indicated that several key lncRNAs may affect crucial reproductive processes by differentially influencing the expression of their target genes in polytocous sheep in the follicular phase (PF) vs. monotocous sheep in the follicular phase (MF) and in polytocous sheep in the luteal phase (PL) vs. monotocous sheep in the luteal phase (ML). These results provide an insight into the prolificacy mechanism in sheep without FecB mutation in terms of the hypothalamus. Abstract The diverse functions of long noncoding RNAs (lncRNAs), which execute their functions mainly through modulating the activities of their target genes, have been have been widely studied for many years (including a number of studies involving lncRNAs in the ovary and uterus). Herein, for the first time, we detect lncRNAs in sheep hypothalami with FecB++ through RNA Sequencing (RNA-Seq) and identify a number of known and novel lncRNAs, with 622 and 809 found to be differentially expressed in polytocous sheep in the follicular phase (PF) vs. monotocous sheep in the follicular phase (MF) and polytocous sheep in the luteal phase (PL) vs. monotocous sheep in the luteal phase (ML), respectively. Then, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed based on the predicted target genes. The most highly enriched GO terms (at the molecular function level) included carbonyl reductase (NADPH), 15-hydroxyprostaglandin dehydrogenase (NADP+), and prostaglandin-E2 9-reductase activity in PF vs. MF, and phosphatidylinositol-3,5-bisphosphate binding in PL vs. ML was associated with sheep fecundity. Interestingly, the phenomena of valine, leucine, and isoleucine degradation in PL vs. ML, and valine, leucine, and isoleucine biosynthesis in PF vs. MF, were present. In addition, the interactome of lncRNA and its targets showed that MSTRG.26777 and its cis-targets ENSOARG00000013744, ENSOARG00000013700, and ENSOARG00000013777, and MSTRG.105228 and its target WNT7A may participate in the sheep reproductive process at the hypothalamus level. Significantly, MSTRG.95128 and its cis-target Forkhead box L1 (FOXG1) were shown to be upregulated in PF vs. MF but downregulated in PL vs. ML. All of these results may be attributed to discoveries of new candidate genes and pathways related to sheep reproduction, and they may provide new views for understanding sheep reproduction without the effects of the FecB mutation.


Animal Processing
All of the animals involved in this study were approved by the Science Research Department (in charge of animal welfare issues) of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences (IAS-CAAS) (Beijing, China) and ethical approval was given by the Animal Ethics Committee of the IAS-CAAS (No. IASCAAS-AE-03). Initially, all of the sheep were fed at a sheep farm of the Tianjin Institute of Animal Sciences, and were all treated similarly with free access to water and feed. In total, 890 Small Tail Han sheep were genotyped in pre-experiment and 142 sheep with the FecB++ genotype were identified according to the genotyping results using the TaqMan MGB probe [19]. Then, 12 FecB++ Han sheep were treated with controlled internal drug releasing (CIDR, progesterone 300 mg) to achieve synchronous estrus. The ovulation rate was also determined by detecting the number of corpus luteum using the laparoscopy procedure in day 6 after CIDR removal. Finally, the sheep were divided into two groups according to their littering records and ovulation rate (Table 1): polytocous sheep (n = 6, the litter size and ovulation number was more than one) and monotocous sheep (n = 6, the litter size and ovulation number was only one).  4177  3  3  3  3  4775  2  3  3  3  211  3  3  3  3  265  2  3  3  3  446  3  3  3  3  12  3  3  3  3   Monotocous group   4053  1  1  1  1  4205  1  1  1  1  3048  1  1  1  1  4298  1  1  1  1  3390  1  1  1  1  4282 1 1 1 1

Tissues Acquirement and Sequencing
Three polytocous sheep and three monotocous sheep were slaughtered within 45-48 h of CIDR removal (follicular phase), and the remaining six sheep (three polytocous sheep and three monotocous sheep) were slaughtered on day nine after CIDR removal (luteal phase). Finally, all those sheep were divided into two groups according to their littering records and estrous cycle: polytocous sheep in the follicular phase (PF, n = 3), polytocous sheep in the luteal phase (PL, n = 3), monotocous sheep in the follicular phase (MF, n = 3), and monotocous sheep in the luteal phase (ML, n = 3).
Hypothalamus tissues were obtained from sheep killed by euthanasia and immediately stored at −80 • C. Then, the stored tissues were used for RNA extraction with TRIzol Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions. Examination involving the integrality and quality of the isolated RNA was performed via electrophoresis and the RNA Nano 6000 Assay Kit of the Bioanalyzer 2100 system (Agilent Technologies, CA, USA). The lncRNA library was constructed with 3 µg of high-quality RNA using the NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® (#E7530L, NEB, USA) according to the manufacturer's recommendations. During this process, Ribo-Zero™ GoldKits was used to remove rRNA. In addition, the RNA concentration was assessed, and the RNA library was then sequenced at a concentration of 1 ng/µL RNA using the Illumina platform.

Transcriptome Assembly
The raw reads were generated by sequencing and filtered using the criteria for removing low-quality reads including adapter contamination and reads with N% (the percentage of base which cannot be identified by sequencing in a raw read.) of more than 5%. Finally, cleaned reads were obtained. We used the Ovis aries reference genome (Oar_v3.1) and the genome annotation file from ENSEMBL; cleaned reads were then mapped to the reference genome using HISAT2 [20]. StringTie [21] was used to assemble the transcripts.

LncRNA Identification and Differential Expression Analysis
Novel IncRNAs with lengths of more than 200 nt and exon numbers greater than two were distinguished with CNCI [22], CPC [23], PFAM [24], and CPAT [25] software after transcriptome assembly. The fragments per kilobase per million mapped reads (FPKM, [26]) values were calculated to represent the expression levels of the lncRNAs. DESeq [27] was then applied to identify the differential expression of the lncRNAs using two comparisons: PF vs. MF and PL vs. ML. In addition, |Log 2 Foldchange |> 0.58, p < 0.05 was considered to be a significantly differential expression.

Target Gene Prediction of lncRNAs and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes Analyses
In order to better understand the functions of differentially expressed lncRNAs in PF vs. MF and PL vs. ML, we carried out target gene predictions. The target genes can be divided into two types, cis-targets and trans-targets, based on their method of function. Cis-targets for lncRNAs execute important regulatory roles on nearby genes that are located less than 50 kb from them. On the other hand, trans-targets are lncRNAs that are indispensable for the site of transcription, but they function from remote target genes [9]. Additionally, when the expression quantity correlation coefficient of a lncRNA and its corresponding target mRNA was p ≥ 0.9, it was considered to be a potential trans-target.
In addition, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses by using predictable targeted genes of differentially expressed lncRNAs. The hypergeometric test method was applied to assess significantly enriched GO terms and KEGG pathways, and those with p < 0.05, after multiple tests and corrections were thought to be significantly enriched.

Construction of Integral LncRNA-mRNA Interaction Networks
To further reveal the potential roles of lncRNAs that are involved in modulating the reproductive process, integral interaction networks containing lncRNAs and their corresponding target genes in PF vs. MF, and PL vs. ML that showed evidence of both cisand trans-forms of regulation were built using Cytoscape software [28].

Data Validation
Six lncRNAs-ENSOARG00000025327, ENSOARG00000026466, ENSOARG00000025579, MSTRG.158551, MSTRG.235458, and MSTRG.135103-were selected to validate the accuracy of RNA sequencing via the reverse-transcription quantitative polymerase chain reaction (RT-qPCR). The primers were designed using Primer3 Input (http://bioinfo.ut.ee/primer3-0.4.0/) and synthesized by Tian Yi Biotech. cDNA was used to perform RT-PCR after reverse transcription from total RNA to cDNA. The qPCR reaction conditions were as follows: 95 • C for 15 min, followed by 40 cycles of 95 • C for 10 s and 60 • C for 30 s. The data obtained from qPCR reaction was then calculated using the 2 − Ct method and processed by SPSS19.0 with a one-way analysis of variance. The results are presented as means ± standard deviation. Furthermore, p < 0.05 was regarded as statistically significant.

Results
To identify differentially expressed lncRNAs (DE lncRNAs) in PF, PL, MF, and ML, an RNA library was constructed. In total, 123,001,527, 117,193,065, 123,812,583, and 122,744,344 raw reads in PF, PL, MF, and ML (on average) were obtained, respectively, from the hypothalamic tissues (Table 2), and the mapping rates were calculated as 92.26%, 92.86%, 92.36%, and 92.55%, on average, respectively, after low-quality reads had been removed and the reads had been mapped to the reference genome in four groups. In addition, regions in the genome with the identified lncRNAs were also predicted ( Figure 1A, Table S1). We can see that many lncRNAs originate from intergenic regions, followed by intron and exon regions. As Figure 1B shows, most of the lncRNAs range from 200 nt to more than 8000 nt, and the majority of lncRNAs have two exons. We also identified novel lncRNAs using CNCI, CPC, PFAM, and CPAT software, and the intersections of these predictions revealed 34,293 novel lncRNAs ( Figure 1C, Table S2). Furthermore, we drew a density plot ( Figure 1D), which showed that the pattern of FPKM distribution for the lncRNAs were similar. In addition, the proportions of expression at FPKM ≥ 500 were 1.41% and 10.16% for known and novel lncRNAs, respectively. There were 94.53% and 77.24% FPKM ≤ 50 for known and novel lncRNAs, respectively. From this data, we can see that the expression levels of known lncRNAs (1.41%) at high levels were less than those of novel lncRNAs (10.16%). Regarding DE lncRNAs, we detected 622,809 DE lncRNAs in total, with 406,439 being upregulated and 216,370 being downregulated within the two comparisons ( Figure 2, Table S3). Furthermore, the expression density of DE lncRNAs is also displayed in Figure 3, indicating clearly different expression patterns between PF and MF and between PL and ML. The validation results for the six lncRNAs (Table 3) selected to substantiate the accuracy of sequencing are displayed in Figure 4; the results indicate that there is a similar expression pattern of lncRNAs generated from RNA-Seq and RT-qPCR data.
Subsequently, GO and KEGG analyses were conducted using the identified target genes. As shown in Figure 5A,B and Table S4, the top three enriched GO terms were carbonyl reductase (NADPH), 15-hydroxyprostaglandin dehydrogenase (NADP+), and prostaglandin-E2 9-reductase activity in PF vs. MF, and the catalytic complex, phosphatidylinositol-3,5-bisphosphate binding, and the growth cone in PL vs. ML. The top 10-enriched KEGG pathways are shown in Figure 5B,D and Table S5; the most highly enriched pathway was cocaine addiction. Furthermore, many pathways related to metabolism were also enriched, such as valine, leucine, and isoleucine degradation. Regarding PL vs. ML, the most enriched pathway was that of glycosaminoglycan biosynthesis-chondroitin sulfate/dermatan sulfate. Some metabolism pathways were also enriched in PF vs. MF, such as valine, leucine, and isoleucine biosynthesis.        Subsequently, GO and KEGG analyses were conducted using the identified target genes. As shown in Figure 5 A,B and Table S4, the top three enriched GO terms were carbonyl reductase (NADPH), 15-hydroxyprostaglandin dehydrogenase (NADP+), and prostaglandin-E2 9-reductase activity in PF vs. MF, and the catalytic complex, phosphatidylinositol-3,5-bisphosphate binding, and the growth cone in PL vs. ML. The top 10-enriched KEGG pathways are shown in Figure 5B,D and Table S5; the most highly enriched pathway was cocaine addiction. Furthermore, many pathways related to metabolism were also enriched, such as valine, leucine, and isoleucine degradation. Regarding PL vs. ML, the most enriched pathway was that of glycosaminoglycan biosynthesischondroitin sulfate/dermatan sulfate. Some metabolism pathways were also enriched in PF vs. MF,   Increasing evidence and research is demonstrating that lncRNAs function by targeting corresponding genes. For example, NPCCAT1 (a lncRNA) can upregulate YY1 expression, which can further promote the progression of nasopharyngeal carcinoma [29]. The interaction of the lncRNA Gm2044 with Utf1 plays an inhibitory role in Utf1 translation during spermatogenesis [30]. Thus, it is necessary to predict the target genes for lncRNAs and analyze their functions in order to have a complete understanding of the lncRNA functions. Integral networks of lncRNAs and their target genes, containing cis-and trans-regulatory relationships, were thus built to explore their effects on reproduction ( Figure 6). There were 10 lncRNA-gene pairs in PF vs. MF, among which MSTRG.26777 Increasing evidence and research is demonstrating that lncRNAs function by targeting corresponding genes. For example, NPCCAT1 (a lncRNA) can upregulate YY1 expression, which can further promote the progression of nasopharyngeal carcinoma [29]. The interaction of the lncRNA Gm2044 with Utf1 plays an inhibitory role in Utf1 translation during spermatogenesis [30]. Thus, it is necessary to predict the target genes for lncRNAs and analyze their functions in order to have a complete understanding of the lncRNA functions. Integral networks of lncRNAs and their target genes, containing cisand trans-regulatory relationships, were thus built to explore their effects on reproduction ( Figure 6). There were 10 lncRNA-gene pairs in PF vs. MF, among which MSTRG.26777 cis-regulated four genes. We also constructed 24 lncRNA-gene pairs in PL vs. ML, among which 19 pairs displayed cis-regulation and five pairs showed trans-regulation.

Discussion
We initially identified a number of lncRNAs in the hypothalamus from PF, MF, PL, and ML via RNA-seq. We also analyzed the lengths and exon numbers (3448 nt and 2.5 exons, on average, respectively) of the detected lncRNAs. Compared with mRNAs, the lengths of the lncRNAs were , where the dashed and solid lines represent transand cis-regulation functions, respectively; red and green represent upand downregulation, respectively; and circles and inverted triangles represent mRNAs and lncRNAs, respectively. Furthermore, bigger circles or inverted triangles indicate that these mRNAs or lncRNAs will be discussed in the Discussion section.

Discussion
We initially identified a number of lncRNAs in the hypothalamus from PF, MF, PL, and ML via RNA-seq. We also analyzed the lengths and exon numbers (3448 nt and 2.5 exons, on average, respectively) of the detected lncRNAs. Compared with mRNAs, the lengths of the lncRNAs were smaller [31]; furthermore, the lengths and exon numbers of lncRNAs in sheep hypothalamus tissues were greater than those of the goat hypothalamus tissue (1108 nt and 2.2 exons on average, respectively) [17]. Furthermore, compared with mouse (500 nt) lncRNAs, the lengths of the lncRNAs in the sheep hypothalamus tissues were longer; however, the exon numbers in the sheep hypothalamus tissues were lower than in mouse (3.7, on average) [28]. Therefore, lncRNAs are both tissue-and species-specific [17,32] with diverse phenotypic and functional consequences.
The top three enriched GO terms in PF vs. MF (at the MF level) were carbonyl reductase (NADPH; ENSOARG00000013744, ENSOARG00000013700, ENSOARG00000013777, Carbonyl reductase 3 (CBR3)), 15-hydroxyprostaglandin dehydrogenase (NADP+) (ENSOARG00000013744, ENSOARG00000013700, ENSOARG00000013777), and prostaglandin-E2 9-reductase (ENSOARG00000013744, ENSOARG00000013700, ENSOARG00000013777) activity. Espey et al. [33] revealed that carbonyl reductase can indirectly modulate the ovulation process in rat ovaries, but the detailed mechanisms of this remain to be explored. Importantly, prostaglandin-E2 9-reductase, a member of the aldo-keto reductase superfamily, possesses wide substrate specificity, including the production of quinones and ketones [34]. The activity of prostaglandin E2-9-ketoreductase and the production of progesterone increased by around six-fold at the time of ovulation [35], which implies that an unclear co-operative relationship may exist. Additionally, progesterone appears to be a trigger for increasing the activity of prostaglandin E2-9-ketoreductase in sheep. Interestingly, progesterone can also increase the activity of 15-hydroxyprostaglandin dehydrogenase [36]. Importantly, a close relationship between NADPH-dependent carbonyl reductase and prostaglandin-E2 9-reductase exists [37]. Along with having similar enriched genes, carbonyl reductase may also be influenced by progesterone, or it may affect its catalytic activity [38] in the hypothalamus. Our results suggest that the activity of carbonyl reductase, 15-hydroxyprostaglandin dehydrogenase, and prostaglandin-E2 9-reductase in the hypothalamus is more intense in PF than that in MF. Given the crucial role of progesterone in GnRH release [39], we hypothesize that carbonyl reductase, 15-hydroxyprostaglandin dehydrogenase, and prostaglandin-E2 9-reductase may play key roles in sheep reproduction through the influence of secreted GnRH. Interestingly, three novel genes-ENSOARG00000013744, ENSOARG00000013700, and ENSOARG00000013777-are cis-targets of MSTRG.26777, and they were the top three significantly enriched GO terms (at the MF level), which suggests that MSTRG.26777 has key roles in reproduction. However, this remains to be validated.
In PL vs. ML, the most greatly enriched GO term of the MF aspect was phosphatidylinositol-3,5bisphosphate binding (ENSOARG00000015106, Sorting nexin 3 (SNX3), ENSOARG00000011950, SH3 and PX domains 2B (SH3PXD2B), ENSOARG00000018646, ATPase cation transporting 13A2 (ATP13A2), ENSOARG00000012741). Phosphatidylinositol-3,5-bisphosphate binding of genes such as Sac3 can downregulate the degree to which it responds to insulin in adipocytes [40]. Coincidentally, accumulating evidence [41] suggests that insulin plays a stimulatory role in GnRH release. Phosphatidylinositol-3,5-bisphosphate binding, therefore, may perform important functions in reproduction through its influence on the release of GnRH. SNX3, a member of the sorting nexin (SNX) protein family, is a trans-target of MSTRG.39635, and it exerts fundamental roles in protein traffic [42,43]. In the classic action of ovulation modulation, the progesterone receptor and the estrogen receptor, as two critical factors eliciting GnRH release, need to be trafficked to the membrane to act as required [37], and SNX3 may be the one to function in this process. Therefore, we speculate that PL sheep need greater protein transportation activity in the hypothalamus than ML sheep in order to maintain normal luteal functions. ATP13A2, also called transmembrane lysosomal P5-type ATPase, which is cis-regulated by MSTRG.130348, is normally located in the lysosome. Matsui et al. [44] proved that cathepsin D activity decreases in cells with ATP13A2-knockdown. Furthermore, prostaglandin F2α and prolactin both function in the rat corpora luteal, with the involvement of cathepsin D [45]. Given the presence of cathepsin D [46], prostaglandin F2α [47], and prolactin [48] in the hypothalamus, an as-yet unknown relationship between the three may exist. All of these differences between PL and ML may affect luteal function and potentially, the transition from the luteal to the follicular phase.
Interestingly, two pathways, named "valine, leucine, and isoleucine biosynthesis" and "valine, leucine and isoleucine degradation", were enriched in PL vs. ML, and in PF vs. MF, respectively. This demonstrates that some potential roles of valine, leucine, and isoleucine exist in the context of reproduction. Downing et al. [49] demonstrated that an increase of ovulation rate in ewes occurred when a mixture of the branched-chain amino acids leucine, isoleucine, and valine was infused during the late luteal phase of the estrous cycle. However, our results show that the pathways of valine, leucine, and isoleucine biosynthesis, which may occur in the early luteal phase, are found in the hypothalamus. This phenomenon suggests that an increased ovulation rate may be associated with some factors such as amino acids (AAs) biosynthesis in the hypothalamus. Furthermore, arginine, leucine, and glucose can drive the translation and synthesis of proteins, resulting in an increase in the proliferation of ovine trophectoderm cells, mainly through activator components for the mammalian target of rapamycin (mTOR) signaling pathway [50]. Coincidently, this pathway (Wnt family member 7A (WNT7A), ENSOARG00000002034, Eukaryotic translation initiation factor 4E family member 1B (EIF4E1B), Ribosomal protein S6 kinase A2(RPS6KA2), Wnt family member 9A (WNT9A), Frizzled class receptor 2(FZD2), RPTOR independent companion of MTOR complex 2 (RICTOR), NPR3 like, GATOR1 complex subunit (NPRL3), ENSOARG00000011164, Ras related GTP binding D(RRAGD), Mitogen-activated protein kinase associated protein 1 (MAPKAP1), Serum/glucocorticoid regulated kinase 1 (SGK1), Telomere maintenance 2 (TELO2) was also enriched in PL vs. ML. This may be because the mTOR signaling pathway also mediates the modulation of leucine [51], so similar mechanisms may exist in PL vs. ML. WNT7A, a critical WNT family member, is a cis-target of MSTRG.105228, and it plays important roles in reproduction. Homozygous Wnt7a/mice can survive, but they display a deficiency in female reproductive duct development; this defect leads to infertile female mice [52]. Recently, studies have proven that WNT7A can promote neural cell proliferation, and even the number of neurons, according to both in vivo [53] and vitro [54] evidence. NPRL3, also known as nitrogen permease regulator-like 3, which is regulated by MSTRG.152806, can greatly affect the neuronal structure and morphology of human, as proven by neural cell lines with NPRL3 knockdown in vitro [55]. Furthermore, the localization of mTOR at the lysosome was also altered in this experiment. It was also found that NPRL3 is necessary for mTOR to modulate AA levels, which may further influence the production process. All of these differences may be responsible for, at least in part, the luteal differences between PF and MF, from the point of view of AAs.
All in all, we speculate that valine, leucine, and isoleucine biosynthesis is fundamental for sheep reproduction in both polytocous or motonocous sheep, and it may be a requirement for maintaining normal luteal function or for conservation of the luteal phase for the degradation of AAs at the follicular phase, according to our results and the experiments discussed above. Also, their activity differs between polytocous and motonocous sheep regarding the degree to which they presumably function in GnRH release, which may be a reason for the difference in litter size.
Considering the way in which lncRNAs function, interactome networks involving lncRNAs and their target genes were predicted to confirm their reproductive roles. Except for the key genes that are regulated by the lncRNAs discussed above, an interesting phenomenon is that MSTRG.95128 and its target gene FOXG1, which is enriched in the FoxO signaling pathway, were upregulated in PF vs. MF, but downregulated in PL vs. ML. Regarding FOXG1, current studies have mainly focused on its role in neurons. The development of the vertebrate olfactory system, which is necessary for the formation of GnRH neurons in the early development stages [70], requires FOXG1 expression. FOXG1, in the presence of Emx2, can play an inhibitory role in gliogenesis and a promotive role on neurogenesis [71], and it can drive the survival of postmitotic neurons [72]. From the KEGG analysis, we know that many metabolites, including GABA, NO, dopamine, and serotonin, produced from AAs, function mainly through signal transductors; therefore, all of the functions of FOXG1 may facilitate the modulation of neuronal activities, such as those of GnRH neurons. In addition, some researchers have reported that FOXG1 can influence neural development through the mediation of TGFβ pathways [73]. Considering the involvement of the TGFβ superfamily in GnRH activity, it is speculated that some unexplored relationships between FOXG1 and GnRH release may exist. Their up-or downregulations may exert different functions on hormone release.

Conclusions
To the best of our knowledge, this is the first time that several key lncRNAs in the hypothalamus of FecB++ STH have been identified in PF vs. MF and PL vs. ML, via RNA-Seq. We also revealed some critical GO terms, interesting pathways, and interactomes between lncRNAs and their target genes through which several lnRNAs and genes were found to influence sheep reproduction. All of these findings may improve the understanding of neuronal transductor processes, especially GnRH release, and we expect this study to provide insight into the mechanisms of sheep hyper-fecundity using the FecB++ genotype.

Conflicts of Interest:
All authors declare no conflicts of interest.