Sex Determination Using RNA-Sequencing Analyses in Early Prenatal Pig Development

Sexual dimorphism is a relevant factor in animal science, since it can affect the gene expression of economically important traits. Eventually, the interest in the prenatal phase in a transcriptome study may not comprise the period of development in which male and female conceptuses are phenotypically divergent. Therefore, it would be interesting if sex differentiation could be performed using transcriptome data, with no need for extra techniques. In this study, the sex of pig conceptuses (embryos at 25 days-old and fetuses at 35 days-old) was determined by reads counts per million (CPM) of Y chromosome-linked genes that were discrepant among samples. Thus, ten genes were used: DDX3Y, KDM5D, ZFY, EIF2S3Y, EIF1AY, LOC110255320, LOC110257894, LOC396706, LOC100625207, and LOC110255257. Conceptuses that presented reads CPM sum for these genes (ΣCPMchrY) greater than 400 were classified as males and those with ΣCPMchrY below 2 were classified as females. It was demonstrated that the sex identification can be performed at early stages of pig development from RNA-sequencing analysis of genes mapped on Y chromosome. Additionally, these results reinforce that sex determination is a mechanism conserved across mammals, highlighting the importance of using pigs as an animal model to study sex determination during human prenatal development.


Introduction
Maternal nutrition may affect progeny anatomy, physiology and metabolism during critical periods of prenatal life, which is described as fetal programming [1]. The effects of maternal nutrition on fetal programming have been recently addressed in several studies using pigs as a model organism [2][3][4][5][6], since there are striking similarities between pigs and humans regarding the anatomy, physiology, metabolism, and nutrition, which provide the basis for the use of this animal in several studies. In addition, Sus scrofa is a prolific specie with a relatively short developmental period, besides being a cheaper and easier mammalian model organism to maintain [7,8]. In this context, transcriptome studies using RNA-sequencing (RNA-seq) performed in pigs highlight the relevance of the data obtained in this specie for human-related researches.
The RNA sequencing (RNA-seq) approach generates a comprehensive picture of gene expression levels at different developmental stages and physiological conditions [9][10][11]. However, these analyses may be influenced by sexual dimorphism, since males and females are marked by significant biological differences that can modify the gene expression of economically important traits [12]. Therefore, researchers should take into account the conceptuses sex and sex ratios in their research before drawing any conclusion [13].
In mammals, sex determination is basically organized in four stages: (1) Determination of the chromosomal sex, which is established at fertilization when a Y-or a X-chromosome sperm fuses with the oocyte to determine the zygote genetic sex; (2) the differentiation of gonads into testicles or ovaries from the presence or absence of sex-determining region Y gene (SRY) at the critical time window during embryonic development; (3) the differentiation of male and female internal and external genitalia from the non-sexually dimorphic structures present in the embryo, and (4) anatomical and physiological differences, i.e. sexual differentiation. Thus, all sex determination stages are mainly related to general gonadal function [14,15].
At early pig development, the absence of gonadal phenotype may impair the accurate visual sex determination, since the tunica albuginea, a marker of testis formation, is histologically identified only at 27 days [16] and the beginning of testicular descent occurs around 60 days [17]. Therefore, if research projects are designed at early development stages in which male and female conceptuses are not phenotypically divergent and there is response to sexual dimorphism, the sex determination could be performed using alternative tools rather than phenotypic differences.
Several studies have demonstrated how to identify the conceptuses sex at early stages of development using molecular techniques, as PCR (polymerase chain reaction) from Y chromosome-linked genes [13,[18][19][20]. The PCR provides sensitive, precise, rapid, and reliable results [20]. However, if a RNA-seq study has already been performed on conceptuses to obtain a broader knowledge of the transcriptome under experimental conditions (e.g. inclusion of additives on maternal diets), these data can also be applied to determine the conceptuses sex with no need for extra techniques. A similar approach has been used by Petropoulos et al. [21] to determine the sex of human embryos through single-cell sequencing.
Therefore, considering that in early prenatal pig development male and female conceptuses are not phenotypically divergent and the sexual dimorphism may affect the transcriptome analyses, we aimed to demonstrate that the sex of 25 and 35 days-old conceptuses can be determined by read counting of genes mapped on Y chromosome using already available RNA-seq data.

Experimental Animals and Design
The experimental protocols used in this study have followed ethical principles in animal research (CONCEA, 2016) and were approved by the Ethical Committee on Animal Use of the Universidade Federal de Viçosa (UFV), MG, Brazil [protocol # 06/2017].
All experimental protocols were performed according to the experimental design described in Costa et al. [22]. Briefly, 24 hours after the second insemination, 11 gilts received a basal diet for pregnant animals without supplementation (CONT) and 12 gilts received a CONT diet supplemented with 1.0% L-arginine (ARG), and two gestational ages were considered (25 and 35 days). From these gilts, 20 became pregnant. Additional details regarding reproductive management and nutritional information of the diets have been previously described in Costa et al. [22].
At 25 days of gestation, five females of CONT (n = 5) and five females of ARG (n = 5) were rendered unconscious using head-only electrical stunning (240V, 1.3A) and immediately exsanguinated. The same procedure was followed for four females of CONT (n = 4) and six females of ARG (n = 6) at 35 days of gestation. After slaughter, four conceptuses were collected per female at each gestational age, totaling in average 20 conceptuses per treatment.
The conceptuses were quickly washed with PBS (Phosphate Buffered Saline) solution, individually identified, stored in liquid nitrogen and transported to the Animal Biotechnology Laboratory (LABTEC) at the Department of Animal Science, UFV. At LABTEC, each conceptus (embryos from gilts slaughtered at 25 days and fetuses from gilts slaughtered at 35 days) was entirely and separately macerated in liquid nitrogen. Although an average of five gilts was initially used per treatment in the original trial described in Costa et al. [22], subsequent analyses were performed on a subset of gilts (n = 3) per treatment, since at least three biological replicates are recommended for describing results in RNA-seq experiments [9,23]. Moreover, three embryos from each CONT gilt (n = 3 gilts) and three embryos from each ARG gilt (n = 3 gilts) at 25 days of gestation (25DC and 25DA, respectively) and three fetuses from each CONT gilt (n = 3 gilts) and three fetuses from each ARG gilt (n = 3 gilts) at 35 days of gestation (35DC and 35DA, respectively), totaling 36 samples, were randomly chosen for RNA-seq analysis. Afterwards, these samples were transported in liquid nitrogen to the Animal Genetics Laboratory at the Embrapa Swine and Poultry National Research Center, Concordia, SC, Brazil, for further RNA extraction and library preparation.

RNA Extraction and Library Preparation
Total RNA extraction of the 25-and 35-day-old conceptuses from the ARG and CONT females was performed with TRIzol (Invitrogen, San Diego, CA, USA). The macerated conceptuses (100 mg) and TRIzol (1 mL) were mixed with vortex and then incubated for 5 minutes at room temperature (RT, 25 • C). Then, 200 µL of chloroform were added, shaking vigorously for 15 seconds and incubated at RT for 5 minutes. Centrifugation was performed at 11,000× g at 4 • C for 15 minutes. Approximately 600 µL of the clear upper aqueous phase containing only RNA were carefully removed and transferred to a new tube, and 600 µL of 70% ethanol were added and homogenized by inversion. This volume was added to the silica column RNeasy mini kit (Qiagen, Hilden, Germany) and centrifuged for 15 seconds at 8000× g. The eluate was discarded and 700 µL RW1 buffer were added, followed by centrifugation for 15 seconds at 8000× g. Two washes with 500 µL RPE buffer were done and, finally, RNAs were eluted in 50 µL of RNase free water.
After RNA extraction, the quantification was performed in a QUBIT fluorimeter (Thermo Scientific, Waltham, MA, USA) and the integrity was determined in 1.0% agarose gel. In addition, the Agilent 2100 BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA) was used for integrity measurement, in which samples with RNA integrity number (RIN) higher than eight were used for library preparation. In this sense, one fetus sample was excluded from subsequent analysis due to low RIN score (RIN < 8). Therefore, the 35 remaining conceptuses samples (nine ARG and nine CONT at 25 days and eight ARG and nine CONT at 35 days) were submitted to RNA-seq library preparation using the TruSeq Stranded mRNA Library Prep Kit (Illumina, Inc., San Diego, CA, USA), followed by purification of the poly-A tail using 2 µg of total RNA, according to the manufacturer´s recommendations.
The BAQCOM pipeline () [24] was used to perform quality control (QC), mapping and reads counting. BAQCOM uses a set of software to analyze RNA-seq data, as Trimmomatic [25], version 0.38, to identify and remove adapter and low quality sequences, HISAT2 [26], version 0.11.2, to map reads against the genome and HTseq-count [27], version 0.11.2, to count reads in features. Using BAQCOM pipeline, only reads with Phred quality ≥ 20 and length ≥ 70 pb were mapped against the pig reference genome (Sus scrofa, v. 11.1) and the reads counting for each feature was based on Ensembl annotation release 95.

Sex Identification
The conceptuses sex determination (n = 18 embryos and n = 17 fetuses) was performed using discrepant reads counts per million (CPM) of Y chromosome-linked genes from the conceptuses RNA-seq dataset. After the samples classification as males or females, the HTseq files containing all samples transcripts were renamed and used to build a multi-dimensional scale (MDS) plot using plotMDS function of EdgeR package [28] of R software [29] to evaluate conceptuses dispersion.

Results
Averages of 15.3 and 14.4 million reads/samples were generated for embryos and fetuses, respectively. After the data QC, averages of 13.6 million reads/embryos and 12.9 million reads/fetuses remained for further analyses. More than 98% of the reads were mapped against the pig reference genome (Sus scrofa, v.11.1), with 83.8% and 79.8% of the reads counted into genes for embryos and fetuses, respectively. An average of 0.05% of the reads was uniquely mapped on Y chromosome. At 25 and 35 days of age, male and female conceptuses could not be phenotypically differentiated, since gonadal phenotype could not be observed ( Figure S1).

Discussion
The control of animals sex ratio is desirable in livestock production [20], since the expression of genes that affect a wide range of economically important traits may be influenced by sexual dimorphism [12]. Considering that the sex determination is equivalent to testis determination [30], the testis could be easily used to determine male and female individual rates. However, in pig conceptuses, the tunica albuginea, a marker of testis formation, is histologically identified only at 27 days [16] and the beginning of testicular descent occurs around 60 days [17]. In this context, we demonstrated that it is possible to determine the sex of pig embryos (25 days) and fetuses (35 days) using reads CPM of ten Y chromosome-linked genes through RNA-seq analysis, with no need to use other molecular tools for this purpose, such as PCR. Although most often considered a cheaper and faster technique than RNA-seq, PCR would represent another cost source in this study, since RNA-seq analyses had been previously performed aiming to test a hypothesis on a maternal nutrition experiment [22].
Several studies have also used Y chromosome-linked genes to determine the sex of pigs at early development stages [13,18,19], however, all of them have applied DNA-based approaches. The Y chromosome harbors genes that are essential for testis development and function, such as the master gene for testis determination (SRY), which is expressed during a critical period of embryonic development, and other genes that are important for spermatogenic function, as the ZFY, EIF1AY and KDM5D. The importance of these Y-linked genes during male development has been elucidated for mammals [31][32][33][34]. These genes are located in the non-recombining region of the Y chromosome (NRY), which is also known as the male-specific Y (MSY) region, transmitted from fathers to sons without recombination with the X chromosome [35]. The MSY region contains only 27 genes coding for distinct proteins, as the SRY, EIF1AY, KDM5D, ZFY, EIF2S3Y, and DDX3Y [31,35,36].
In this study, among all abovementioned genes, SRY was not used to differentiate male and female conceptuses (data not shown). This gene seems to follow a strict expression pattern during mammalian embryonic development. In mouse, its expression starts at 10.5 days post coitum (dpc),

Discussion
The control of animals sex ratio is desirable in livestock production [20], since the expression of genes that affect a wide range of economically important traits may be influenced by sexual dimorphism [12]. Considering that the sex determination is equivalent to testis determination [30], the testis could be easily used to determine male and female individual rates. However, in pig conceptuses, the tunica albuginea, a marker of testis formation, is histologically identified only at 27 days [16] and the beginning of testicular descent occurs around 60 days [17]. In this context, we demonstrated that it is possible to determine the sex of pig embryos (25 days) and fetuses (35 days) using reads CPM of ten Y chromosome-linked genes through RNA-seq analysis, with no need to use other molecular tools for this purpose, such as PCR. Although most often considered a cheaper and faster technique than RNA-seq, PCR would represent another cost source in this study, since RNA-seq analyses had been previously performed aiming to test a hypothesis on a maternal nutrition experiment [22].
Several studies have also used Y chromosome-linked genes to determine the sex of pigs at early development stages [13,18,19], however, all of them have applied DNA-based approaches. The Y chromosome harbors genes that are essential for testis development and function, such as the master gene for testis determination (SRY), which is expressed during a critical period of embryonic development, and other genes that are important for spermatogenic function, as the ZFY, EIF1AY and KDM5D. The importance of these Y-linked genes during male development has been elucidated for mammals [31][32][33][34]. These genes are located in the non-recombining region of the Y chromosome (NRY), which is also known as the male-specific Y (MSY) region, transmitted from fathers to sons without recombination with the X chromosome [35]. The MSY region contains only 27 genes coding for distinct proteins, as the SRY, EIF1AY, KDM5D, ZFY, EIF2S3Y, and DDX3Y [31,35,36].
In this study, among all abovementioned genes, SRY was not used to differentiate male and female conceptuses (data not shown). This gene seems to follow a strict expression pattern during mammalian embryonic development. In mouse, its expression starts at 10.5 days post coitum (dpc), reaches a peak at 11.5 dpc and then wanes by 12.5 dpc [32,37]. In pigs, the SRY expression is detectable in the embryos genital ridge around 21 and 23 days, with faint expression at 26 days and absence of expression at 31 days [38]. Consequently, both 25 and 35 days of pig prenatal periods do not comprise the main SRY expression phase, reinforcing the critical time window during embryonic development in which SRY is expressed.
Subsequently to the sex determination by the SRY gene, other male-specific genes are important to maintain testicular development and spermatogenesis [39]. According to Bellott et al. [40], during spermatogenesis, each stage of the molecular central dogma is regulated by the following Y-linked genes: KDM5D, ZFY, and EIF1AY. The KDM5D gene encodes a histone demethylase enzyme that acts removing tri-and di-methylations of lysine 4 of histone H3 at the start site of transcription in actively transcribed genes [41]. The transcription factor ZFY, the first coding gene identified in the human Y chromosome, regulates the transcription of a number of Y-linked genes [31] and mediates multiple aspects of spermatogenesis and reproduction, such as morphology, motility, capacitation, acrosome reaction, and oocyte activation, as well as chromosomal aberrations [42]. The translation initiation factor EIF1AY is requested for a high rate of protein biosynthesis and the absence of this gene may contribute to failures in the spermatogenic process [43].
Another important translation initiation factor during spermatogenesis is the EIF2S3Y gene, located in the short arm of the Y chromosome. This gene has been identified as a key mouse-specific regulator of spermatogonia proliferation and differentiation [44]. Finally, the DDX3Y gene, along with KDM5D and EIF1AY, are azoospermia factors (AZFa) [45]. Azoospermia is defined as the absence of sperm in at least two different ejaculates [31]. The DDX3Y is the major AZFa gene and belongs to a highly conserved subfamily of the DEAD-box RNA helicase family [33]. This gene family is involved in oogenesis and spermatogenesis, silencing mobile elements and other repetitive genomic regions in germinal tissues, and is important for gonads formation during embryo development [46]. Therefore, the KDM5D, ZFY, EIF1AY, EIF2S3Y, and DDX3Y genes showed discrepant reads CPM among samples and have important spermatogenic and gonadal functions, being crucial to differentiate male and female pig conceptuses in the present study. Similar pattern has already been observed in humans blastocyst [21], which highlights the importance of using pigs as an animal model to study sex determination throughout human prenatal development.
The reads CPM observed for the Y-linked genes in female conceptuses may be considered technical artifacts from RNA-seq, since mammalian X and Y chromosomes share regions of high sequence similarity, causing mismapping of short reads to a reference genome [47]. In this context, Ballouz et al. [48] found reads of female samples mapped in Y chromosome, which was also characterized as reads mismapping. Although considered technical artifacts, these Y chromosome reads CPM found in females were not a major issue in the present study, since the discrepant reads CPM among samples for the ten Y chromosome-linked genes were sufficient to differentiate males and females.
The LOC110255257, LOC100625207, LOC110255320, LOC110257894, and LOC396706 genes are located in the Y chromosome, however, they have not yet been fully characterized in Sus scrofa genome. According to Skinner et al. [49], there are few data available on the porcine Y chromosome. Therefore, although these genes are located on this chromosome and have shown contribution to sex determination of pig conceptuses due to the discrepant reads CPM among samples, their functions will not be addressed in this study. Further studies might be performed to characterize these Y-linked genes in pigs.

Conclusions
We have demonstrated that sex identification can be performed from read counts per million of genes mapped in the Y chromosome at early stages of pig development using already available conceptuses RNA-seq data, with no need for extra techniques. Additionally, the results reinforce that molecular sex determination is a mechanism conserved across mammalian species, since the role of these genes is also important during male development in humans, highlighting the applicability of pigs as an animal model to study sex determination during human prenatal development.