Development and Validation of Sex-Specific Markers in Pelodiscus Sinensis Using Restriction Site-Associated DNA Sequencing

The sex of an animal influences its economic traits, especially in species displaying sexual dimorphism. The Chinese soft-shelled turtle, Pelodiscus sinensis, is an economically important aquatic species that shows significant male sexual dimorphism, with a large body size, faster growth, a thick and wide calipash, and lower body fat. In this study, ten male and ten female turtles were subjected to restriction site-associated DNA sequencing (RAD-seq) using the Hi-Seq 4000 sequencing platform to isolate female-specific DNA fragments. We identified 5967 bp and 6532 bp fragments using genome walking. Three female-specific markers designed from these two fragments were confirmed to separate the sexes of Pelodiscus sinensis perfectly. One of the female-specific markers showed dosage association in female and male individuals. Individuals from different populations (n = 296) were used to validate that the female-specific markers could identify the genetic sex of Pelodiscus sinensis with 100% accuracy. The results of the present study demonstrated that RAD-seq was useful to develop sex-related markers in animals, and verified that the sex determination system of Pelodiscus sinensis belonged to the ZZ/ZW heterogametic system. Importantly, the developed markers could lead to a method for sex-controlled breeding in the Chinese soft-shelled turtle.


Introduction
An animal's genetic sex influences its economic traits, especially in animals displaying significant sexual dimorphism. The Chinese soft-shelled turtle, Pelodiscus sinensis, is widely distributed in China, Russia, Japan, Thailand, Vietnam, and Korea [1,2]. It is an economically important aquatic species, and in China, it is the most common turtle species used for food and medicine or sold as a pet [3]. Aquaculture practices for the Chinese soft-shelled turtle have revealed significant sexual dimorphism, in which the male individuals are characterized by a larger body size, faster growth, a thicker and wider calipash, and less body fat compared with female turtles. Therefore, juvenile male turtles are more popular in the market. This sexual dimorphism has encouraged research to produce a high-male rate or all-male juveniles using sex control approaches.
Sex-specific or sex chromosome-linked markers have significant applications for the genetic improvement of economically important traits in aquaculture species [4,5], especially for the uncommon growth differences between females and males [6]. Sex-specific or sex chromosome-linked markers make it possible to select superior individuals or exclude unwanted individuals at the DNA level, and can be used to increase the efficiency and precision of selecting individuals that have desired genetic traits at an early stage [7]. In addition, sex-specific markers can reveal whether a species has genetic sex determination (GSD) with either male or female heterogamety [8]. Random amplified polymorphic DNA fingerprinting (RAPD), microsatellite markers (SSR), amplified fragment length polymorphism (AFLP), or restriction site-associated DNA sequencing (RAD-seq) have been successfully used to obtain sex-specific markers of more than fifteen economically important aquatic species [9][10][11][12][13][14].
Sex determination in the Chinese soft-shelled turtle has been described using the ZZ/ZW heterogametic system [15,16]. However, to date, no sex-specific or sex chromosome-linked markers have been developed, which has seriously limited the development of gender control technology in this species. Therefore, the aim of the present study was to develop sex-specific genetic markers using RAD-seq, and to use the developed markers for genetic sex identification in Chinese soft-shelled turtles from different locations.

Animals and DNA Extraction
Ten male and ten female farmed Chinese soft-shelled turtles (named PSM1 to PSM10, and PSF1 to PSF10) were collected from Anhui Xijia Agricultural Development Co. Ltd (Bengbu, China). The calipash tissues were sampled, immersed in absolute ethanol, and stored at −20 • C. Genomic DNA was isolated from the alcohol-preserved calipash tissues using a DNA extraction kit (Takara, Shiga, Japan), according to the manufacturer's protocol. The DNA quality and concentration were evaluated using 1.0% agarose electrophoresis, measured using a NanoDrop 2000 spectrophotometer (Thermo scientific, Waltham, MA, USA), and then stored at −20 • C. All procedures using the experimental turtles conformed to the standards for Animal Care of the Yangtze River Fisheries Research Institute, Chinese Academy of Fishery Sciences (Wuhan, China).

Restriction Site-Associated DNA Sequencing
RAD-seq-reduced representation libraries for each sample were constructed following a previously described protocol, with some modifications [17]. Briefly, the high-fidelity restriction enzyme EcoRI (specific recognition sequence: G|AATTC, New England Biolabs, Ipswich, MA, USA) was used to digest the genomic DNA (about 1 µg) from each sample at 65 • C for 20 min. Individually barcoded P1 adapters were ligated onto the EcoRI cut site in each sample. A total of 20 samples were constructed (PSF1 to PSF10, and PSM1 to PSM10). Then, 150 bp paired-end (PE) sequencing libraries were constructed for the 20 samples. Each individual sample was sequenced in different lanes using the IIIumina Hiseq 4000 platform (IIIumina, San Diego, CA, USA). The RAD-seq library construction and sequencing were performed by the Gene Denovo Biotechnology Co. (Guangzhou, China).

RAD-seq Data Analysis
To improve the quality of the raw sequence reads, the Illumina sequencing reads were trimmed, and low-quality bases were removed before the analysis. High-quality clean reads were obtained using the following three steps: First, reads with an unidentified nucleotides (N) percentage >10% were removed; second, low-quality reads with >50% of bases having a phred quality score of ≤20 were discarded; and third, reads with only barcode adapters were removed. The high-quality reads were then used for further analysis. The filtered reads of each female and male individual were mapped to the Chinese soft-shelled turtle reference genome sequence (PelSin_1.0, https://www.ncbi.nlm.nih.gov/ assembly/GCF_000230535.1) using the Burrows-Wheeler Aligner (BWA) software with the parameter "mem 4-k32-M", where -k is the minimum seed length, and -M is an option used to mark shorter split alignment hits as secondary alignments [18]. Based on the alignment result, the common reads between the PSF and PSM samples were removed. The sex determination of the Chinese soft-shelled turtle has been described as a ZZ/ZW heterogametic system; therefore, the collected scaffolds were filtered to identify those with >100 total reads in all ten female samples, but <10 in total reads in all ten Genes 2019, 10, 302 3 of 11 male samples. Finally, the scaffold sequences satisfying the above criteria were selected as candidate female-specific regions.

Sex-Specific Markers Identification
Based on the different candidate regions, the top 10 different sequences were used to design PCR primers for putative sex-specific marker region identification, using Primer Premier 5.0 software (http://www.premierbiosoft.com/primerdesign/) ( Table 1). The PCR reaction volume of 25 µL contained 2.5 µL of 10× PCR buffer mix (TsingKe, Beijing, China), 0.5 µL of 10 µM of each primer, 0.5 µL of 5 U/µL Taq DNA polymerase (Takara, Japan), 1 µL of 50 ng/µL DNA, and 20 µL of ultrapure water. The PCR amplification conditions were as follows: Pre-denaturation at 94 • C for 5 min; denaturation at 94 • C for 30 s, 30 s at the denaturation temperature of the different primers, and 45 s at 72 • C, for 32 cycles; and a final extension at 72 • C for 4 min. The amplification products were then detected using a 1.5% agarose gel. In addition, for the specific sequences that displayed different results in male and female populations, but without remarkably different agarose gel bands, the genome walking method was performed using a Universal GenomeWalker 2.0 kit (Clontech, Mountain View, CA, USA) to obtain their longer sequences to design primers. Then, the GenomeWalker PCR products were sequenced, aligned, and assembled using the software DNAMAN (Lynnon Biosoft, San Ramon, CA, USA). Thereafter, the sex-specific regions were used to design sex-specific markers.

Validation of Sex-Specific Markers
To validate the putative sex-specific primers, two farmed populations of the Chinese soft-shelled turtle were collected. One population, comprising 48 female and 48 male individuals, was sampled from Xiantao, Hubei province, and the other, comprising 100 female and 100 male individuals, came from Hanshou, Hunan province. All 296 turtles were adults; therefore, their morphology was macroscopically observed or they were dissected to detect the gonad tissue to determine their sex. Genomic DNA was extracted from the female and male calipash tissues, and PCR was used to amplify the sex difference region to detect their genetic sex. Finally, their sex was cross checked with that detected using the gonad tissue.

Restriction Site-Associated DNA Sequencing
The detailed workflow of the sex-specific marker development based on RAD-seq is shown in Figure 1. The RAD-seq generated 26.28 Gb of raw data from the 10 individual female libraries and 23.41 Gb of raw data from the 10 individual male libraries. The average raw data per turtle was 2.63 Gb for females and 2.34 Gb for males. The Q20 high-quality score for the data ranged from 90.72% to 91.20% and the Q30 score ranged from 90.71% to 92.68% in males, while Q20 ranged from 95.96% to 96.86% and the Q30 score ranged from 91.17% to 92.70% in females. The GC content ranged from 43.67% to 44.10% in male individuals and from 43.77% to 44.17% in female individuals. There were 200,108,964 raw reads for the male samples and 179,268,916 for the female samples. After filtering, 193,656,716 and 173,982,866 high-quality reads were obtained from male and female individuals, respectively. The details are shown in Table 2. The RAD-sequencing result is available via the NCBI database with the SRA accession number PRJNA530350.
To validate the putative sex-specific primers, two farmed populations of the Chinese soft-shelled turtle were collected. One population, comprising 48 female and 48 male individuals, was sampled from Xiantao, Hubei province, and the other, comprising 100 female and 100 male individuals, came from Hanshou, Hunan province. All 296 turtles were adults; therefore, their morphology was macroscopically observed or they were dissected to detect the gonad tissue to determine their sex. Genomic DNA was extracted from the female and male calipash tissues, and PCR was used to amplify the sex difference region to detect their genetic sex. Finally, their sex was cross checked with that detected using the gonad tissue.

Restriction Site-Associated DNA Sequencing
The detailed workflow of the sex-specific marker development based on RAD-seq is shown in Figure 1. The RAD-seq generated 26.28 Gb of raw data from the 10 individual female libraries and 23.41 Gb of raw data from the 10 individual male libraries. The average raw data per turtle was 2.63 Gb for females and 2.34 Gb for males. The Q20 high-quality score for the data ranged from 90.72% to 91.20% and the Q30 score ranged from 90.71% to 92.68% in males, while Q20 ranged from 95.96% to 96.86% and the Q30 score ranged from 91.17% to 92.70% in females. The GC content ranged from 43.67% to 44.10% in male individuals and from 43.77% to 44.17% in female individuals. There were 200,108,964 raw reads for the male samples and 179,268,916 for the female samples. After filtering, 193,656,716 and 173,982,866 high-quality reads were obtained from male and female individuals, respectively. The details are shown in Table 2. The RAD-sequencing result is available via the NCBI database with the SRA accession number PRJNA530350.    Note: RAD-seq, restriction site-associated DNA sequencing (RAD-seq); Q20, quality scores of ≥20; Q30, quality scores of ≥30.

Identification of Sex Specific Regions and Molecular Marker Design
Each read from the PSF and PSM individuals was compared with the reference genome sequence of the Chinese soft-shelled turtle (PelSin_1.0) at the NCBI. The alignment results are shown in Table 3. The alignment results identified 19,904 scaffold sequences that could be subjected to further analysis. Among them, 120 scaffolds (0.6%, 120/19,904) were extracted according to the criteria of total reads >100 in all ten female samples, but total reads <10 in all ten male samples. To validate the candidate female-specific markers, the top 10 scaffold sequences based on read number were identified as candidate sequences, in which the selectivity was as high as 0.05% to identify the sex-specific regions.

Identification of Candidate Sex-Difference Regions
Based on the above scaffolds alignment, the top 10 candidate sex-difference sequences were used to design primers (Table 1). After the PCR amplification and agarose gel detection, different amplification products were demonstrated to exist in male and female individuals for sequences 4085 and 3137 using primers ps4085 and ps3137 ( Figure 2). However, these bands only displayed differences in band intensity between female and male individuals, which were not similar, as expected.

Identification of Candidate Sex-Difference Regions
Based on the above scaffolds alignment, the top 10 candidate sex-difference sequences were used to design primers (Table 1). After the PCR amplification and agarose gel detection, different amplification products were demonstrated to exist in male and female individuals for sequences 4085 and 3137 using primers ps4085 and ps3137 ( Figure 2). However, these bands only displayed differences in band intensity between female and male individuals, which were not similar, as expected.

Genome Walking
Although the agarose gel detection displayed different types of bands for the sequences of loci 4085 and 3137, their specific sequences were only about 100 bp, which were too short to design effective primers for PCR amplification. Thus, genome walking was performed to obtain longer sequences based on the known sequence information of 4085 and 3137. After eight rounds of genome walking amplification, four from walking to the right and four from walking to the left, longer sequences were amplified according to genomic fragments from the pool of female and male samples. Furthermore, each PCR-amplified product was sequenced 10 times to ensure their accuracy. Finally, a sequence of 5967 bp was obtained for locus 4085, while a sequence of 6532 bp was obtained for locus 3137, based on the PCR fragments from the female pools. Subsequently, the sex-specific markers ps4085s, ps3137s1, and ps3137s2 were designed (Table 1).

Identification of Sex-Specific Markers
Based on the genome walking results, we detected the sex-specific markers ps4085s, ps3137s1, and ps3137s2, which are located in loci 4085 and 3137. The three specific regions obtained by ps4085s, ps3137s1, and ps3137s2 were 1434 bp, 1313 bp, and 864 bp, and their GC contents were 56.45%, 51.33%, and 64.47%, respectively. The PCR amplification products were detected using 1.5% agarose gel electrophoresis. For marker ps4085s, the agarose gel only displayed application products (1434 bp) in the female individuals, and no bands were amplified from the male individuals ( Figure 3). This indicated that ps4085s could be a sex-specific marker for sex identification in Chinese soft-shelled turtles.
For locus 3137, we observed deletion fragments in male individuals compared with female individuals after completing the genome walking sequencing. Based on the 6532 bp sequence of locus 3137, two pairs of primers (ps3137s1 and ps3137s2) located at non-overlapping regions were designed to detect the specific region. The application product (864 bp) using primer pair ps3137s2 identified only one clear band in the female individuals, but no bands were amplified from the male individuals ( Figure 4). Interestingly, the amplification products (1313 bp) for marker ps3137s1 were observed in

Genome Walking
Although the agarose gel detection displayed different types of bands for the sequences of loci 4085 and 3137, their specific sequences were only about 100 bp, which were too short to design effective primers for PCR amplification. Thus, genome walking was performed to obtain longer sequences based on the known sequence information of 4085 and 3137. After eight rounds of genome walking amplification, four from walking to the right and four from walking to the left, longer sequences were amplified according to genomic fragments from the pool of female and male samples. Furthermore, each PCR-amplified product was sequenced 10 times to ensure their accuracy. Finally, a sequence of 5967 bp was obtained for locus 4085, while a sequence of 6532 bp was obtained for locus 3137, based on the PCR fragments from the female pools. Subsequently, the sex-specific markers ps4085s, ps3137s1, and ps3137s2 were designed (Table 1).

Identification of Sex-Specific Markers
Based on the genome walking results, we detected the sex-specific markers ps4085s, ps3137s1, and ps3137s2, which are located in loci 4085 and 3137. The three specific regions obtained by ps4085s, ps3137s1, and ps3137s2 were 1434 bp, 1313 bp, and 864 bp, and their GC contents were 56.45%, 51.33%, and 64.47%, respectively. The PCR amplification products were detected using 1.5% agarose gel electrophoresis. For marker ps4085s, the agarose gel only displayed application products (1434 bp) in the female individuals, and no bands were amplified from the male individuals ( Figure 3). This indicated that ps4085s could be a sex-specific marker for sex identification in Chinese soft-shelled turtles.
For locus 3137, we observed deletion fragments in male individuals compared with female individuals after completing the genome walking sequencing. Based on the 6532 bp sequence of locus 3137, two pairs of primers (ps3137s1 and ps3137s2) located at non-overlapping regions were designed to detect the specific region. The application product (864 bp) using primer pair ps3137s2 identified only one clear band in the female individuals, but no bands were amplified from the male individuals ( Figure 4). Interestingly, the amplification products (1313 bp) for marker ps3137s1 were observed in both female and male populations. However, the products exhibited a dosage effect between females and males, in which the amount of products in female Chinese soft-shelled turtles was higher than that in the males. To characterize and optimize the dosage difference between the female and male individuals, we used four PCR amplification reactions comprising 32, 27, 24, and 22 cycles. The results showed that the amount of PCR product decreased as the number of amplification cycles was reduced ( Figure 5). The agarose gel showed that in the female individuals, the PCR product bands became very faint, whereas the bands in the male individuals disappeared when using less than 22 cycles. Thus, the optimal amplification conditions to distinguish females from males were set at 24 cycles.
Genes 2019, 10, x FOR PEER REVIEW 7 of 11 both female and male populations. However, the products exhibited a dosage effect between females and males, in which the amount of products in female Chinese soft-shelled turtles was higher than that in the males. To characterize and optimize the dosage difference between the female and male individuals, we used four PCR amplification reactions comprising 32, 27, 24, and 22 cycles. The results showed that the amount of PCR product decreased as the number of amplification cycles was reduced ( Figure 5). The agarose gel showed that in the female individuals, the PCR product bands became very faint, whereas the bands in the male individuals disappeared when using less than 22 cycles. Thus, the optimal amplification conditions to distinguish females from males were set at 24 cycles.   both female and male populations. However, the products exhibited a dosage effect between females and males, in which the amount of products in female Chinese soft-shelled turtles was higher than that in the males. To characterize and optimize the dosage difference between the female and male individuals, we used four PCR amplification reactions comprising 32, 27, 24, and 22 cycles. The results showed that the amount of PCR product decreased as the number of amplification cycles was reduced ( Figure 5). The agarose gel showed that in the female individuals, the PCR product bands became very faint, whereas the bands in the male individuals disappeared when using less than 22 cycles. Thus, the optimal amplification conditions to distinguish females from males were set at 24 cycles.

Validation of the Developed Markers
To further verify the universality of the developed sex-specific markers ps4085 and ps3137, two farmed populations of the Chinese soft-shelled turtles were collected from Xiantao (Hubei province)

Validation of the Developed Markers
To further verify the universality of the developed sex-specific markers ps4085 and ps3137, two farmed populations of the Chinese soft-shelled turtles were collected from Xiantao (Hubei province) and Hanshou (Hunan province), respectively. Pictures of macroscopic morphology observation and gonad tissue are shown in Figure 6

Validation of the Developed Markers
To further verify the universality of the developed sex-specific markers ps4085 and ps3137, two farmed populations of the Chinese soft-shelled turtles were collected from Xiantao (Hubei province) and Hanshou (Hunan province), respectively. Pictures of macroscopic morphology observation and gonad tissue are shown in Figure 6   As expected, the fragment sizes in these individuals were identical to those in the farmed population of Anhui Province, in that the PCR amplification products were only observed in the female individuals, and no bands were found in the male individuals for ps4085 and ps3137. The identification results of gonadal tissue and sex-specific markers are presented in Table 4. The results As expected, the fragment sizes in these individuals were identical to those in the farmed population of Anhui Province, in that the PCR amplification products were only observed in the female individuals, and no bands were found in the male individuals for ps4085 and ps3137. The identification results of gonadal tissue and sex-specific markers are presented in Table 4. The results indicated that these three sex-specific markers could identify the genetic sex of Chinese soft-shelled turtles.

Discussion
To study sex differentiation and sex control, sex-specific DNA markers are required. Sex-specific marker development is also crucial for genetic breeding. Sexual dimorphism exists widely in animals, especially in the lower vertebrates, including fish, reptiles, and amphibians. Sex-specific genetic markers are often required to study sex-associated phenomena, such as significant sexual dimorphisms of growth rate and body size [19,20]. The development of next generation sequencing (NGS) technology has led to the emergence of restriction site-associated DNA sequencing (RAD-seq), in which sequences are obtained from the DNA flanking specific restriction sites throughout the whole genome, which can not only detect a huge number of genetic polymorphisms at a low cost, but also produces mass genetic markers, even if no genome information is available for a species [17,21]. Previously, most studies employed to identify sex-specific markers needed to construct linkage maps from test-crosses [22,23]. However, in many species, it is not feasible to generate test-crosses because they may have a long generation time, a small number of offspring, or are not easy to breed in captivity [24,25]. However, RAD-seq does not require test crossing, and thus is considered as a particularly powerful tool to explore genetic variation and identify sex-specific DNA-based (or molecular) genetic markers in 'non-model' species [12,14,21,26] In the present study, we isolated two female-specific DNA fragments using RAD-seq and subsequently identified a 5967 bp fragment from locus ps4085 and a 6532 bp fragment from locus ps3137 from the female-derived sequences using genomic walking. Finally, primer pairs were developed for three female-specific markers ps4085, ps3137s1, and ps3137s2, which were successfully validated as effective and stable sex-specific markers to identify the genetic sex of Chinese soft-shelled turtles. Among them, the female-specific markers ps4085 and ps3137s2 were only detected in female individuals. Male sex markers would be indicative of an XY system, whereas female sex markers would indicate a ZW system [26]. In addition, the identified female-related markers were heterozygous in female individuals and homozygous in male individuals, which suggested a ZW/ZZ female heterogametic sex determination system. Therefore, these female-specific markers confirmed the previous conclusion that the sex determination system of the Chinese soft-shelled turtle belongs to the ZZ/ZW heterogametic system [15,16].
Interestingly, the female-related marker ps3137s1 amplification products were detected in both male and female individuals; however, a dosage difference between females and males under different PCR amplification cycles was observed. Higher levels of the ps3137s1 amplification product were produced from female samples than from male samples. A female-related marker was discovered to produce higher dosage association in females than in males of Pseudobagrus ussuriensis, in which the sex determination was a male heterogametic XX/XY system. And the dosage effect was explained by the presence of two X chromosomes in females and one X chromosome in males [13]. In the present study, the female-specific marker primer pair ps3137s1F/ps3137s1R produced higher levels of amplification products in heterogametic ZW female individuals. Genomic variations, including deletions, insertions, inversions, and duplications, represent an important source of genetic diversity in organisms [27,28]. In addition, gene dosage effects are usually caused by sequence fragment deletions, duplications, or loss-of-function mutations [29]. In fact, when we sequenced the 6532 bp fragment of locus ps3137 in female and male individuals, we observed deletion fragments in the male individuals compared with those in the female individuals. Therefore, we speculated that marker ps3137s1 was located on a W sex chromosome, and not on an autosome. Thus, the dosage effect was probably caused by a difference in gene copy number caused by the identified deletions.
In summary, we developed three female-specific DNA markers in the Chinese soft-shelled turtle using RAD-seq. These female-specific DNA markers could identify the genetic sex of Chinese soft-shelled turtle populations with a 100% accuracy. The developed markers will enable studies of sex determination and sex control breeding in Chinese soft-shelled turtles at all life stages.