Isolation and Characterization of 46 Novel Polymorphic EST-Simple Sequence Repeats (SSR) Markers in Two Sinipercine Fishes (Siniperca) and Cross-Species Amplification

With the development of next-generation sequencing technologies, transcriptome-level sequence collections are emerging as prominent resources for the discovery of gene-based molecular markers. In this study, we describe the isolation and characterization of 46 novel polymorphic microsatellite loci for Siniperca chuatsi and Siniperca scherzeri from the transcriptome of their F1 interspecies hybrids. Forty-three of these loci were polymorphic in S. chuatsi, and 20 were polymorphic in S. scherzeri. In S. chuatsi, the number of alleles per locus ranged from 2 to 8, and the observed and expected heterozygosities varied from 0.13 to 1.00 and from 0.33 to 0.85, respectively. In S. scherzeri, the number of alleles per locus ranged from 3 to 9, and the observed and expected heterozygosities varied from 0.19 to 1.00 and from 0.28 to 0.88, respectively. We also evaluated cross-amplification of the 46 polymorphic loci in four other species of sinipercine fishes: Siniperca kneri, Siniperca undulata, Siniperca obscura, and Coreoperca whiteheadi. The interspecies cross-amplification rate was very high: 94% of the 184 locus/taxon combinations tested amplified successfully. These markers will be a valuable resource for population genetic studies in sinipercine fishes.


Introduction
Mandarin fish (Siniperca chuatsi), an economically important species in China, has a relatively high market value and is widely cultured throughout the country [1,2]. It grows fast but is susceptible to disease. By comparison, golden mandarin fish (Siniperca scherzeri) is more disease resistant but grows slowly. Recently, outbreaks of diseases caused by parasites, bacteria and viruses have caused severe economic losses to the aquaculture industry [3]. In addition, because of overfishing, drought and especially water pollution, the wild stock of S. chuatsi is declining [4]. Therefore, breeding a disease-resistant, faster-growing strain and preserving fish germplasm have become urgent aims in China.
Microsatellites, or simple sequence repeats (SSRs), have become a useful tool for assessing genetic diversity and developing molecular breeding techniques in fish because of their co-dominance, ubiquitous distribution within genomes, high reproducibility, and transferability across species [5,6]. However, the development of microsatellite markers has been limited by the labor and time required to construct, enrich, and sequence genomic libraries [7]. With the advent of next-generation sequencing technologies, transcriptome sequencing is emerging as a rapid and efficient means for gene discovery and genetic marker development. Because EST-SSRs derived from a transcriptome lie in the transcribed region of the genome, they can support the development of gene-based maps, which help to identify candidate functional genes and increase the efficiency of marker-assisted selection (MAS) [8]. Furthermore, EST-SSRs show a higher level of transferability to closely related species than non-EST-SSRs [9].
Although a few microsatellite markers have been developed for S. chuatsi [10-14] and S. scherzeri [15], the number of available SSRs is inadequate for genetic and mapping studies. Here, we describe the isolation and characterization of 46 novel polymorphic microsatellite loci for S. chuatsi and S. scherzeri. We also test the transferability of these markers in four other species of sinipercine fishes: Siniperca kneri, Siniperca undulata, Siniperca obscura, and Coreoperca whiteheadi.

Results and Discussion
As shown in Table 1, a total of 46 polymorphic EST-SSR markers were newly developed. Forty-three of these loci were polymorphic in S. chuatsi, and 20 were polymorphic in S. scherzeri. In S. chuatsi, the number of alleles per locus ranged from 2 to 8, with an average of 4.3 alleles per locus. The observed (H_O) and expected (H_E) heterozygosities ranged from 0.13 to 1.00 (average of 0.55) and from 0.33 to 0.85 (average of 0.63), respectively. In S. scherzeri, the number of alleles per locus ranged from 3 to 9, with an average of 5.5 alleles per locus. The observed (H_O) and expected (H_E) heterozygosities ranged from 0.19 to 1.00 (average of 0.74) and from 0.28 to 0.88 (average of 0.72), respectively. Five loci (Sin134 in S. chuatsi; Sin118, Sin122, Sin158 and Sin159 in S. scherzeri) showed significant deviation from Hardy-Weinberg equilibrium (HWE) after Bonferroni correction (adjusted p-value = 0.0012 for S. chuatsi and 0.0026 for S. scherzeri), which may be due to the small sample size (n = 32) or an excess of heterozygotes. Another possible explanation for the departure from HWE is the dramatic contemporary decline in spawning populations, with consequent non-random mating and genetic bottlenecks [14]. No evidence of allelic dropout was found at these loci.
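To make the per-locus statistics above concrete, the following is a minimal sketch of how the number of alleles, H_O and H_E can be computed from diploid genotypes. It is not the POPGENE implementation: the function name `heterozygosity`, the example genotypes and the use of the uncorrected expectation 1 - sum(p_i^2) are assumptions made for illustration.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Na, H_O, H_E) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    H_E is the simple expectation 1 - sum(p_i^2); population-genetics software
    may additionally report a sample-size-corrected estimate.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies are counted over the 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    h_exp = 1.0 - sum(p * p for p in freqs)
    return len(counts), h_obs, h_exp


# Hypothetical genotypes (allele sizes in bp) for four individuals.
example = [(150, 150), (150, 154), (154, 158), (150, 158)]
na, h_o, h_e = heterozygosity(example)
```

For the hypothetical genotypes, three of the four individuals are heterozygous, so H_O exceeds H_E; a large excess of this kind across many individuals is one pattern that can produce a deviation from HWE.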
No significant linkage disequilibrium (LD) was detected between any pair of loci following Bonferroni correction (adjusted p-value = 0.0001 for S. chuatsi and 0.0003 for S. scherzeri).
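One way to read the adjusted LD thresholds is as a nominal significance level divided by the number of pairwise locus comparisons. The sketch below assumes a nominal alpha of 0.05 and that every pair of loci polymorphic in a species was tested; neither assumption is stated explicitly in the text, and the function names are illustrative.

```python
from math import comb


def pairwise_tests(n_loci):
    """Number of distinct locus pairs among n_loci loci."""
    return comb(n_loci, 2)


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold; alpha = 0.05 is an assumption."""
    return alpha / n_tests


# Polymorphic locus counts per species are taken from the text.
thresholds = {}
for species, n_loci in (("S. chuatsi", 43), ("S. scherzeri", 20)):
    k = pairwise_tests(n_loci)
    thresholds[species] = (k, round(bonferroni_threshold(k), 4))
```

Under these assumptions, 43 loci give 903 pairs and 20 loci give 190 pairs, and the rounded thresholds agree with the values reported above.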
Overall, a high level of cross-species amplification was observed across the four species (Table 2). Forty-five of the 46 polymorphic loci (97.8%) were amplified successfully in S. undulata and S. obscura, 44 (95.7%) in S. kneri, and 39 (84.8%) in C. whiteheadi. These results were expected given the taxonomic relationships among these species [16]. S. kneri, S. undulata and S. obscura are closely related to S. chuatsi and S. scherzeri, and all five species belong to Siniperca, whereas C. whiteheadi belongs to Coreoperca, the sister genus of Siniperca. Because transcribed sequences are typically more conserved than nontranscribed regions, SSRs residing in transcribed sequences typically show higher amplification rates and higher levels of cross-species transferability [17,18]. The high level of cross-species amplification observed here indicates not only the potential utility of transcriptome sequences for identifying and characterizing large numbers of gene-based SSR loci in species with limited marker resources, but also the potential usefulness of the developed markers for a broader range of evolutionary, conservation and management studies in sinipercine fishes.
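The per-species and overall amplification rates follow directly from the counts reported above. A short worked calculation, using only those counts (the variable names are illustrative):

```python
# Loci amplified per species, out of the 46 polymorphic loci tested.
amplified = {
    "S. kneri": 44,
    "S. undulata": 45,
    "S. obscura": 45,
    "C. whiteheadi": 39,
}
N_LOCI = 46

# Per-species success rate in percent, rounded to one decimal place.
per_species = {sp: round(100 * n / N_LOCI, 1) for sp, n in amplified.items()}

# Overall rate across all locus/taxon combinations.
total_success = sum(amplified.values())
combinations = N_LOCI * len(amplified)
overall = round(100 * total_success / combinations, 1)
```

This gives 173 successful amplifications out of 184 locus/taxon combinations, which matches the overall rate of 94% quoted in the abstract.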

Experimental Section
De novo transcriptome sequencing of F1 hybrids between S. chuatsi (♀) and S. scherzeri (♂) was performed, and a total of 118,218 unigenes were identified. Library preparation for transcriptome analysis and sequence assembly were carried out as described in [19]. This unigene set was mined for EST-SSR markers using the default parameters of the BatchPrimer3 v1.0 software [20]. In this study, a subset of 62 EST-SSR markers was screened on 32 S. chuatsi individuals (Chibi, Hubei Province, China) and 32 S. scherzeri individuals (Fengcheng, Liaoning Province, China). The primers for these SSR loci were designed using NCBI/Primer-BLAST [21].
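The SSR mining step can be illustrated with a small regular-expression scan over a unigene sequence. This is only a sketch of the general idea, not the BatchPrimer3 algorithm: the minimum repeat counts in `MIN_REPEATS`, the function name `find_ssrs` and the example sequence are hypothetical and do not reproduce the software's default parameters.

```python
import re

# Hypothetical minimum repeat counts per motif length (di-, tri-, tetranucleotide).
# These are NOT the BatchPrimer3 defaults.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (motif, repeat_count, start) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            if motif_len == 4 and motif[:2] == motif[2:]:
                continue  # skip tetramers that are really dinucleotide repeats
            hits.append((motif, len(m.group(0)) // motif_len, m.start()))
    return hits


# Hypothetical sequence with an (AC)8 and a (CAG)5 repeat.
example_seq = "GGATT" + "AC" * 8 + "TTGCA" + "CAG" * 5 + "GT"
ssrs = find_ssrs(example_seq)
```

In a real pipeline the flanking sequence around each hit would then be passed to a primer design tool, as was done here with NCBI/Primer-BLAST.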
Total genomic DNA was extracted from fin clips using the TIANamp Genomic DNA Kit (Tiangen) following the manufacturer's instructions. Polymerase chain reaction (PCR) conditions were optimized for each primer pair. PCRs were performed in 25 µL reaction volumes containing 2.5 µL of 10× PCR buffer, 1.0-3.0 mM MgCl2, 50 µM dNTPs, 0.4 µM of each primer, 1 U Taq polymerase (Takara) and 50 ng genomic DNA. PCR conditions were as follows: initial denaturation at 94 °C for 3 min, followed by 35 cycles of 94 °C for 30 s, the optimized annealing temperature (Table 1) for 30 s and 72 °C for 30 s, and then a final extension at 72 °C for 10 min. PCR products were separated by electrophoresis on an 8% non-denaturing polyacrylamide gel and visualized by silver staining. A denatured pBR322 DNA/MspI molecular weight marker (Tiangen) was used as a size standard to identify alleles. The number of alleles (Na) and the observed (H_O) and expected (H_E) heterozygosities were estimated using POPGENE version 1.32 [22]. The polymorphic information content (PIC) was calculated using the formula $\mathrm{PIC} = 1 - \sum_{i=1}^{n} q_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 q_i^2 q_j^2$, where n is the number of alleles, and q_i and q_j are the frequencies of the ith and jth alleles, respectively [23]. Deviations from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) were tested using the online version of GENEPOP [24]. All results were adjusted for multiple simultaneous comparisons using a sequential Bonferroni correction [25]. Genotyping errors due to null alleles, stutter bands, or allele dropout were analyzed using the software Micro-checker 2.2.3 [26].
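The PIC calculation can be sketched in a few lines. The code below assumes the widely used definition PIC = 1 - sum(q_i^2) - sum over i < j of 2 q_i^2 q_j^2, which matches the symbols n, q_i and q_j defined above; the function name `pic` and the example frequencies are illustrative.

```python
def pic(freqs):
    """Polymorphic information content from allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    # Homozygosity term: sum of squared allele frequencies.
    homozygosity = sum(q * q for q in freqs)
    # Correction term over all unordered allele pairs (i < j).
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross


# Hypothetical allele frequencies.
pic_two_equal = pic([0.5, 0.5])
pic_three = pic([0.5, 0.25, 0.25])
```

For two equally frequent alleles the function returns 0.375, the maximum PIC for a biallelic locus, which is a convenient sanity check on any implementation.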
Cross-species amplification of the polymorphic SSR loci developed above was tested in four species of sinipercine fishes: S. kneri, S. undulata, S. obscura, and C. whiteheadi. Two individuals of each species were analyzed. The PCR conditions were the same as described above, except that the annealing temperature was re-optimized for each locus (Table 2). Amplification products were visualized on 1.5% agarose gels, and fragments were sized by comparison with a 2 kb DNA Marker (Trans). Primer pairs that amplified fragments similar in size to those observed in the source species were scored as successful cross-species amplifications.

Conclusions
In summary, a total of 46 polymorphic EST-SSR markers were newly developed. Forty-three of these loci were polymorphic in S. chuatsi, and 20 were polymorphic in S. scherzeri. We tested only a small subset of the SSR loci identified in our transcriptome, but the high levels of polymorphism and cross-species amplification indicate that the primer pairs described here may be suitable for assessing genetic diversity and population structure, constructing high-density linkage maps, and supporting conservation and marker-assisted breeding in many species of sinipercine fishes. Our results highlight the value of next-generation transcriptome resources for the characterization and development of gene-based SSRs.

Table 1.
For each locus, the information in the top row refers to S. chuatsi and the second row refers to S. scherzeri. Ta corresponds to annealing temperature; Na is the number of alleles; H_O and H_E are observed and expected heterozygosity, respectively; PIC is the polymorphic information content. * indicates significant deviation from HWE after Bonferroni correction; no polymorphism for a locus is denoted by "-".

Table 2.
Cross-species amplification for the 46 polymorphic EST-SSR markers in four species of sinipercine fishes. The annealing temperature for each locus is shown. Unsuccessful amplification of PCR products for a locus is denoted by "-".