Markers in Coreoperca whiteheadi Boulenger and Cross-Species Amplification

We described and characterized 11 expressed sequence tag (EST)-derived simple sequence repeats (SSRs) and seven genomic (G)-derived SSRs in Coreoperca whiteheadi Boulenger. The EST-SSRs comprised 62.2% di-nucleotide repeats, 32.2% tri-nucleotide repeats and 5.5% tetra-nucleotide repeats, whereas the majority of the G-SSRs were tri-nucleotide repeats (81.4%). The number of alleles for the 18 loci ranged from 3 to 6, with a mean of 3.8 alleles per locus. The observed (Ho) and expected (He) heterozygosity values ranged from 0.375 to 1.000 and from 0.477 to 0.757, respectively. The polymorphic information content (PIC) values ranged from 0.466 to 0.706. The mean number of alleles, Ho, He, and PIC of the EST-SSRs were higher than those of the G-SSRs. Four microsatellite loci deviated significantly from Hardy-Weinberg equilibrium (HWE) after Bonferroni correction, and no significant linkage disequilibrium (LD) was observed. These loci are the first to be characterized in C. whiteheadi and should be useful for genetic evaluation in support of conservation. Compared with 11 loci in C. whiteheadi, 37 potentially polymorphic EST-SSRs were found in Siniperca chuatsi (Basilewsky), which will provide a valuable tool for mapping studies and molecular breeding programs in S. chuatsi.


Introduction
Coreoperca whiteheadi Boulenger, one of the lower percoid fishes, is found in south China and the Red River of North Vietnam [1,2]. Due to anthropogenic disturbances, such as over-exploitation and environmental pollution, the wild resource of C. whiteheadi has markedly declined in recent years [3]. Because of its dire conservation status, the genetic conservation of C. whiteheadi is becoming essential for the sustainable management of natural resources and for increasing the production of this species. Hence, robust genetic markers are needed to provide information on the population dynamics of this species, including genetic connectivity, in order to inform conservation efforts.
Microsatellites, or simple sequence repeats (SSRs), have become a useful marker system in population genetic analysis, genetic mapping and marker-assisted selection (MAS) of many fish species because of their co-dominant nature, high allelic polymorphism, high reproducibility and transferability across species [4,5]. However, widespread use of these markers is often limited by the time and cost involved in their development [6]. The recent development of library enrichment techniques and automated sequencing has made production of these markers simple, rapid, and cost effective [7,8]. SSR markers can be classified into genomic SSRs and EST-SSRs [9]. However, until now, no microsatellite markers have been reported for C. whiteheadi.
In this study, we characterized 11 EST-SSRs and seven G-SSRs as a tool to support genetic conservation and breeding programs in C. whiteheadi. In addition, 37 potentially polymorphic EST-SSRs were found in a sample of 12 wild Siniperca chuatsi (Basilewsky) individuals. A total of 122 SSRs from ESTs in our transcriptome sequencing database and 49 from genomic sequences of enriched libraries were selected for microsatellite primer design and tested by PCR amplification in C. whiteheadi. All 122 microsatellites from ESTs were also tested in S. chuatsi.

Results and Discussion
A microsatellite-enriched library was constructed from the genomic DNA of C. whiteheadi. The steps in the construction of the enriched library and its characteristic features are summarized in Table 1. A total of 90 putative recombinant clones were picked from the enriched library, sequenced, and analyzed for the presence of SSRs. Sequence analysis revealed that 30 clones (33.33%) were redundant. Of the remaining 60 unique clones (66.67%), 49 (81.67% of the unique clones) were found to harbor SSR sequences (GenBank Accession numbers: JX449105-JX449153), and all 49 could be used for primer design. Sequence analysis of all the SSR-containing clones indicated that tri-nucleotide SSRs were more frequent (81.4%) than di-nucleotide SSRs (18.6%), and no tetra-, penta- or hexa-nucleotide SSRs were identified in the library. Among 49 tri-nucleotide SSRs, the CCT/GGA class of repeat motif was the most frequent (50% of total tri-nucleotide microsatellites), followed by the GAG/CTC class (31%).
In this study, we characterized a set of EST-SSR and G-SSR markers as a tool to support genetic conservation in C. whiteheadi. A total of 37 microsatellites from ESTs in our transcriptome and 31 from genomic sequences of enriched libraries were successfully selected for microsatellite primer design and tested by PCR amplification. Eleven EST-SSRs and seven G-SSRs were found to be polymorphic, and their performance in genetic analysis was evaluated using 32 individuals randomly selected from a wild population (Table 2). The number of alleles for the 18 loci ranged from 3 to 6, with a mean of 3.8 alleles per locus. The Ho and He values ranged from 0.375 to 1.000 and from 0.477 to 0.757, respectively. PIC values ranged from 0.466 to 0.706 (Table 2). The mean number of alleles, Ho, He, and PIC of the EST-SSRs were higher than those of the G-SSRs. Four microsatellite loci deviated significantly from HWE (p < 0.0029) after Bonferroni correction (Table 2), which may be due to the limited sample size, the sampling strategy and null alleles [10]. Analysis with MICROCHECKER [11] indicated that no significant LD was observed. Analysis of the nucleotide sequences of the 37 EST-SSRs and 31 G-SSRs showed that the EST-SSRs comprised 62.2% di-nucleotide repeats, 32.3% tri-nucleotide repeats, and 5.5% tetra-nucleotide repeats. In contrast, the G-SSRs were composed mainly of tri-nucleotide repeats (81.4%). Meanwhile, 37 potentially polymorphic EST-SSRs were found in a sample of 12 wild S. chuatsi individuals (Table 3). Among 92 successfully amplified EST-SSRs, 37 loci (40.2% of the designed primers) showed probable polymorphism in S. chuatsi, compared with 11 loci (12.0% of the designed primers) in C. whiteheadi. The number of alleles for the 37 loci in S. chuatsi ranged from 3 to 8, with a mean of 4.3 alleles per locus.
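The per-locus statistics reported above (number of alleles, Ho, He and PIC) can be illustrated with a minimal Python sketch. It is not the POPGENE implementation: it assumes the uncorrected expected heterozygosity, 1 - Σp_i², and the PIC definition commonly attributed to Botstein et al., and the function name and input layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summarize one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Assumptions (not taken from the paper): He is the uncorrected
    1 - sum(p_i^2); PIC = He - sum over allele pairs of 2 * p_i^2 * p_j^2.
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    # Polymorphic information content.
    pic = he - sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return {"Na": len(counts), "Ho": ho, "He": he, "PIC": pic}


# Toy example: four individuals, two alleles at equal frequency.
example = locus_stats([(1, 2), (1, 2), (1, 1), (2, 2)])
```

With small samples such as the 12 to 32 individuals used here, a sample-size-corrected estimator of He would give slightly larger values than this uncorrected form.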
The difference in cross-species amplification is readily explained: the transcriptome used in this study came from F1 interspecies hybrids between S. chuatsi and S. scherzeri. The 11 loci polymorphic in C. whiteheadi were all among the 37 potentially polymorphic EST-SSRs in S. chuatsi, and the mean number of alleles of the 37 loci was higher than that of the 11 loci (Table 3). Only a small subset of the EST-SSRs in our transcriptome was tested in this study, but the high levels of polymorphism in S. chuatsi indicate that the primer pairs described here may be suitable for assessments of genetic diversity, the construction of high-density linkage maps, and MAS in S. chuatsi, although further study is still required because of the limited sample size.
Table 3. Characteristics of 37 potential polymorphic EST-simple sequence repeats (SSR) loci in a sample of 12 S. chuatsi individuals.

Experimental Section
De novo transcriptome sequencing of F1 hybrids between S. chuatsi (♀) and S. scherzeri (♂) was performed and a total of 118,218 unigenes were identified. The processes of library preparation for transcriptome analysis and sequence assembly were as described in [12,13]. This unigene set was used for mining EST-SSR markers using the default parameters of the BatchPrimer3 v1.0 software [14]. The primers for these SSR loci were designed using NCBI/Primer-BLAST (Available online: http://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome; accessed on 18 June 2012). In this study, a subset of 92 EST-SSR markers was screened on 12 wild S. chuatsi and 32 wild C. whiteheadi individuals.
A microsatellite-enriched genomic library was constructed using the fast isolation by amplified fragment length polymorphism (AFLP) of sequences containing repeats (FIASCO) protocol [7]. High quality genomic DNA was fragmented using the restriction enzyme MseI. The fragmented DNA was ligated to specific adapters (5'-GACGATGAGTCCTGAG-3' and 5'-TACTCAGGACTCAT-3'). The polymerase chain reaction (PCR) products were size selected to preferentially obtain small fragments (300-1000 bp), which were hybridized to a single streptavidin-biotinylated oligonucleotide SSR complex, (CCT/GGA)15. The enriched DNA was cloned into the pGEM-T vector (Promega, Madison, WI, USA) and then transformed into competent DH5α cells (Promega, USA). White colonies were randomly picked from the primary transformation plates, and the isolated plasmid DNA was sequenced using an ABI 3730 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). SSR-containing clones were identified using the SSRHUNTER program, which was designed to find regions containing SSRs [15]. For all types of SSRs, a minimum length criterion of 12 bp was applied, and only perfect SSRs were considered. Primers flanking the SSRs were designed using the PRIMER PREMIER 5.0 program (PREMIER Biosoft International, Palo Alto, CA, USA). The positive clones were identified by PCR using MseI-N primers and M13 primers. Of the 90 colonies, 60 were sequenced using an ABI 3730 Genetic Analyzer (Applied Biosystems), 49 of which contained microsatellites (GenBank Accession numbers: JX449105-JX449153). In this study, a subset of 31 G-SSR markers was screened on 32 C. whiteheadi individuals.
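The selection criteria stated above (perfect repeats only, minimum total length of 12 bp) can be sketched in a few lines of Python. This is an illustrative regular-expression search, not the SSRHUNTER algorithm; the function name, the motif-size range of 2 to 6 bp and the example sequence are assumptions made for the illustration.

```python
import re


def find_perfect_ssrs(seq, min_len=12, motif_sizes=(2, 3, 4, 5, 6)):
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    Hypothetical helper: a tract qualifies when one motif is repeated
    without interruption over at least min_len bases.
    """
    seq = seq.upper()
    hits = []
    for k in motif_sizes:
        min_repeats = -(-min_len // k)  # ceiling division
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. CCTCCT is reported once, as CCT, and AAAA is ignored).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Made-up clone insert carrying a (CCT)4 tract, 12 bp long.
ssrs = find_perfect_ssrs("TTGACCTCCTCCTCCTGA")
```

A tract is reported in the reading frame in which the search first meets it, so the same repeat may appear as CCT, CTC or TCC depending on the flanking sequence; grouping such motifs into classes would need an extra normalization step.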
Total genomic DNA was extracted from fin clips using the TIANamp Genomic DNA Kit (Tiangen, Beijing, China) following the manufacturer's instructions. Polymerase chain reaction (PCR) conditions were optimized for each pair of primers. PCRs were performed in 25 µL reaction volumes containing 2.5 µL of 10× PCR buffer, 1.0-3.0 mM MgCl2, 50 µM dNTPs, 0.4 µM of each primer, 1 U Taq polymerase (Takara, Dalian, China) and 50 ng genomic DNA. PCR conditions were as follows: initial denaturation at 94 °C for 3 min, followed by 30 cycles of 94 °C for 30 s, the optimized annealing temperature (Tables 2 and 3) for 30 s and 72 °C for 30 s, and then a final extension step at 72 °C for 10 min. PCR products were separated by electrophoresis on an 8% non-denaturing polyacrylamide gel and visualized by silver staining. A denatured pBR322 DNA/MspI molecular weight marker (Tiangen, Beijing, China) was used as a size standard to identify alleles.
The number of alleles (Na) and the observed (Ho) and expected (He) heterozygosities were estimated using POPGENE version 1.32 [16]. The polymorphic information content (PIC) was calculated using Formula 1: