Development of EST-Molecular Markers from RNA Sequencing for Genetic Management and Identification of Growth Traits in Potato Grouper (Epinephelus tukula)

Simple Summary The potato grouper is a novel aquaculture species in Taiwan. Due to the lack of genetic information concerning this species, we have developed molecular markers based on transcriptome sequencing and further characterized their association with gene diversity and growth traits of this species. Ultimately, these markers could be utilized as accurate and efficient tools for genetic management and marker-assisted selection of potato grouper with distinct growth traits. Abstract The accuracy and efficiency of marker-assisted selection (MAS) has been proven for economically critical aquaculture species. The potato grouper (Epinephelus tukula), a novel cultured grouper species in Taiwan, shows large potential in aquaculture because of its fast growth rate among other groupers. Because of the lack of genetic information for the potato grouper, the first transcriptome and expressed sequence tag (EST)-derived simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers were developed. Initially, the transcriptome was obtained from seven cDNA libraries by using the Illumina platform. De novo transcriptome of the potato grouper yielded 51.34 Gb and 111,490 unigenes. The EST-derived SSR and SNP markers were applied in genetic management, in parentage analysis, and to discover the functional markers of economic traits. The F1 juveniles were identified as siblings from one pair of parents (80 broodstocks). Fast- and slow-growth individuals were analyzed using functional molecular markers and through their association with growth performance. The results revealed that two SNPs were correlated with growth traits. The transcriptome database obtained in this study and its derived SSR and SNP markers may be applied not only for MAS but also to maintain functional gene diversity in the novel cultured grouper.


Introduction
With the increase in the global population, the annual demand for animal-based protein has also risen, which includes increasing requirements for aquaculture. Aquaculture is critical for providing superior animal protein to humans [1]; thus, efficient aquaculture production is essential. While domesticated animals have long breeding histories, many farmed aquatic species (especially marine fish) are still in the early stages of domestication; many seeds are still obtained from the wild [2]. Thus, novel aquaculture species based on phenotype and genotype selection can improve individual phenotypes, yielding traits that enhance economic performance without additional production cost [3].
The accuracy and efficiency of marker-assisted selection (MAS) has been proven in the selection of economically essential aquaculture species [4]. Complete broodstock Figure 1. Experimental diagram showing the identification of molecular markers ideal for genetic management (parentage analysis and genetic diversity) and marker-assisted selection (MAS) of potato groupers. 80 broodstock individuals of potato grouper were tagged and cultured. A mature potato grouper was randomly chosen for transcriptome analysis. Seven tissues (the brain, gill, heart, head kidney, spleen, liver, and muscle) were collected for RNA sequencing. 180 juveniles were used for growth experiments and divided into two groups: the fast-growth group (top 24%; FG) and the slow-growth group (bottom 24%; SG). Simple sequence repeats (SSRs) were used for paternity analysis and genetic diversity; single nucleotide polymorphisms (SNPs) were used for the growth experiment and functional SNPs. Parents-M44/M69 represents the tagged id of broodstock.
One 5 mm and three 3 mm stainless steel beads were added to a microtube with Trizol and tissues, and the mixture was homogenized in a SpeedMill PLUS high-speed tissue homogenizer (Analytik Jena AG, Jena, Germany). Total RNA of each tissue was extracted using an EasyPure Total RNA spin kit (Bioman, Taipei, Taiwan), and genomic DNA (gDNA) was extracted using a Gene-SpinTM Genomic DNA isolation kit (Protech Technology Enterprise, Taipei, Taiwan) following manufacturer's instructions. The quality and quantity of total RNA and gDNA were determined using Nanodrop One (Thermo Fisher Scientific, San Jose, CA, USA) and run on 0.8% agarose gel. All samples of gDNA were diluted to 25 ng/µL for the DNA template and maintained at −20 °C. All tissues of total RNA were reverse-transcribed to cDNA by using a high-capacity cDNA reverse transcription kit (Applied Biosystems, Foster City, CA, USA) and maintained at −80 °C.

De Novo Assembly, Annotation, and Marker Detection
Total RNA (2 µg) from the seven tissues (brain, gill, heart, head kidney, spleen, liver, and muscle) were separately sequenced, and seven cDNA libraries were established to construct a transcriptome of the potato grouper, followed by RNA-seq using the Illumina sequencing platform [31]. Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and ABI StepOnePlus real-time PCR system (Applied Biosystems) were used  80 broodstock individuals of potato grouper were tagged and cultured. A mature potato grouper was randomly chosen for transcriptome analysis. Seven tissues (the brain, gill, heart, head kidney, spleen, liver, and muscle) were collected for RNA sequencing. 180 juveniles were used for growth experiments and divided into two groups: the fast-growth group (top 24%; FG) and the slow-growth group (bottom 24%; SG). Simple sequence repeats (SSRs) were used for paternity analysis and genetic diversity; single nucleotide polymorphisms (SNPs) were used for the growth experiment and functional SNPs. Parents-M44/M69 represents the tagged id of broodstock.
One 5 mm and three 3 mm stainless steel beads were added to a microtube with Trizol and tissues, and the mixture was homogenized in a SpeedMill PLUS high-speed tissue homogenizer (Analytik Jena AG, Jena, Germany). Total RNA of each tissue was extracted using an EasyPure Total RNA spin kit (Bioman, Taipei, Taiwan), and genomic DNA (gDNA) was extracted using a Gene-SpinTM Genomic DNA isolation kit (Protech Technology Enterprise, Taipei, Taiwan) following manufacturer's instructions. The quality and quantity of total RNA and gDNA were determined using Nanodrop One (Thermo Fisher Scientific, San Jose, CA, USA) and run on 0.8% agarose gel. All samples of gDNA were diluted to 25 ng/µL for the DNA template and maintained at −20 • C. All tissues of total RNA were reverse-transcribed to cDNA by using a high-capacity cDNA reverse transcription kit (Applied Biosystems, Foster City, CA, USA) and maintained at −80 • C.

De Novo Assembly, Annotation, and Marker Detection
Total RNA (2 µg) from the seven tissues (brain, gill, heart, head kidney, spleen, liver, and muscle) were separately sequenced, and seven cDNA libraries were established to construct a transcriptome of the potato grouper, followed by RNA-seq using the Illumina sequencing platform [31]. Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and ABI StepOnePlus real-time PCR system (Applied Biosystems) were used to qualify and quantify the cDNA libraries. Finally, these cDNA libraries were sequenced using the Illumina HiSeq 4000 platform.
From the raw reads, the following were filtered out: adaptors, reads with >5% unknown nucleotides, and low-quality sequences. The transcriptome assemblies of clean reads from the seven tissues were separately acquired using Trinity software [32]. To obtain the integrated unigene, Tgicl v2.1 software was used to cluster and assembly the unigenes into a large EST database [33]. De novo transcriptome assembly of seven libraries was submitted to the NCBI short read archive database (accession numbers: SRR12853319, SRR12854351, and SRR12855035-39).
The numbers of functional unigene sequences in deference libraries were subjected to BLAST search and annotated against the NCBI nucleotide sequences (Nt), nonredundant (Nr) protein of four related and some other fish species, gene ontology (GO) [34], clusters of orthologous groups (COG), Kyoto Encyclopedia of Genes and Genomes (KEGG), and SwissProt databases. Subsequently, the classification of all unigenes into GO, COG, and KEGG categories was annotated with the Blast2GO [35] and BLASTx software with an Evalue threshold of 10 −5 . Unigenes were compared with the nucleic acid database nucleotide through BLASTn (NT) (p < 0.00001): the protein with the highest sequence similarity to the unigene and the function annotation information of the unigene protein were both obtained.
All types of microsatellites, from mononucleotides to hexanucleotides, were detected using MISA version 1.0.0 software [36]. The parameters were determined to identify mono-, di-, tri-, tetra-, penta-, and hexanucleotides with a minimum of 12, 6, 5, 5, 4, and 4 motifs, respectively. The sequences of each microsatellite were used to design five sets of primers using Primer3 software [37].
SNPs were detected in each library using SAMtools and Picard software, which compare data with sequencing genomic locus and repeated reads. Next, HISAT software [38] was used to align the clean reads to unigenes. GATK v4.0 software [39] was used to improve the SNPs and InDel calling and eliminate low-quality SNPs.

SSR Analysis
The target PCR product was amplified in two steps, as recommended by Schuelke [40]. All forward primers and four fluorescent dyes (FAM, JOE, NED, and ROX) were labeled with an adaptor sequence. A total of 14 nonfunctional SSR [41][42][43] and 67 functional SSR primers (Table S7) were redesigned by using a labeling adaptor. For the first step, 10 µL of the PCR reaction mixture containing 5 µL of Taq DNA polymerase 2X Master Mix RED (Ampliqon, Odense M, Denmark), 0.3 µL of forward primer (10 µM), 0.3 µL of reverse primer (10 µM), 2 µL of template DNA (25 ng/µL), and 2.4 µL ddH 2 O. PCR was performed at 95 • C for 5 min, followed by 30 cycles of 95 • C for 40 s, T A • C for 30 s, and 72 • C for 40 s, ending with 72 • C for 5 min. For the second step, the allocated mixture was the same as above, except that the forward primer and template DNA were exchanged for the fluorescent primer (10 µM) and the first PCR product (diluted 10 times), respectively. The PCR procedure was also performed under the same conditions as the first step. The amplicons ranged from 110 to 410 bp, and PCR products (2 µL) were checked for by running on 2% agarose gel and dyeing in GelRed ® nucleic acid gel stain (Biotium, Hayward, CA, USA) for 30 min.
Multiple PCR products of each identical sample containing different fluorescent dyes were pooled into a 96-well microplate. The SSR fragments were separated using an ABI PRISM ® 3730xl DNA analyzer instrument (Applied Biosystems) with capillary electrophoresis. The output data were analyzed using GeneMapper ® v4.0 software (Applied Biosystems).

MassARRAY
In total, 46 SNPs were obtained from the transcriptome database genotyped by Agena MassARRAY platform and iPLEX chemistry (Agena, San Diego, CA, USA). All specific and extension primers (Table S7)  and genotyping were performed using Ellis and Ong's method [44]. Briefly, 7 nL of purified primer extension reaction was loaded onto a matrix pad of a SpectroCHIP (Agena). SpectroCHIPs were analyzed using MassARRAY Analyzer 4, and calling was performed through clustering analysis with TYPER 4.0 software. The genotype call rate was used to determine the accuracy of the result based on the following formula: P MA × P YLD × P SKW , where P MA -peak represents the correction of molecular weight and signal sharpness, P YLDpeak represents signal strength, and P SKW -SNP represents the signal strength ratio between two SNP genotypes. According to the call rate ranging from low to high, the genotype was assigned "aggressive," "moderate," or "conservative," in that order. AutoCluster (https: //www.geneticaffairs.com/features-autocluster.html) was used to cluster homozygotes and heterozygotes into two groups and plot a two-dimensional graph with the peak signal as the coordinate axis. The SNP genotypes were recognized through clustering.

Statistical Analysis
Geneious software (https://www.geneious.com/) was used to analyze the genotypes of markers with multiple fluorescent polymorphic amplicons. The genotypes of each SSR marker were imported into the GenAlEx software [45], which is fully compatible with Excel for Windows. For statistical analysis of population diversity, the parameters related to the number of alleles (N A ) and allele frequency (N E ) were as follows: observed heterozygosity (H O ), expected heterozygosity (H E ), polymorphism information content (PIC), fixation index (F IS ), and Hardy-Weinberg equilibrium (HWE) [46][47][48].
SPSS v22.0.0 (ICM) was used for one-way analysis of variance to determine the significance of correlations between genotypes of molecular markers (SSR and SNPs) and strains. Parentage analysis was analyzed using PARFEX v1.0 [49].

Quantitative Real-Time PCR
The cDNA of each individual was diluted 50 times as a template for quantitative real-time PCR (qRT-PCR). The specific primers (Table S7) were designed using Primer3web (http://bioinfo.ut.ee/primer3/). qRT-PCR was performed using Power SYBR ® Green PCR Master Mix (Thermo Fisher Scientific) on a Roche LightCycler ® 480 Instrument II (Roche Applied Science). All experiments were performed in triplicate. The expression levels of each gene were normalized to the expression of an internal housekeeping control, namely beta-actin. The median in each triplicate was used to calculate the relative target gene concentrations (Cp = Cp median target gene-Ct median beta-actin). The relative quantification was calculated and performed following Schmittgen and Livak's method [50].  (Table S2). The size distribution indicated that the length of the 48,427 unigenes was >1000 bp ( Figure S1).

RNA-Seq
In this experiment, 111,490 unigenes were assembled based on the pooled transcripts from the seven tissues of the potato grouper. To annotate these unigenes, we used BLASTx against the Nr, Nt, SwissProt, COG, KEGG, and GO databases, which were successfully annotated for 56 (Table S3). After cross-comparison and analysis of Venn diagrams, the number (proportion) of genes annotated to any database was 72,634 (65.15%), and the number of genes that could be simultaneously annotated in the six databases was 10,944 (9.82%); in addition, the number (proportion) of genes annotated in the five major databases other than GO or COG were 10,603 (9.51%) and 10,160 (9.11%), respectively. The results revealed that the number of genes annotated to the database from the seven tissues of the potato grouper was approximately 21,104-21,547.

Functional Annotation
The assembled unigene sequences of the potato grouper were subjected to BLAST searching against COG, GO, and KEGG databases. Figure 2A-C summarizes the statistical results. The possible functions of unigenes were predicted and classified by searching their predicted coding sequences (CDSs) of unigenes against the COG database. Possible functions of 23,718 unigenes were classified and subdivided into 25 COG categories ( Figure 2A; Table S4), among which the cluster "General function prediction only" was the largest group (9255 unigenes), followed by "Replication, recombination and repair" (4156 unigenes) and "Transcription" (3891 unigenes). The three smallest clusters were "defense mechanisms" (146 unigenes), "extracellular structures" (103 unigenes), and "nuclear structure" (10 unigenes).
GO enrichment analysis was used to assemble unigenes and provided defined ontologies to express gene product properties. According to the results of Nr annotation, the GO classification of unigenes was generated using the Blast2GO program. We categorized 24,368 unigenes by using GO classification, yielding 60 functional groups across three main categories: biological process, cellular component, and molecular function ( Figure  2B; Table S5). Among the 25 functional groups of the biological process category, "cellular process" (13,799 unigenes) and "single-organism process" (11,154 unigenes) had the largest proportions. Similarly, among the 18 functional groups of the cellular component category, "cell" (8865 unigenes) and "cell part" (8757 unigenes) were the most highly represented. In particular, "growth" was represented by 299 unigenes. Furthermore, among the 17 functional groups of the molecular function category, "binding" (12,232 unigenes) and "catalytic activity" (8707 unigenes) were the most abundant.
A four-way Venn diagram plot ( Figure 2D) presents the assembled unigenes annotated against the Nr, COG, KEGG, and SwissProt databases. The figure indicates that 22,031 unigenes were concurrently annotated on all four databases and that 14 were annotated in both KEGG and COG database, 2565 in Nr and KEGG databases, 58 in Nr and COG databases, 4371 in Nr and SwissProt databases, 1389 in KEGG and SwissProt databases, and 6 in COG and SwissProt databases. GO enrichment analysis was used to assemble unigenes and provided defined ontologies to express gene product properties. According to the results of Nr annotation, the GO classification of unigenes was generated using the Blast2GO program. We categorized 24,368 unigenes by using GO classification, yielding 60 functional groups across three main categories: biological process, cellular component, and molecular function ( Figure  2B; Table S5). Among the 25 functional groups of the biological process category, "cellular
We identified 122,220 SNPs ( Figure 3B) (87,542 transitions and 34,678 transversions) from mapping sequencing reads to assembled unigenes by using HISAT software. Under seven cDNA libraries, the total numbers of the two transition types A/G and C/T were 45,080 and 42,462, respectively, and the total numbers of the transversion types A/C, A/T, G/C, and G/T were 8824, 8144, 8809, and 8901, respectively. The transition/transversion (Ts/Tv) ratio was approximately 2.52.
We identified 122,220 SNPs ( Figure 3B) (87,542 transitions and 34,678 transversions) from mapping sequencing reads to assembled unigenes by using HISAT software. Under seven cDNA libraries, the total numbers of the two transition types A/G and C/T were 45,080 and 42,462, respectively, and the total numbers of the transversion types A/C, A/T, G/C, and G/T were 8824, 8144, 8809, and 8901, respectively. The transition/transversion (Ts/Tv) ratio was approximately 2.52.

Growth Experiment
The initial and final BW, BL, and TL for 180 juveniles of potato grouper after the 90-day growing period are presented in Table S8. The average initial BW, BL, and TL across the six groups were 3.51 ± 0.71 g, 5.07 ± 0.38 cm, and 5.91 ± 0.43 cm, respectively and the average final BW, BL, and TL were 79.72 ± 12.91 g, 13.81 ± 0.72 cm, and 16.66 ± 0.86 cm, respectively. The differences between initial and final values were significant. The survival rate was 97.78%.

SNPs
In total, 46 SNP markers were selected from the transcriptome, which was analyzed with Agena MassARRAY. The sequencing results revealed that all SNP assays had a 90% call rate, and the frequency of poly-and monomorphic SNP genotyping was derived from the 46 SNPs according to the nucleobase proportion of molecular weight ( Figures S4A,B  and S5A,B). Moreover, 38 of 46 SNPs were successfully genotyped; however, PCR amplification, clustering, and genotyping failed in the remaining eight (Unigene3858, Uni-gene7476, Unigene9685, Unigene25283, CL1123.Contig1, CL1123.Contig5, CL3051.Contig2, and CL4543.Contig1) (Table S11). In the broodstock and juveniles, we clustered the 38 SNP arrays into three groups depending on whether they had three, two, or one genotype. We found that seventeen and two SNP assays in the broodstock and juvenile groups, respectively, could be clustered into three genotypes; five and nine SNP assays, respectively, into two genotypes; and 16 and 27 SNP assays, respectively, into one genotype. Furthermore, 16 and 27 SNP assays, respectively, were homozygous and 22 and 12 SNP assays, respectively, were heterozygous. Among them, Unigene15043 in the juvenile group had only one heterozygous genotype (TA).

Correlation Between Genotypes and Traits
In juveniles, BW was highly correlated with BL (R 2 = 0.8821) and TL (R 2 = 0.8843) ( Figure S7A,B). Thus, we analyzed the correlation of the 11 SNPs with BW ( Table 2). Two SNP markers, Unigene7626 (p = 0.046) and CL8791.Contig3 (p = 0.032), were significantly correlated with growth traits (p < 0.05) ( Figure 4A,B). For Unigene7626, the TC genotype had a significantly larger body size (p = 0.017) than the TT genotype. For CL8791.Contig3, the AA genotype had a significantly smaller body size (p = 0.014) than the CA genotype. The alignment of the SNP sequences with Nt annotation implied that two SNPs were located in 3 -UTR and nonsynonymous coding regions within the alpha cardiac muscle actin (ACTC1) and pericentrin (PCNT) genes, respectively.  A/C K/Q AA/CA 0.032 * AA CA a Locus symbol is abbreviated according to the Unigene ID from the transcriptome database. The genotypes of SNPs were calculated using one-way ANOVA (* p < 0.05) with the correlation of growth traits.

Correlation Between Genotypes and Traits
In juveniles, BW was highly correlated with BL (R 2 = 0.8821) and TL (R 2 = 0.8843) ( Figure S7A,B). Thus, we analyzed the correlation of the 11 SNPs with BW ( Table 2). Two SNP markers, Unigene7626 (p = 0.046) and CL8791.Contig3 (p = 0.032), were significantly correlated with growth traits (p < 0.05) ( Figure 4A,B). For Unigene7626, the TC genotype had a significantly larger body size (p = 0.017) than the TT genotype. For CL8791.Contig3, the AA genotype had a significantly smaller body size (p = 0.014) than the CA genotype. The alignment of the SNP sequences with Nt annotation implied that two SNPs were located in 3′-UTR and nonsynonymous coding regions within the alpha cardiac muscle actin (ACTC1) and pericentrin (PCNT) genes, respectively.

Gene Expression in Different Tissues
For six juveniles from the FG and SG (control) groups, gene expression was examined for the brain, liver, and muscle tissues. Beta-actin was used as a reference gene for normalizing real-time PCR data in the groups for Unigne7626 (ACTC1) and CL8791.Contig3 (PCNT), the two genes tested, and their expression levels were compared between the two groups ( Figure S8). Compared with the SG (control) group, Unigene7626 (ACTC1) was downregulated in all the three tissues of the FG group; moreover, CL8791.Contig3 (PCNT) was upregulated in the brain and liver tissues and downregulated in the muscle tissue of the FG group.

The Potato Grouper Transcriptome and Candidate Functional Molecular Markers
The diversity of grouper species yields the advantages of interbreeding and dominant inheritance, such as the E. fuscoguttatus × E. lanceolatus hybrid grouper [51]. The potato grouper is a novel cultured grouper species in Taiwan with a potentially high growth performance; however, it is occasionally polycultured with different grouper species, possibly causing unexpected hybrids. This increases the degree of complexity in broodstock management. In such a case, molecular markers are helpful for tracking and managing the broodstock [52]. In the present study, we discovered that the available molecular markers form a close genetic relationship, meaning that they had not only low genetic diversity and commonality but also a smaller number of markers. Next-generation sequencing is efficient and productive for the development of numerous molecular markers in nonmodel animals. Whole-genome sequencing (WGS) generates a complete gene database of the species. However, establishing a de novo transcriptome database of nonmodel aquaculture species is more cost-effective and faster than WGS [53]. In addition, the applications of microarray, MassArray, RNA-seq, ddRAD-seq, and SLAF-seq have been adequately researched in aquaculture animals [54][55][56]. Furthermore, de novo transcriptome yields numerous EST molecular markers (SSR and SNPs) because of the annotation of different functional databases (COG, GO, KEGG, and SwissProt) [57], which may facilitate understanding of the complex trait mechanisms, genetic management, and genetic improvement of stocks [58].
In aquaculture species, the construction of a cDNA library at the transcriptome level can be commonly achieved with numerous methods, including pooling the same tissues from different individuals in the same treatment, using one tissue from an individual, and pooling all tissues as one sample [59][60][61][62]. The choice of the approach depends on factors such as cost, the purpose of research, rare species or individuals, and minimization of interindividual variation. In the present study, because of the low population of potato groupers in the broodstock and lack of genetic bioinformation, a de novo transcriptome was created from an individual fish using seven cDNA libraries to discover expressed sequences, which can be annotated with functional genes and show tissue-specific diversity.
A transcriptome is a powerful tool that provides not only different functional annotations of a bioinformatic expression gene between distinct samples but also molecular markers for application in aquaculture, conservation, fisheries management, genetics, and breeding, as well as correlation with target traits [63][64][65]. In the present study, RNA-seq was used to analyze the phenotypes of growth performance because ESTs may directly influence organism performance and also built the cornerstone of the database for other economical traits [66][67][68][69][70][71]. For example, considerable research using RNA-seq has been performed on the economically important aquaculture species of Atlantic salmon, rainbow trout, turbot, gilthead seabream, and Pacific white shrimp, which helps comprehension of the variation of transcripts between cells, tissues, ontogenetic, food nutrition, and environmental conditions and facilitates analysis if those traits or performance indicators are correlated with the polymorphic markers [72][73][74][75][76][77][78][79]. The common economic aquaculture grouper species-orange-spotted grouper, giant grouper, and hybrid grouper (E. fuscoguttatus × E. lanceolatus)-have been evaluated for growth trait performance, virus resistance, dietary supplementation, and temperature challenge, and the results have revealed some underlying molecular mechanisms [80][81][82][83]. Compared with our database, finding out growth-related functional annotation and molecular mechanisms are both shared in the GH/IGF axis and its downstream signaling pathways. However, these studies have focused on gene function and its integrated signal pathways; by contrast, transcripts identified using a transcriptome profile may influence target traits through various genes [84][85][86].
The potato grouper database was helpful in understanding its functional mechanisms/pathways, gene regulation in different tissues, and correlation of genotypes with traits. We screened the EST molecular markers derived from the transcriptome, which can be applied to genetic management and genetic diversity and growth-related analyses of potato grouper. These markers can be surveyed and used to verify performance by not only planning breeding schemes arranged in genotypes of EST markers to the next generation but also using CRISPR-Cas9 to transfect the target genes artificially to enhance transgenic fish behavior [87][88][89][90][91]. In addition, RNA-seq rapidly yielded numerous EST molecular markers that can be used in future studies on not only growth performance but also other traits, such as disease resistance, stress, and environmental change [92][93][94][95][96][97]. These useful functional molecular markers can provide a more efficient and systematic genetic method for artificial selection in hatchery [98].

Genetic Management, Genetic Diversity, and Growth-Related Molecular Markers in the Potato Grouper
Because of the lack of genetic information, RNA-seq was suitable to generate large amounts of molecular markers for nonmodel aquaculture species [99][100][101]. These annotations of functional genes and markers were highly conserved and potentially used in other similar species [102]. In Taiwan, especially in the early stages of breeding, inbreeding and interbreeding are common in hatcheries, without complete records; however, this problem can be resolved and managed through molecular biotechnology. SSR is widely used in genetic management, parentage analysis, genetic diversity, marker-assisted breeding selection [103], for differentiating population structures between released and wild species [104,105], conservation [106], and fisheries management [107]. Genetic diversity was high in every generation and breed in which we used molecular markers to trace, manage, and prevent inbreeding and intra-and interspecies interbreeding [108][109][110]. We discovered that functional (type I) and nonfunctional (type II) SSR markers have low genetic diversity in breeding selection, with functional SSR markers derived from candidate stocks and juveniles being more sensitive than nonfunctional SSR markers [111]. However, an organism cannot select the best genotype-phenotype combination, probably because various genotypes may be lost during artificial selection with functional molecular markers [112,113]. Hence, using functional SSR markers with polymorphism is critical because they directly influence gene function and because they may be easily missed during the breeding process. In addition, genomic selection can help to not only track signatures of artificial selection and genetic diversity of stocks in the domestication of farmed species [114] but also perform parentage analysis, enabling breeders to trace back to the broodstock with genotypes of specific molecular markers [115][116][117]. The 21 functional molecular markers across broodstock and juveniles identified in the present study demonstrated that the number of alleles and genotypes in the juveniles sharply declined after mating. Selection breeding for economical traits is the first goal of a breeder; hence, maintaining the high variability of genes and population of genetic variation becomes important. The functional SSRs (mentioned above) are directly related to genes that their highly variable produces individual differences which are influentially bred with individual or population selection of potato grouper in the future.
Studies on Atlantic salmon (Salmo salar), orange-spotted grouper (E. coioides), and turbot (Scophthalmus maximus) have indicated that SNP markers are more related to growth traits than SSR are [118][119][120][121][122][123]. Therefore, growth-related SNP markers were selected from our transcriptome and referenced to other studies of the growth performance of fish species. Among them, Unigne7626 (ACTC1) and CL8791.Contig3 (PCNT) was evidently correlated with the growth traits of potato grouper. ACTC1 plays a role in the cytoskeletal structure, cell motility, cell division, and intracellular movements and contractile processes in thin filaments [124]. In human studies, ACTC1 mutations have been noted in heart diseases, including dilated cardiomyopathy and hypertrophic cardiomyopathy [125,126]. Lee et al. observed that ACTC1 is upregulated in mud loaches during the growth stage [127], and Avey et al. suggested that ACTC expression in D. rerio alters the cardiac function and reduces aerobic efficiency [128]. Further research is warranted to determine whether ACTC mutation and expression during growth influence the growth performance of potato grouper. PCNT encodes the centrosome protein pericentrin, which contributes to the mitotic spindle for chromosomal segregation during cell division, thus influencing cell cycle progression and resulting in disorganized mitotic spindles and missegregation of chromosomes [129,130]. In Atlantic salmon, the SNP AX88270804 within PCNT may be associated with several growth traits, especially fat percentage, which has an approximate genetic variation of 4%, whereas the AA/GA/GG genotype of PCNT, which is associated with faster growth, is also associated with increased fatness [131]. In the current study, PCNT genotypes were also correlated with body weight: phenotypically, the CA genotype had the greatest growth, whereas the AA genotype had the smallest. However, these two SNP markers were not differentially expressed in different tissues between FG and SG groups, which phenotype may perform with polygenic inheritance [132] or epigenetics [133].

Conclusions
Our transcriptome analysis of a novel grouper species, potato grouper, provided useful bioinformation. The numerous molecular markers will facilitate genetic management, parentage analysis, genetic diversity in stocks, and complex and polygenic traits research; our findings thus established a basis for future genetic studies on potato groupers. The SNPs were within the ACTC and PCNT genes, which have potentially relevant functional connections to growth traits. Further breeding schemes of these candidate growth-related genes are warranted to identify putative causative variation.
Supplementary Materials: The following are available online at https://www.mdpi.com/2079-773 7/10/1/36/s1, Figure S1: Statistical length of the all-Unigene distribution. Figure S2: Distribution of annotated species that is statistics with Nr annotation. Figure S3: Graph of seven times of growth measurement. Figure S4: Percentage of genotyping call rate% well 1 (A) well 2 (B). Figure S5: Frequency of polymorphic and monomorphic SNPs genotyping derived from 46 genes well 1 (A) well 2 (B). Figure S6: Genotypes of MassARRAY in the two SNPs, Unigene7626 (A) and CL8791.Contig3 (B). Figure S7: The correlation (R 2 ) body weight with body length (A) and total length (B). Figure S8: Relative expression of Unigene 7626 and CL8791. Contig3 in the offspring brain, liver, and muscle tissues. Table S1: Statistics of clean reads after filtering. Table S2: Statistics of the sequencing data after quality trimming. Table S3: Results of all-Unigene annotated to different databases. Table S4: Statistics of unigenes in COG category. Table S5: Statistics of unigenes in GO category. Table S6: Statistics of unigenes in KEGG category. Table S7: Primer lists of functional and nonfunctional SSR, SNP, and qRT-PCR. Table S8: Growth performance. Table S9: Genotypes of 13 functional SSR in juveniles. Table S10: Paternity analysis of broodstock and juveniles.