Molecular Sciences Identification and Characterization of Reference Genes for Normalizing Expression Data from Red Swamp Crawfish Procambarus Clarkii

qRT-PCR is a widely used technique for rapid and accurate quantification of gene expression data. The use of reference genes for normalization of the expression levels is crucial for accuracy. Several studies have shown that there is no perfect reference gene that is appropriate for use in all experimental conditions, and research on suitable reference genes in red swamp crawfish (Procambarus clarkii) is particularly scarce. In this study, eight commonly used crustacean reference genes were chosen from P. clarkii transcriptome data and investigated as potential candidates for normalization of qRT-PCR data. Expression of these genes under different experimental conditions was examined by qRT-PCR, and the stability of their expression was evaluated using three commonly used statistical algorithms, geNorm, NormFinder and BestKeeper. A final comprehensive ranking determined that EIF and 18S were the optimal reference genes for expression data from different tissues, while TBP and OPEN ACCESS Int. 21592 EIF were optimal for expression data from different ovarian developmental stages. To our knowledge, this is the first systematic analysis of reference genes for normalization of qRT-PCR data in P. clarkii. These results will facilitate more accurate and reliable expression studies of this and other crustacean species.


Introduction
The red swamp crawfish Procambarus clarkii is a freshwater species native to the South-Central United States and Northeastern Mexico but has spread in range to become one of the most invasive species worldwide [1][2][3]. P. clarkii was introduced from Japan to Nanjing, China in 1929 [4,5], and this species has been farmed extensively in China since the 1990s, which is now the world's leading crawfish producer [6,7].
The ovary is a component of both reproductive and endocrine systems and ensures oogenesis proceeds smoothly [8]. In order to develop novel methods for controlling gonad maturation in crawfish to improve yields, it is critical to understand the molecular mechanisms of ovarian development and oogenesis [9]. Oogenesis is a complicated differentiation process regulated by a vast number of intra-and extra-ovarian factors [10][11][12].
Understanding gene expression is essential for elucidating the functional mechanisms underlying this complex process. Quantitative real-time PCR (qRT-PCR) is a useful technique for measuring gene expression due to its high sensitivity, accuracy, and reproducibility [13,14]. However, many factors can affect the accuracy of qRT-PCR, including the quality of the RNA, the efficiency of reverse transcription, the primer specificity and amplification efficiency, and the statistical analysis methods employed [15]. A suitable reference gene is needed for normalizing expression levels, and these should ideally be expressed at a stable rate across different species, in different tissues and under different experimental conditions. Selection of reference genes is therefore critical for ensuring the accuracy of qRT-PCR analysis, and numerous studies have demonstrated that this selection can have a significant impact on expression levels. Ideally, reference genes should not be heavily influenced by other factors [16].
To date, few studies on suitable reference genes in P. clarkii have been reported, and the use of unstable reference genes may lead to inaccurate results and should be avoided [21]. Consequently, novel reference genes are much needed for normalizing qRT-PCR data in this species.
Our knowledge of reproduction-associated genes in crustaceans has improved thanks to the high-throughput sequencing of transcriptome libraries [22][23][24][25], and we now have a large reference gene resource from which to select candidates for validating gene expression levels. In this study, we selected several frequently used crustacean reference genes from the P. clarkii transcriptome library [26] and evaluated their stability in eight tissues and six ovarian developmental stages. Results were assessed using the geNorm, NormFinder and BestKeeper statistical algorithms. Furthermore, in order to illustrate the usefulness of the newly identified reference genes, we analyzed the expression of two genes; vitellogenin (Vg), an interesting gene associated with ovarian maturation and oogenesis, and proliferating cell nuclear antigen (Pcna). This study will facilitate more accurate and reliable expression studies in P. clarkii and other crustacean species.

Primer Specificity and Amplification Efficiency
PCR amplifications using each primer pair were confirmed by the presence of a single peak in the melting curve analysis [27] and a specific band of the expected size in agarose gel electrophoresis. An ideal reaction reaches an efficiency value close to 1.0, representing a PCR efficiency of 100% [19,[26][27][28]. The results showed that the primer efficiency of the eight reference genes ranged from 96.0% to 103.6%, and the correlation coefficient (R 2 ) values ranged from 0.992 to 0.999 (Table 1).  Figure 1A) and from 15.11 (ACTB) to 30.04 (EF-1α) for samples in different ovarian developmental stages ( Figure 1B). These results showed that GAPDH and ACTB were the most abundant transcripts with the lowest Ct values, whereas TBP and EF-1α had the lowest expression levels with the highest Ct values. Based on Ct dispersion, all candidate genes except EF-1α and TBP exhibited minimal variation.

Stability Analysis of Reference Genes
To further evaluate the stability of the candidate reference genes, the three commonly used algorithms geNorm, NormFinder and BestKeeper were applied to samples from different tissues and different ovarian developmental stages using the online integration tool RefFinder.
The geNorm algorithm was used to calculate an expression stability value (M) where a lower value indicates more stable expression [28]. The results showed that among the different tissue samples, EIF and 18S were the most stably expressed, while GAPDH and EF-1α were the least stably expressed (Table 2; Figure 2). Of the samples from the ovarian developmental stage, TBP and EIF were the most stably expressed, while UB and EF-1α were the least stably expressed. NormFinder was used to evaluate the stability of reference genes by calculating intra-and inter-group variation. A higher stability of expression is indicated by a lower average expression stability value [29]. The results showed that among the different tissue samples, EIF and 18S were the most stably expressed, while TBP and EF-1α were the least stably expressed (Table 2; Figure 3). Of the samples from the ovarian developmental stage, TBP and EIF were the most stably expressed, while UB and EF-1α were the least stably expressed. The results of geNorm and NormFinder were therefore in good agreement.
BestKeeper uses the standard deviation (SD) and coefficient of variation (CV) of Ct values to evaluate expression stability, with more stable expression indicated by a lower SD [30]. Among the different tissues, 18S, EIF and UB exhibited the lowest SD values and therefore the most stable expression, while TBP and EF-1α were the least stably expressed (Tables 2 and 3). TUB, ACTB and 18S exhibited the lowest SD values and the most stable expression of the ovarian developmental stage samples, while UB and EF-1α were the least stably expressed.
Different ovarian developmental stages Candidate reference genes were ordered from the most to the least stably expressed, based on their stability values calculated by different algorithms. The comprehensive rank for each gene is based on the geometric mean of its ranks from geNorm, NormFinder and BestKeeper. A lower geometric mean indicates higher stability.
RefFinder is a comprehensive tool developed for evaluating reference genes and was used to confirm the results obtained from geNorm, NormFinder and BestKeeper. Based on the geometric mean obtained from several statistical approaches, we obtained the similar results for candidate reference genes. However, their ranks were slightly different under different experimental conditions (Table 2; Figure 4). Among the different tissue samples, the final comprehensive ranking suggests that the most stable reference genes were EIF and 18S, followed by UB, GAPDH, ACTB, TUB, TBP and EF-1α. For the ovarian developmental stage samples, TBP and EIF were the most stably expressed genes, followed by 18S, TUB, GAPDH, ACTB, UB and EF-1α. In all statistic algorithms, EF-1α was the least stably expressed among the eight tested genes. This gene should therefore not be used as a reference for normalization of qRT-PCR data in P. clarkii, certainly when comparing different tissues or ovarian developmental stage samples.

Validation of the Usefulness of the Selected Reference Genes
It has been demonstrated that the use of unstable reference genes can dramatically increase the inaccuracy of expression data of target genes. In order to confirm the validity of the tested candidate reference genes, the relative expression level of the ovarian developmental-associated Vg and Pcna were normalized using the two most stably expressed and the least stably expressed genes as determined by RefFinder as described above.
Using the most stable reference genes (EIF and 18S), the relative expression level of Vg was highest in gonad tissues, especially in ovary ( Figure 5A). In contrast, Vg expression levels were higher in gill and hepatopancreas tissue when the least stable reference gene (EF-1α) was used for normalization. The relative expression level of Vg during different ovarian developmental stages provided insight into the function of Vg in this process ( Figure 5B). Vg expression levels were significantly lower during no developmental stages (I) and during early development (stage II) when normalized against the most stable reference genes (EIF and TBP), and gradually increased to peak at the mature stage (V) before decreasing significantly at the post-spawning stage (VI). These results were similar to those reported previously for Vg in both P. clarkii [31] and Litopenaeus vannamei [32]. However, when the least stable reference gene (EF-1α) was used for normalization, Vg expression levels were higher at stages I and II, and also higher from the previtellogenic stage (III) to mature (V) stages.
Pcna is an accessory protein of DNA polymerase δ and plays an essential role in nucleic acid metabolism [33]. Data showed that Pcna is expressed highly in proliferating tissues such as testis and ovary in shrimp [34]. In the present study, when the relative expression of Pcna was normalized against the most stable reference genes (EIF and 18S), the data mirrored this trend with higher expression levels in testis and ovary. In contrast, normalization against the least stable reference gene (EF-1α) resulted in higher expression in gill, hepatopancreas and intestine tissue ( Figure 6A). A previous study in the developing ovary of Marsupenaeus japonicus identified differential expression of Pcna, with lowest expression before development and highest expression during stage II [35]. In the present study, Pcna showed a similar expression profile, with lowest levels before development (stage I), peak expression during stage II, and a gradual decrease to the post-spawning stage (VI) when normalized against the most stable reference genes (EIF and 18S) ( Figure 6B). These results further suggest that Pcna plays an important role in ovarian development, especially in cell proliferation. However, the Pcna expression pattern differed significantly when normalized against the least stable reference gene (EF-1α), emphasizing the importance of the reference genes chosen for normalization.

Discussion
Analysis of gene expression under different experimental conditions is important for informative functional analysis, and qRT-PCR is one of the most widely used methods for quantifying gene expression [14,36]. However, since no potential reference genes are likely to be stably expressed under all possible experimental conditions, it is advisable to validate candidate reference genes under the required experimental conditions prior to their use in qRT-PCR normalization, rather than using previously published reference genes under different conditions [13,21,37].
Recent studies showed that many traditional reference genes are not stably expressed under different conditions [38][39][40][41][42], and the results of the present study support this. Frequently used reference genes such as ACTB, UB and EF-1α have been found to be inappropriate for normalization of expression data in P. clarkii. In the present work, EF-1α was the least stably expressed gene among different tissues and during different ovarian development stages, UB was ranked the next least stable, and ACTB expression was of intermediate stability.
Some previous studies have focused on identifying reference genes in other crustaceans [43,44], but research on P. clarkii is lacking. For accurate measurement of gene expression, reference genes should be stably expressed under given experimental conditions [37]. In the present study, eight candidates were tested among different tissues and during different ovarian developmental stages using the commonly used statistical algorithms geNorm, NormFinder and BestKeeper.
Furthermore, the results obtained from the different algorithms were relatively consistent in terms of the most and the least stably expressed genes. Slight differences in ranking between different programs was ascribed to their different statistical algorithms. 18S and EIF were the most stably expressed among the different tissues tested. EF-1α was found to be the least stable reference gene by all the three statistical algorithms, and is not suitable for use in qRT-PCR analysis among different tissues from P. clarkii, despite being an ideal reference gene in ovarian tissue from Peneaus monodon [43]. Based on the final ranking, EIF and 18S were designated as the most appropriate reference genes for normalization of gene expression in different tissues of P. clarkii. The results for different ovarian developmental stages were similar overall, but TBP and EIF were the most stable reference genes, while EF-1α was the least stable. Consequently, TBP, EIF and 18S were considered to be the most appropriate reference genes for normalization of expression data during ovarian developmental stages of P. clarkii.
To investigate the effects of using different reference genes, the relative expression levels of the ovarian developmental-associated Vg and Pcna were investigated in different tissues and different ovarian developmental stages. Normalization using the most stable reference genes (EIF and TBP) were consistent with previous reports on Vg and Pcna expression in crustaceans, however normalization using the least stable reference gene (EF-1α) generated inconsistent results [31,32,45]. These results demonstrated that selection of different reference genes can have a significant impact on expression data, as previously reported [46].

Crawfish Tissue Sample Collection
The handling of crawfish was conducted in accordance with the guidelines on the care and use of animals for scientific purposes set by the Institutional Animal Care and Use Committee (IACUC) of Shanghai Ocean University, Shanghai, China. The crawfish used in this project were obtained from Jiangsu Xuyi River red Crawfish Eco-Park, Jiangsu Province, China. Before tissue collection, crawfish were cultured in a tank of continuously aerated freshwater at ambient temperature (28 °C) for 72 h. Eyestalk, gill, hepatopancreas, heart, intestine, muscle, testis and ovary tissue were collected from six healthy, sexually mature crawfish. Based on a previous study [47], we classified the ovarian developmental cycle of P. clarkii into six stages: no developmental (I), early developmental (II), previtellogenic (III), vitellogenic (IV), mature (V) and post-spawning (VI). Ovarian developmental stage tissue (including the six stages described above) was surgically removed from six individuals per developmental stage, immediately frozen in liquid nitrogen, and stored at −80 °C until RNA extraction. Thus, eight tissue samples and six ovarian developmental stage samples were used in the subsequent analysis.

RNA Extraction and Reverse Transcription
Total RNA from each sample was isolated using TRIzol (Invitrogen, Carlsbad, CA, USA) following the manufacturer's instructions, incubated with 10 U DNase I (Ambion, Austin, TX, USA) at 37 °C for 1 h and purified with a MicroPloy (A) Purist Kit (Ambion, Austin, TX, USA) according to the manufacturer's instructions. Purified mRNA was dissolved in RNase/DNase-free ddH2O, and a NanoDrop (Thermo, Wilmington, DE, USA) was used to determine the sample quality and final concentration. The cDNA template for qRT-PCR was generated from 500 ng of total RNA with a PrimeScript RT reagent Kit (Takara, Dalian, China) in accordance with the manufacturer's instructions. All cDNA samples were diluted to 50 ng/μL and stored at −20 °C for qRT-PCR analysis.
Sequences with high sequence similarity to candidate reference genes were aligned separately to highlight potential areas of polymorphism. Specific primers (Table 1) were designed using the Primer Premier 5.0 program [49], and primer pairs were selected from sequence regions with the fewest polymorphisms. The annealing temperature was 60 °C for all primer pairs, and the length of PCR products was 100-250 bp. PCR efficiency and correlation coefficients (R 2 ) were determined based on the slopes of the standard curves generated using a 10-fold serial dilution of cDNA template. Primer specificity was confirmed by the presence of a single product of the expected size following 1.5% agarose gel electrophoresis and ethidium bromide staining. In addition, target amplicons were sequenced to confirm specificity of the PCR products.

Quantitative Real-Time PCR
Expression levels of the eight selected genes and the ovary-associated genes Vg and Pcna were measured using qRT-PCR performed in a 96-well plate with the CFX 96 real time system (Bio-Rad, Hercules, CA, USA). Each reaction of 25 μL contained 12.5 μL of 2× SYBR Green Real-Time PCR Master Mix, 2 μL of cDNA (100 ng), 9.5 μL of RNase/DNase-free ddH2O and 0.5 μL of each forward and reverse primer (10 mM). PCR conditions were 95 °C for 10 min, followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s, and generation of a dissociation curve by increasing the temperature from 65 to 95 °C in 0.5 °C increments to check for specificity of amplification. Each sample was amplified in triplicate (three technical replicates).
Baseline and threshold cycle (Ct) values were automatically determined using Bio-Rad CFX manager version 1.6 with default parameters. The ΔΔCt method was used to evaluate the quantities of each amplified product according to the user manual. All figures were drawn with SigmaPlot 12.5 [50].

Analysis of Gene Expression Stability
The web-based tool RefFinder [51] (www.leonxie.com/referencegene.php) that integrates the algorithms geNorm [28], NormFinder [29] and BestKeeper [30] was used to evaluate the stability of the expression of the eight potential reference genes. For each gene, RefFinder estimated the geometric mean of the ranks calculated using the aforementioned approaches. Finally, we selected the ideal reference genes determined by RefFinder with lowest rank geometric mean. In order to confirm that the tested genes were suitable as internal controls for qRT-PCR analysis in P. clarkii, the two most stable and the least stable potential reference genes as determined by RefFinder were used to normalize the expression level of Vg in the same qRT-PCR conditions described above.

Conclusions
To the best of our knowledge, this is the first investigation into the expression stability of reference genes in P. clarkii. We identified and tested eight candidate reference genes for their ability to normalize qRT-PCR expression data from eight different tissues and six different ovarian developmental stages. These results will be useful for ensuring that future qRT-PCR data from this species and other crustaceans is accurate and reliable.