Selection of Reliable Reference Genes for Gene Expression Normalization in Sagittaria trifolia

Real-time quantitative PCR (RT-qPCR) is a method with high sensitivity and convenience that has been extensively used to analyze the expression level of target genes. A reference gene with a highly stable expression is required to ensure the accuracy of experimental results. However, the report on appropriate reference genes in arrowheads (Sagittaria trifolia) is still limited. In this study, eight candidate reference genes (ACT5, UBQ, GAPDH, CYP, NAC, IDH, SLEEPER and PLA) were selected. The candidate genes were employed in a RT-qPCR assay in different tissues at different developmental stages of the same tissue (including corm, leaf and leafstalk) in arrowheads. Five statistical algorithms, GeNorm, NormFinder, BestKeeper, delta cycle threshold (ΔCt) and RefFinder, were used to evaluate the stability of these genes’ expressions in order to identify the appropriate reference genes. The results showed that UBQ was the optimum reference gene in leaf, leafstalk, root, stolon and corm, IDH exhibited the most stable expression during the expansion of corm, UBQ and PLA were the most stable reference genes in developmental stages of leaf and leafstalk, respectively. Finally, the reliability of reference genes was further confirmed by the normalization of PDS and EXP1 genes under different arrowhead tissues and developmental stages of corm, respectively. This study constitutes important guidance for the selection of reliable reference genes for analyzing the tissue- and developmental-stage-specific expression of genes in arrowheads.


Introduction
Real-time quantitative PCR (RT-qPCR) is an approach that uses fluorescence signals to monitor the total amount of products after each polymerase chain reaction (PCR) cycle [1]. Due to its high accuracy and strong specificity, it has been widely applied to analyze gene expression levels [2,3]. The gene expression analysis by RT-qPCR is affected by many factors, such as RNA quality, cDNA synthesis efficiency, primer specificity, and the stability of reference genes [4,5]. The expression level of the target gene can be calibrated by using stably expressed reference genes as standard, so as to reduce the difference between the target gene expression and the true value [6]. The stability of reference genes exerts a decisive effect; unstable reference genes will lead to erroneous results in gene expression quantification [7]. Selecting appropriate reference genes to standardize the expression of target genes is required to ensure the accuracy and reliability of RT-qPCR results [8,9].
Reference genes are a class of genes that is stably expressed in various tissues and cells, or stable in the same tissues and cells at all stages of plant growth. The expression of these reference genes will not be affected by the variation of external environmental factors and experimental conditions [10,11]. The commonly used reference genes include actin gene (ACT), glyceraldehyde-3-phosphate dehydrogenase gene (GAPDH), tubulin gene (TUB), ubiquitin gene (UBQ), ribosomal RNA encoding gene (18S rRNA) and others [12,13].
Reference genes differ between species, and different experimental conditions also require different reference genes. Therefore, the study on the stability of reference gene expression in the materials from different groups is prerequisite and crucial for gene expression normalization.
Arrowhead (S. trifolia L.) is an herbaceous aquatic plant of the genus Sagittaria in the Alismaceae family. Arrowhead contains arrow-shaped leaves, white flowers, and expanded corms from an underground part. The corms show ovoid or spherical shape, and are the edible organ of arrowheads with high nutritional and medicinal values [14]. It is reported that arrowhead contains a variety of colchicine, thus showing several effects, including clearing heat, reducing swelling, preventing cancer and fighting cancer [15]. The nutrients in arrowhead corms are mainly composed of starch, protein, fat, carbohydrates and fiber [16]. Cultivated arrowheads are mainly distributed in the middle and lower reaches of the Yangtze River, South China and Southwest China, and are a kind of characteristic vegetable favored by consumers due to their abundant nutrition. They are also of great value in enriching the variety of vegetables on the table.
For arrowheads, the research on their molecular biology is lagging far behind other species. The analysis of gene expression is the most fundamental and frequently used technical tool in molecular biology research. However, there is still no consensus on which genes can be applied as reference genes in RT-qPCR experiments of arrowheads. In this study, eight potential genes (ACT5, UBQ, GAPDH, CYP, NAC, IDH, SLEEPER, and PLA) were selected as candidate genes. Then, we tested these genes expression levels in different tissues and three developmental stages of the same tissue (corm, leaf and leafstalk), and then performed the normalization assay and verification to identify stable reference genes for future RT-qPCR analysis.

Plant Materials
In this experiment, the S. trifolia. L. cv. 'Baimati (BMT)' was used as the material, which was planted in the aquatic vegetable experimental field of Yangzhou University. The arrowhead plants with 3-4 leaves were planted in the non-porous plant pot containing soil and maintained a 5 cm water layer on the surface of the soil. Early corm formation occurred at 6-7 weeks after planting. Here, the corm developmental stage was divided into three stages, including stage 1 (c1, 50 days after planting), stage 2 (c2, 70 days after planting) and stage 3 (c3, 90 days after planting). Leaf and leafstalk growth was divided into three stages, namely stage 1 (30 days after planting), stage 2 (50 days after planting) and stage 3 (70 days after planting). Different tissues (leaf, leafstalk, root, stolon, corm) were collected at stage 2 of corm development. The tissues, including corm, leaf and leafstalk, at three developmental stages were also harvested at three stages, respectively ( Figure 1). Obtained samples were stored at −80 • C for subsequent experiment.

RNA Extraction and cDNA Synthesis
RNA was extracted using the Plant Total RNA Extraction Kit (Pudi, Shanghai, China). The concentration and quality of RNA was determined using a One-Drop TM spectrophotometer (Thermo Fisher, Waltham, MA, USA). The OD 260/280 of each RNA sample was 1.8~2.1. Purified RNA was stored in an ultra-low-temperature refrigerator at −80 • C. Then, 1000 ng RNA was used for reverse transcription to generate cDNA, and the obtained cDNA was stored at a −20 • C condition. All experimental procedures were conducted according to the manufacturer's instructions. Three biological replicates were set for each sample.

RNA Extraction and cDNA Synthesis
RNA was extracted using the Plant Total RNA Extraction Kit (Pudi, Shanghai, China). The concentration and quality of RNA was determined using a One-Drop TM spectrophotometer (Thermo Fisher, Waltham, MA, USA). The OD260/280 of each RNA sample was 1.8~2.1. Purified RNA was stored in an ultra-low-temperature refrigerator at −80 °C. Then, 1000 ng RNA was used for reverse transcription to generate cDNA, and the obtained cDNA was stored at a −20 °C condition. All experimental procedures were conducted according to the manufacturer's instructions. Three biological replicates were set for each sample.

Primer Design
The ORF (open reading frame) of eight reference genes (ACT5, UBQ, GAPDH, CYP, NAC, IDH, SLEEPER, and PLA) was retrieved from our transcriptome database and cloned from the 'BMT' cDNA. Specific primers of these genes were designed using Primer 6.0 software (Premier Biosoft, Palo Alto, CA, USA). The parameter for primer design was set as follows: melting temperatures (Tm) of 54-60 °C, GC content of 45~55%, primer length ranging between 18 and 25 bp, an amplicon length range of 100~250 bp. Designed primers for all target genes were synthesized by General Biology corporation (Chuzhou, Anhui, China). The sequences of all primers are listed in Table 1.

Primer's Performance Analysis and RT-qPCR Test
The fragment of candidate reference genes was amplified using designed primers and a cDNA template from arrowhead corms. The amplification reaction system was 20 µL, including 10 µL of Taq PCR SuperMix (TransGen, Beijing, China), 1 µL of forward/reverse primers (10 µM), 2 µL of cDNA and 6 µL of ddH 2 O. The reaction conditions were as follows: pre-denaturation at 94 • C for 5 min, then denaturation at 94 • C for 30 s, annealing at 60 • C for 30 s, extension at 72 • C for 30 s, a total of 35 cycles, and finally 72 • C for 10 min. The amplified products were analyzed by 1% agarose gel electrophoresis to detect the uniqueness of the amplified bands. The amplification efficiency of primers was detected by a qPCR assay using a 10-fold diluted series (10×, 10 2 ×, 10 3 ×, 10 4 × and 10 5 ×) of cDNA of arrowhead corms. The total reaction mixture for qPCR was 10 µL, including 4 µL of cDNA, 5 µL of iTap Universal SYBR Green Supermix (Bio-Rad, Hercules, CA, USA) and 0.5 µL of forward/reverse primer (10 µM). Reaction conditions involved pre-denaturation at 95 • C for 3 min, followed by 39 cycles at 95 • C/10 s and 60 • C/30 s. The amplicon was heated from 65 to 95 • C to obtain the dissociation curve. Three technical repetitions were set for each sample. The cDNA of all arrowhead tissues was used as the template for the analysis of the melting curve. The specificity of primer was further determined by melting curves. In addition, Cq values were used to draw standard curves, and the primer amplification efficiency (E) and correlation coefficient (R 2 ) were analyzed. The computing formula of primer amplification efficiency was % E = (10 [−1/slope] − 1) × 100%. The calculation of amplification efficiency requires that the standard curve has a satisfactory linear relationship (R 2 > 0.99), and the amplification efficiency of all primers meeting the requirements of qPCR detection should be between 90% and 110%.
The expression levels of candidate reference genes in five plant tissues and three developmental stages of the same tissue (corm, leaf and leafstalk) were performed by RT-qPCR analysis, using the CFX96 PCR system (Bio-Rad, Hercules, CA, USA). Three biological replicates, each with three technical replicates, were set for the samples.

Statistical Analysis of Gene Expression Stability
In order to select the reference genes with high expression stability, four statistical algorithms were employed. Firstly, the M value (gene expression stability measurement) was computed using GeNorm to screen stable reference genes [17]. NormFinder software was used to reanalyze and calculate the S value (Normfinder stability) [18]. After evaluation by SD (standard deviation) and CV (coefficient of variation) using BestKeeper, the reference genes were ranked [19,20], and then the SD value was analyzed again by ∆Ct [21]. Finally, RefFinder [22] (available online: http://blooge.cn/RefFinder/, accessed on 15 April 2023) was used to synthetically analyze data from all algorithms. The reliable reference genes in different tissues and developmental stages were screened. All algorithms were used according to the manufacturer's instructions.

Expression Analysis of PDS and EXP1 Genes
PDS (encoding phytoene desaturase) and EXP1 (encoding one expansin) were chosen as target genes to verify the reliability of selected reference genes and their ORF sequences were listed in Figures S1 and S2. PDS was used to test the reliability of reference genes in different tissues, and EXP1 was used to verify the reference gene for corms at three developmental stages. The expression levels of two target genes, PDS and EXP1, were normalized using the top two stable and least stable reference genes. The reaction conditions of RT-qPCR consulting in the procedure are described above. Relative expression levels of PDS and EXP1 were calculated by the 2 −∆∆Ct method. Three biological repetitions and technical repetitions were set for each reaction. The primers for gene expression verification of PDS and EXP1 are shown in Table 2.

Data Analysis
The significance of data was analyzed by one-way ANOVA using Duncan's method. SPSS v20.0 software was used to conduct one-way ANOVA. Values of p < 0.05 were considered statistically significant.

Specificity and Amplification Efficiency of Primers
PCR amplification was used to detect the specificity of primers, while electrophoresis results showed that the fragment size was correct and the band was single ( Figure S3). The cDNA of all arrowhead tissues was used as the template, and melting curves of the selected reference genes were detected by qPCR. Generated melting curves showed a specific single peak, and no other miscellaneous peak was found, providing more evidence for the primer's specificity ( Figure 2). The standard curve method was employed to calculate the amplification efficiency (E) of candidate reference genes. The qPCR efficiency of eight reference genes was between 98% and 109%, and the linear correlation coefficient was above 0.99 (Table S1). The above results suggest that the specificity and amplification efficiency of the designed primers was great; the primers could be used in subsequent experiments.

Expression Levels of Candidate Reference Genes
The expression levels presented as the quantification cycle (Cq) values of eight candidate reference genes in the samples were obtained by RT-qPCR assay. The Cq values of all reference genes are shown in Figure 3 and Table S2. The Cq values of reference genes in samples from different tissues and different developmental stages of the same tissue (corm, leaf and leafstalk) ranged from 25.13 (GAPDH) to 35.26 (SLEEPER), 20.85 (UBQ) to 33.39 (NAC), 28.03 (UBQ) to 35.57 (NAC) and 26.72 (UBQ) to 34.55 (IDH), respectively. Low Cq values represented high expression levels [23]. In different arrowhead tissues, the expression of GAPDH was the highest, while SLEEPER exhibited the lowest expression ( Figure 3a). The expression difference of PLA was the smallest, whereas that of GAPDH was the largest. In corms of three developmental stages, the expression levels of UBQ and GAPDH were significantly higher than those of other selected genes (Figure 3b). The expression difference of UBQ and CYP was the smallest and largest, respectively. During different developmental stages of the leaf and leafstalk, the expression of UBQ was significantly higher than that of other genes, and the expression of IDH showed significant differences among the detected samples (Figure 3c,d). However, it is not accurate to identify the stably expressed reference genes simply based on the evaluation of the original Cq value. Thus, we employed the four following statistical algorithms to screen the most stable reference genes.

GeNorm Analysis
The GeNorm algorithm was used to calculate the M value to evaluate the expression stability of reference genes [24]. The gene with the lowest M value is considered the most stable reference gene, while the gene with the highest M value is considered to be the least stable. Regrettably, this algorithm could not distinguish the M values of the top two stable genes, resulting in two most stable reference genes in the end. The M values of each gene in materials of different groups are shown in Table 3. The GeNorm algorithm required that the M value of the appropriate reference gene be less than 1.

Expression Levels of Candidate Reference Genes
The expression levels presented as the quantification cycle (Cq) values of eight ca pression difference of UBQ and CYP was the smallest and largest, respectively. During different developmental stages of the leaf and leafstalk, the expression of UBQ was significantly higher than that of other genes, and the expression of IDH showed significant differences among the detected samples (Figure 3c,d). However, it is not accurate to identify the stably expressed reference genes simply based on the evaluation of the original Cq value. Thus, we employed the four following statistical algorithms to screen the most stable reference genes.

GeNorm Analysis
The GeNorm algorithm was used to calculate the M value to evaluate the expression stability of reference genes [24]. The gene with the lowest M value is considered the most stable reference gene, while the gene with the highest M value is considered to be the least stable. Regrettably, this algorithm could not distinguish the M values of the top two stable genes, resulting in two most stable reference genes in the end. The M values of each gene in materials of different groups are shown in Table 3. The GeNorm algorithm required that the M value of the appropriate reference gene be less than 1.5, indicating the expression stability of each reference gene in the samples of different groups was satisfactory.

NormFinder Analysis
The NormFinder algorithm was applied to evaluate the expression stability of the eight selected reference genes, the S values of all the reference genes were calculated based on intra-group and inter-group variation [25]. The lower the S values, the more stable the gene expression. The S values of each gene in the materials from different groups are listed in Table 3. In different tissues of arrowheads, the expression of ACT5 was the most stable (S value of 0.22), followed by UBQ (S value of 0.25), and GAPDH was the least stable (S value of 2.95). In corms of different developmental stages, IDH, with the lowest S value (0.11), showed the highest expression stability, followed by ACT5 (S value of 0.13), and CYP was the worst stable reference gene with the highest S value (0.71). UBQ (S value of 0.27 and 0.26) was the optimum reference gene and IDH (S value of 1.04 and 0.74) was the more unstable reference gene in leaves and leafstalks at different developmental stages.

BestKeeper Analysis
The BestKeeper algorithm was used to calculate the standard deviation (SD) and coefficient of variation (CV) to rank the expression stability of eight reference genes [26]. The smaller the SD and CV values, the more stable the gene expression. The SD and CV values of each gene in the materials from different groups are shown in Table 3. BestKeeper algorithm believes that the SD value of the appropriate reference gene should be less than 1, so GAPDH and IDH were excluded from the analysis of different tissues and different developmental stages of leaf and leafstalk in arrowheads, respectively. In different tissues, NAC was the most stable gene (SD value of 0.72, CV value of 2.31), followed by UBQ (SD value of 0.76, CV value of 2.73), and GAPDH was the least stable gene (SD value of 2.99, CV value of 10.24). In corms of different developmental stages, SLEEPER (SD value of 0.29, CV value of 0.94) and UBQ (SD value of 0.30, CV value of 1.40) were judged to be the top two genes in terms of stable expression, and NAC (SD value of 0.67, CV value of 2.06) showed the least stable expression. During the different developmental stages of leaves and leafstalks, CYP (SD value of 0.21 and 0.33, CV value of 0.68 and 1.05) expression was the most stable, while IDH (SD value of 1.00 and 1.08, CV value of 3.03 and 3.24) was, to the contrary, not stable.

∆Ct Analysis
The expression stability of the selected reference genes was also evaluated using the ∆Ct algorithm based on the SD. The gene with the lowest S value is thought to be the most stable [27]. As shown in Table 3, the S values of each gene in the arrowhead samples from different groups were assessed. In different tissues, UBQ showed the lowest S value (1.04) and it was the most stable reference gene, whereas GAPDH (S value of 3.03) was the least stable. The expression of IDH was the most stable (S value of 0.44) in arrowhead corms among the three developmental stages, followed by ACT5 (S value of 0.46), and CYP exhibited the least stable (S value of 0.78) expression. In leaves and leafstalks at different developmental stages, UBQ (S value of 0.69) and PLA (S value of 0.51) showed the most expression stability, while the expression of IDH (S value of 1.15 and 0.81) was the most unstable.

RefFinder Analysis
RefFinder is a comprehensive web-based tool that includes different algorithms from GeNorm, NormFinder, BestKeeper and ∆Ct [22]. Based on the results of all statistic methods, the stability of the arrowhead candidate reference genes was comprehensively evaluated by RefFinder. The comprehensive analysis results are recorded in Table 3.
After comprehensive comparative analysis based on the above four algorithms, we found that UBQ was the most stable gene expressed in different arrowhead tissues, followed by SLEEPER, whereas GAPDH was the least stable. In different developmental stages of corms, IDH showed the most stable expression, followed by UBQ, and CYP obtained the lowest score in expression stability. The stability scores of UBQ and PLA were the highest during the developmental stages of leaves and leafstalks, respectively, whereas the stability score of IDH was the lowest.

Reference Gene Validation
Phytoene desaturase (PDS) is a key enzyme involved in carotenoid synthesis, which also has been proved to function in regulating chlorophyll deposition in plant tissues [28]. PDS is usually highly expressed in plant tissues that appear green. EXP1 (encoding one expansin) is the major factor regulating cell wall extension and plays an important role in plant organ expansion [29]. In order to verify the reliability and accuracy of the selected reference genes, we measured the relative expression of PDS in different tissues and EXP1 in corms from different developmental stages. The results show that the relative expression levels of PDS in green leaf and leafstalk are significantly higher than those in three other tissues when UBQ and SLEEPER were used as the reference genes. When the most unstable reference gene GAPDH was used as the internal standard, PDS was only highly expressed in the leaf (Figure 4a-c). During the corm expansion process, stably expressed IDH and UBQ were employed as the reference gene, and the relative expression level of EXP1 was up-regulated with the development of corms and peaked in stage 3 of corm development. When CYP was used as the reference gene, the EXP1 expression pattern was overestimated at stage 2 of corm development. (Figure 4d-f). The above results demonstrate that our screening for reference genes in varied arrowhead tissues and corms of different development stages was reliable.

Discussion
In recent years, RT-qPCR has become the preferred method in detecting gene expression [30]. Normalization of data by a reference gene with a highly stable expression is a prerequisite in ensuring the reliability of experimental results [31]. In this study, the ex-

Discussion
In recent years, RT-qPCR has become the preferred method in detecting gene expression [30]. Normalization of data by a reference gene with a highly stable expression is a prerequisite in ensuring the reliability of experimental results [31]. In this study, the expression levels of eight candidate reference genes (ACT5, UBQ, GAPDH, CYP, NAC, IDH, SLEEPER and PLA) were determined by using the samples from different tissues and different developmental stages of the same tissue (corm, leaf, leafstalk) in arrowheads. This provided a reference for studying the physiological and biochemical processes of arrowheads at the gene expression level.
Four statistical algorithms, GeNorm, NormFinder, BestKeeper and ∆Ct, were used to evaluate the stability of these selected reference genes [32]. The most stable reference genes screened from several statistical algorithms were different, while the least stable reference gene was highly consistent. In different arrowhead tissues, the combination of UBQ and SLEEPER, ACT5, NAC and UBQ ranked first in GeNorm, NormFinder, BestKeeper and ∆Ct algorithms, respectively, and were considered to be the most stable reference genes. Whereas GAPDH was judged to be the least stable reference gene throughout the analysis of four algorithms. During the expansion of corms, both Normfinder and ∆Ct algorithms considered IDH to be the best reference gene, Genorm and BestKeeper algorithms indicated that the combination of UBQI, PLA and SLEEPER may be the most stable, respectively. CYP and NAC constituted the least stable reference genes based on the analysis of four algorithms. In three developmental stages of leaves and leafstalks, UBQ was considered as the most stably expressed gene by the NormFinder algorithm and ∆Ct algorithm, and the expression of IDH showed the least stability. Since different statistical algorithms were applied, these differences in screening of optimal internal reference genes were acceptable. Finally, the RefFinder algorithm was used to comprehensively compare and rank these reference genes [33], suggesting that UBQ was the best reference gene in different tissues and growth of leaves, and IDH and PLA were the optimum references for gene expression normalization during the growth of corms and leafstalks, respectively. Based on the results that mentioned above, it is concluded that UBQ is the most suitable reference gene in arrowhead tissues.
UBQ and IDH have also been reported as reference genes in other plants. UBQ was verified to be a stably expressed reference gene in different parts and seeds of different developmental stages of Paeonia ostii [34], while a similar report has been found in Dimocarpus longan [35]. IDH has been used as the reference gene under specific conditions in other species, such as different developmental stages of flowers in Pinus massoniana, and various stress conditions in Eucalyptus [36,37]. The reliable reference genes evaluated by GeNorm, NormFinder and ∆Ct in our work were similar to previous findings.
In order to verify the accuracy of the selected optimal reference genes, we detected the expression levels of PDS and EXP1 in different tissues and corms of different developmental stages using different reference genes. PDS was closely linked to the accumulation of chlorophyll and was highly expressed in green plant tissues in general [38]. In Triticum aestivum, the expression of the PDS gene was highest in the leaf and petal, followed by the stem and inflorescence, and lowest in the decolorized root [39]. In Andrographis paniculata, PDS exhibited the highest transcription level in the leaf [38]. In Allium sativum, the PDS gene was highly expressed in green leaves compared to those in roots [40]. Considering our results in arrowheads, when UBQ and SLEEPER genes were applied to normalize gene expression, the PDS gene showed a significantly high expression in leaves and leafstalks, and a low expression in roots, stolons and corms. This result is consistent with the findings in previous research. However, the expression pattern of PDS normalized by the GAPDH gene presented an inconsistent result. EXP1 encoded a class of expansins, which contributed to cell expansion during the development of plant tissue [41]. In Psidium guajava, the expression of EXP1 increased with fruit ripening and reached the highest level at the ripening stage [42]. In Solanum tuberosum, EXP was closely related to the tuber expansion process, and the EXP gene exhibited high expression in growing tubers [43].
Under the normalization of IDH and UBQ, the expression level of EXP1 during corms development was consistent with that previously described. However, the expression level of EXP1 was overestimated at stage 2 of corm development when the unstable reference gene (CYP) was used to normalize gene expression. The assay of PDS and EXP1 gene expression further confirmed the accuracy of the selected reference genes in arrowheads.

Conclusions
In summary, five statistical algorithms were used to compare and screen suitable reference genes in the samples from different tissues and three developmental stages of the same tissue (corm, leaf and leafstalk) in arrowheads. The results concluded that UBQ was the most stable reference gene in different tissues and developmental stages of leaves, while IDH and PLA were the most stable reference genes during developmental stages of corms and leafstalks, respectively. The accuracy of the selected reference genes was verified by the expression normalization of the PDS gene in different tissues and the EXP1 gene in corms at three developmental stages. The current work will provide a reliable basis for the future research about the tissue-and developmental-stage-specific expression analysis of genes in arrowheads.

Supplementary Materials:
The following supporting information can be downloaded at https: //www.mdpi.com/article/10.3390/genes14071321/s1. Figure S1: The ORF sequence of PDS gene in 'BMT' arrowhead. Figure S2: The ORF sequence of EXP1 gene in 'BMT' arrowhead. Figure S3: PCR amplification of specific primers for selected candidate reference genes. Table S1: Amplification efficiency of candidate reference genes primer pairs.