Transcriptome-Based Evaluation of Optimal Reference Genes for Quantitative Real-Time PCR in Yak Stomach throughout the Growth Cycle

Simple Summary The stomach is one of the primary sites for the digestion and absorption of nutrients. Quantifying related gene expression patterns using quantitative real-time PCR (RT-qPCR) is conducive to further understanding the molecular mechanisms underlying nutrition metabolism in the yak stomach. The authenticity of RT-qPCR data is affected by the selection of reference genes. Unfortunately, no studies have demonstrated suitable reference genes for the normalization of RT-qPCR data in the yak stomach. In this study, 15 candidate reference genes (CRGs) were identified according to transcriptome sequencing (RNA-seq) results and the previous literature. Five algorithms were used to evaluate the stability of the CRGs across the entire developmental stage in the yak stomach. RPS15, MRPL39, and RPS23 were found to be the most stable genes in the yak stomach from birth to adulthood. This study indicates the appropriate reference genes for gene expression analysis via RT-qPCR in the yak stomach. Abstract Efficient nutritional assimilation and energy metabolism in the stomachs of yaks contribute to their adaption to harsh environments. Accurate gene expression profile analysis will help further reveal the molecular mechanism of nutrient and energy metabolism in the yak stomach. RT-qPCR is regarded as an accurate and dependable method for analyzing gene expression. The selection of reference genes is essential to obtain meaningful RT-qPCR results, especially in longitudinal gene expression studies of tissues and organs. Our objective was to select and validate optimal reference genes from across the transcriptome as internal controls for longitudinal gene expression studies in the yak stomach. In this study, 15 candidate reference genes (CRGs) were determined according to transcriptome sequencing (RNA-seq) results and the previous literature. The expression levels of these 15 CRGs were quantified using RT-qPCR in the yak stomach, including the rumen, reticulum, omasum and abomasum at five stages: 0 days, 20 days, 60 days, 15 months and three years old (adult). Subsequently, the expression stabilities of these 15 CRGs were evaluated via four algorithms: geNorm, NormFinder, BestKeeper and the comparative CT method. Furthermore, RefFinder was employed to obtain a comprehensive ranking of the stability of CRGs. The analysis results indicate that RPS15, MRPL39 and RPS23 are the most stable genes in the yak stomach throughout the growth cycle. In addition, to verify the reliability of the selected CRGs, the relative expression levels of HMGCS2 were quantified via RT-qPCR using the three most stable or the three least stable CRGs. Overall, we recommend combining RPS15, MRPL39 and RPS23 as reference genes for the normalization of RT-qPCR data in the yak stomach throughout the growth cycle.


Introduction
The yak (Bos grunniens), a precious domesticated ruminant, also known as the "boat on the plateau", is mostly found on the Qinghai-Tibetan Plateau and nearby areas at an altitude yaks were raised in Hongyuan County of Sichuan Province and fed with natural lactation and pasture. A total of 15 Maiwa yaks (7 males and 8 females) were selected from the same herd at 5 different growth stages: 0 days (lactating stage), 20 days (lactating stage and starting to graze), 60 days (lactating stage and graze stage), 15 months (graze stage but still lactating) and 3 years old (natural graze stage). For sample collection, three separate yaks of each age were slaughtered. The stomach tissues of the yaks, including rumen, reticulum, omasum and abomasum, were rinsed immediately in 0.1% DEPC water after slaughter and frozen in liquid nitrogen until processing for total RNA extraction.

RNA Extraction and cDNA Synthesis
The total RNA of the rumen, reticulum, omasum and abomasum tissues were extracted using the mirVana miRNA Isolation Kit (Invitrogen, Carlsbad, CA, USA) following the manufacturer's protocol. The purity and concentration of total RNA were confirmed using the NanoDrop2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The integrity of total RNA was assessed using 1% agarose gel electrophoresis. The cDNA was generated from 1000 ng total RNA using the PrimeScript RT reagent Kit with gDNA Eraser (TaKaRa, Dalian, China) in a reaction mixture of 20 µL. The cDNA was stored at −80 • C until required.

Primer Pairs Design
Primers for RT-qPCR were designed using Primer-BlAST with a length of 20 ± 3 bases and amplicon sizes ranging from 100 to 150 bp. The sequences of the CRGs were obtained from NCBI (https://www.ncbi.nlm.nih.gov/ accessed on 25 June 2022). The primer specificity of each CRG was verified using 2% agarose gel electrophoresis and melting curve analysis. To validate the specificity of each primer pair, the products of PCR were purified and sequenced using a 3730 DNA analyzer (ABI, Carlsbad, CA, USA), and the sequencing results were compared with all potential transcript sequences in NCBI using BLAST.

RT-qPCR Assay
All RT-qPCR assays were carried out in triplicate for each sample using the LightCycler 96 System (Roche Diagnostics, Indianapolis, IN, USA). The total volume of each reaction mixture was 20 µL, including 10 µL of TB Green Premix Ex Taq II (TaKaRa, Dalian, China), 2 µL of diluted cDNA, 0.5 µL of each of 10 µM forward and reverse primers and 7 µL of RNase Free dH 2 O. The PCR program was conducted as follows: 95 • C for 30 s (predenaturation), 40 cycles of 95 • C for 5 s and 60 • C for 30 s (quantitative analysis), 95 • C for 5 s and 60 • C for 1 min (melting curves analysis). To determine the correlation coefficient (R 2 ) and amplification efficiency (E) for each primer pair, a five-point standard curve was generated using a five-fold dilution of cDNA. The correlation coefficient (R 2 ) and amplification efficiency (E) of each primer pair were calculated using the LightCycler 96 System. A modified Pfaffl equation was used to determine the relative quantity (RQ) of each gene [23]: C q (calibrator) = C q for the arithmetic mean of all samples at 5 stages, C q (sample) = C q for the sample. The formula for calculating the relative expression level of a target gene is as follows: Relative gene expression = RQ GOI Geomean[RQ REFs ] RQ GOI : the RQ value of the target gene, Geomean[RQ REFS ]: the geometric mean of the RQ value of selected reference genes. The normalization factor (NF) was calculated using the geometric mean of the RQ value of the selected reference genes [23].

Stability Analysis of CRGs
The expression stability of 15 CRGs was evaluated using 4 algorithms: geNorm, NormFinder, BestKeeper and the comparative C T method. In addition, RefFinder (http:// blooge.cn/RefFinder/ accessed on 10 October 2022) was used to synthesize the evaluation results of the above four algorithms to give an overall ranking.

Validation of Optimal Reference Gene Combinations
HMGCS2 is the key rate-limiting enzyme in the ketogenic pathway and plays an important role in the digestion and absorption of nutrients in the stomach. The expression levels of HMGCS2 were quantified using RT-qPCR to validate the selected reference genes. The expression levels of HMGCS2 in the stomach at 5 stages were normalized using the three most stable gene combinations and the three most unstable gene combinations identified from this study. The relative mRNA expression of HMGCS2 was calculated using the 2 −∆∆Ct method. In addition, statistical significance was analyzed using one-way analysis of variance via SPSS 25.0 software (IBM, Armonk, NY, USA). A p value below 0.05 was regarded as statistically significant.

Quality Control of Total RNA
The 260/280 ratio of total RNA for each sample ranged from 1.8 to 2.2, and the purity and concentration were qualified for subsequent experiments (Table S1). The RNA of all samples clearly displayed two prospective bands at 18 s and 28 s without any signs that the products were degraded ( Figure S1). The above results indicate that the RNAs of all samples were equipped for cDNA synthesis.

Selection of CRGs Based on RNA-seq Data and Previous Literature
The criteria for preliminary selection of reference genes were relatively high transcriptome abundance and low expression variation [15]. As a result, preliminary selection comprised genes with relatively high transcriptome abundance (FPKM > 100) as identified by the mean FPKM value and low variability as identified by the coefficient of variation (CV < 20%). A total of 80 CRGs were preliminarily selected using our previous RNA-seq results of the stomach at five stages in fifteen yaks (Table S2). Furthermore, 7 genes (RPS15, RPS23, YWHAZ, RPL13A, ACTB, GAPDH, and RPS9) were considered as CRGs due to their lower CV values, higher FPKM values, and easier primer designs. In addition, eight CRGs were selected based on previous studies. Among these CRGs: UXT, HMBS, MRPS15, PPP1R11, MRPL39 and TBP were validated to be appropriate reference genes for RT-qPCR in yak [13,[24][25][26]. Additionally, DBNDD2 and DDX54 were verified as suitable reference genes for RT-qPCR in the rumen epithelium of cows [27]. In conclusion, 15 genes were chosen as CRGs for further evaluation.

Characteristics of Primer Pairs
The details of primer pairs of 15 CRGs are displayed in Table 1. Primer pairs of amplification efficiency (%) ranged from 91 to 109%, the amplicon's size lay in 100-286 bp and the R 2 of each primer pair was not less than 0.99. The specificity of primer pairs for each gene was verified via 2% agarose gel electrophoresis ( Figure S2) and melting curve analysis ( Figure S3). To further validate the specificity of each primer pair, the products of PCR were purified and sequenced. Then, the sequencing results were compared with all potential transcript sequences in NCBI using BLAST (Table S3).

RT-qPCR Analysis for CRGs
The mean C q values of all tested samples calculated to determine the expression levels of the 15 CRGs are illustrated in Figure 1. C q value had a negative correlation with gene expression level. In other words, higher gene expression levels are associated with lower C q values and vice versa. The C q values of all CRGs ranged from 18.49 to 32.67. For each CRG, the mean and median C q values were relatively close. Among all the CRGs, RPS23 demonstrated the highest expression level, with C q = 20.18 ± 0.76, while PPP1R11 had the lowest expression level, with C q = 30.59 ± 0.89.  The upper whisker caps represent Q3 + 1.5×IQR (where Q3 is the third quartile and IQR is the inter-quartile range, or distance between the first and third quartiles) and the lower whisker caps represent Q1 − 1.5× IQR (where Q1 is the first quartile). The black dots indicate outliers.

Evaluation of Expression Stability for CRGs
In this study, four algorithms: geNorm, NormFinder, BestKeeper and the comparative CT method were used to evaluate CRGs for stability ranking. The stability rankings obtained from the four algorithms were different. Thus, RefFinder was employed to obtain a total score that was used to rank the stability of the 15 CRGs ( Table 2).
The M-value was calculated via geNorm analysis to identify gene expression stability. Then, the M-value was used to rank the stability of expression for the 15 CRGs, and the M-value was negatively correlated with the stability of gene expression. According to the geNorm method, the results show that RPS15 and DBNDD2 were the most stable CRGs with the lowest M-value of 0.48, while RPL13A was the least stable gene with the highest M-value of 0.74 in the yak stomach throughout the growth cycle. The upper whisker caps represent Q3 + 1.5 × IQR (where Q3 is the third quartile and IQR is the inter-quartile range, or distance between the first and third quartiles) and the lower whisker caps represent Q1 − 1.5× IQR (where Q1 is the first quartile). The black dots indicate outliers.

Evaluation of Expression Stability for CRGs
In this study, four algorithms: geNorm, NormFinder, BestKeeper and the comparative C T method were used to evaluate CRGs for stability ranking. The stability rankings obtained from the four algorithms were different. Thus, RefFinder was employed to obtain a total score that was used to rank the stability of the 15 CRGs ( Table 2).
The M-value was calculated via geNorm analysis to identify gene expression stability. Then, the M-value was used to rank the stability of expression for the 15 CRGs, and the M-value was negatively correlated with the stability of gene expression. According to the geNorm method, the results show that RPS15 and DBNDD2 were the most stable CRGs with the lowest M-value of 0.48, while RPL13A was the least stable gene with the highest M-value of 0.74 in the yak stomach throughout the growth cycle.
The NormFinder algorithm was used to calculate the stability value (SV) to identify the ranking of the CRGs, with the most stable gene showing the lowest SV. For the yak stomach throughout the growth cycle, the most stable gene was RPS15 with the lowest SV of 0.35, and RPL13A was the most unstable gene with the highest SV of 0.62.
The BestKeeper and the comparative C T method regard standard deviation (SD) as one of the criteria to evaluate the stability of gene expression.. The lower the SD value, the more stable the gene expression. Based on BestKeeper analysis, RPS15 was the most stable gene, whereas YWHAZ was the least stable gene with the highest SD value. By contrast, according to the comparative C T method, MRPL39 had the highest stability, and the YWHAZ was the most unstable gene. Based on the results obtained using these four algorithms, RefFinder was used for comprehensive ranking. As a result, the comprehensive rankings according to stability from the highest to the lowest are

Optimal Number of Reference Genes
The pairwise variation values (V) were calculated using geNorm software, which is a valid tool to identify the optimal number of reference genes for RT-qPCR. Vandesompele et al. [16] proposed taking 0.15 as a cut-off value below which the inclusion of additional reference genes is not necessary. Thus, according to the cut-off value (V = 0.15), the results indicate that the combination of three genes was the optimal number for normalization of RT-qPCR data in the yak stomach throughout the growth cycle ( Figure 2A). Furthermore, low pairwise variation values correspond to a high correlation coefficient [16]. Clearly, there is no need to include an additional gene when using the three most stable reference genes for calculating the NF ( Figure 2C). In contrast, it is essential to have more than an additional gene when using the two most stable reference genes for calculation of NF ( Figure 2B). Thus, we recommend the combination of the three most stable genes (RPS15, MRPL39, and RPS23) to normalize RT-qPCR data in the yak stomach throughout the growth cycle.

Validation of the Combination of CRGs
To verify the effect of the combination of RPS15, MRPL39 and RPS23 for the normalization of RT-qPCR data, the expression of HMGCS2 was quantified via RT-qPCR in yak stomach at 5 stages (0 d, 20 d, 60 d, 15 m and adult). Moreover, the expression patterns of HMGCS2 in yak stomach at 5 stages were also identified using the FPKM of RNA-seq results. The results show a correspondence between the RT-qPCR and RNA-seq, indicating the RT-qPCR data of HMGCS2 using the RPS15, MRPL39 and RPS23 for normalization were reliable (Figure 3).  The y-axis indicates Vn/Vn + 1 between the calculation of the NFn using the mo stable reference genes and the NFn + 1 using an addition of the next most stable reference gene The gray line depicts the cut-off value (V = 0.15). (B,C) NFn/n+1 scatterplots before and after the a dition of (n + 1) reference gene (x-axis and y-axis). The NFn was calculated using the geometr mean of the RQ data of n CRGs. r: Spearman rank correlation coefficient.

Validation of the Combination of CRGs
To verify the effect of the combination of RPS15, MRPL39 and RPS23 for the no malization of RT-qPCR data, the expression of HMGCS2 was quantified via RT-qPCR yak stomach at 5 stages (0 d, 20 d, 60 d, 15 m and adult). Moreover, the expression pa terns of HMGCS2 in yak stomach at 5 stages were also identified using the FPKM RNA-seq results. The results show a correspondence between the RT-qPCR an RNA-seq, indicating the RT-qPCR data of HMGCS2 using the RPS15, MRPL39 and RPS2 for normalization were reliable (Figure 3). CRGs. The y-axis indicates Vn/Vn + 1 between the calculation of the NFn using the most stable reference genes and the NFn + 1 using an addition of the next most stable reference genes. The gray line depicts the cut-off value (V = 0.15). (B,C) NF n/n+1 scatterplots before and after the addition of (n + 1) reference gene (x-axis and y-axis). The NF n was calculated using the geometric mean of the RQ data of n CRGs. r: Spearman rank correlation coefficient. To further validate the selection of CRGs, the three most stable CRGs (RPS15, MRPL39, and RPS23) and the three least stable CRGs (RPS9, UXT and YWHAZ) were used to normalize the expression of HMGCS2. As shown in Figure 4A,B, the expression patterns of HMGCS2 in the rumen, reticulum, omasum and abomasum at five stages (0 d, 20 d, 60 d, 15 m and adult) were similarly obtained using FPKM based on RNA-seq results and the combination of three most stable CRGs (RPS15, MRPL39, and RPS23) for normalization. Furthermore, the expression of HMGCS2 in the rumen, reticulum and omasum were the lowest at 0 d and the highest at adulthood, while the opposite was true in the abomasum. However, compared with the expression of HMGCS2 based on RNA-seq results ( Figure 4A), normalization of HMGCS2 expression using the three least stable CRGs (RPS9, UXT and YWHAZ) demonstrated significant differences ( Figure 4C). Hence, it is essential to select suitable reference gene combinations to normalize the expression of target genes. To further validate the selection of CRGs, the three most stable CRGs (RPS15, MRPL39, and RPS23) and the three least stable CRGs (RPS9, UXT and YWHAZ) were used to normalize the expression of HMGCS2. As shown in Figure 4A,B, the expression patterns of HMGCS2 in the rumen, reticulum, omasum and abomasum at five stages (0 d, 20 d, 60 d, 15 m and adult) were similarly obtained using FPKM based on RNA-seq results and the combination of three most stable CRGs (RPS15, MRPL39, and RPS23) for normalization. Furthermore, the expression of HMGCS2 in the rumen, reticulum and omasum were the lowest at 0 d and the highest at adulthood, while the opposite was true in the abomasum. However, compared with the expression of HMGCS2 based on RNA-seq results ( Figure 4A), normalization of HMGCS2 expression using the three least stable CRGs (RPS9, UXT and YWHAZ) demonstrated significant differences ( Figure 4C). Hence, it is essential to select suitable reference gene combinations to normalize the expression of target genes. Animals 2023, 13, x FOR PEER REVIEW 10 of 1

Discussion
Gene expression analysis via RT-qPCR is a dependable and extensively used method to reveal the molecular mechanism of digestion and absorption of nutrients in the stomach. The use of reference genes is the most credible strategy for taking into account the initial concentration of RNA, sample loss during experimentation, the efficiency of cDNA synthesis, and so on [28]. However, the selection of inappropriate reference genes also affects the authenticity of RT-qPCR data [29]. Therefore, selecting suitable reference genes is essential to obtain meaningful RT-qPCR results. Until now, strategies for identifying reference genes from the transcriptome have been widely used. Reference genes selected from the transcriptome increase the reproducibility and sensitivity of results, give a stronger correlation between protein expression levels, and have better detection and coverage [30]. Although RNA-seq screening has many merits in predicting reference genes, this strategy is not absolutely trustworthy and needs further validation via RT-qPCR [15,31]. In this study, 15 CRGs were determined via RNA-seq and the previous literature, and further verified using RT-qPCR.
In this study, geNorm, NormFinder, BestKeeper and the comparative C T method were employed to assess the stability of 15 CRGs. Ribosomal protein S15 (RPS15) is a component of the 40S ribosomal subunit and functions as a nuclear export factor [32]. Although different algorithms were used for the stability ranking of 15CRGs, RPS15 had the best stability in all the algorithms except the comparative C T method (geNorm, NormFinder and BestKeeper) ( Table 2). Furthermore, RPS15 was also the most stable gene in the comprehensive ranking of the results of the four algorithms using RefFinder. This is consistent with Bionaz et al. [28] finding that RPS15 is one of the best reference genes used for the normalization of gene expression data in the bovine mammary gland during the lactation cycle. Tyrosine 3 monooxygenase/tryptophan 5-monooxygenase activation protein zeta (YWHAZ), belonging to the 14-3-3 protein family, participates in various cell activities including cell growth, cell cycle, apoptosis, and so on [33,34]. Some studies have demonstrated the consideration of YWHAZ as an appropriate reference gene due to its high stability in cattle [35], buffaloes [36] and yak [25]. By comparison, it seems that YWHAZ was the least stable gene in our study (Table 2). Even so, the M value of YWHAZ (0.73) derived from the geNorm analysis is well below the threshold (M = 1.5) proposed by Vandesompele et al. [16], suggesting that it is also a relatively stable gene in the yak stomach. These results indicate that the stability of reference genes is highly specific and should be evaluated for a given experimental context.
Although RPS15 had the highest stability in our evaluation, we still do not recommend using it alone as the reference gene for the normalization of RT-qPCR data in the yak stomach. Many studies show that using a single gene as the reference gene should be avoided [16,18,28]. It has been reported that using a single reference gene results in significant bias [37]. To date, no specific theory prescribes a certain number of reference genes to be used. Use of geNorm can provide the optimal number of reference genes needed to eliminate the majority of technical variation [11]. Accuracy and practicality are trade-offs when determining the optimal number of reference genes. It is an unnecessary waste of resources to use more reference genes if the inclusion of additional genes has no significant effect on NF [16]. In our study, there was no significant change between the NF calculated with the three most stable CRGs and that calculated with the four most stable CRGs, indicating that it was superfluous to add a gene for normalization ( Figure 2C). In addition, the digestive tract of the yak has three developmental stages: pre-rumination (0-20 days), transition from pre-rumination to rumination (20-60 days), and rumination (after 60 days). The diets of yaks are different at different developmental stages. Therefore, we evaluated the stability of these genes in yak stomach tissue over five developmental stages. Our results support the use of these reference genes in the normalization of RT-qPCR data under different dietary conditions. As a result, we recommend using the combination of three most stable genes (RPS15, MRPL39, and RPS23) to calculate the NF for normalization of RT-qPCR data in the yak stomach throughout the growth cycle.
The expression profiles of HMGCS2 were quantified via RT-qPCR in the yak stomach at five developmental stages, and its expression levels were normalized by the selected combination of reference genes. HMGCS2 is the key rate-limiting enzyme in the ketogenic pathway and induces the biosynthesis of HMG-CoA, which is the central metabolite of rumen epithelial cells [38,39]. The ketogenic capacity of ruminal epithelium in ruminants increases with age, and newborn ruminants have no ketogenic capacity [40]. Thus, we hypothesized that the expression level of HMGCS2 in the rumen should increase in terms of age, as well as those in the reticulum and omasum because they serve analogous functions to the rumen. In this paper, the expression level of HMGCS2 did increase with age, as determined using RNA-seq and RT-qPCR in the rumen, reticulum and omasum ( Figure 3A-C). For newborn ruminants, the rumen was not fully developed and ingested colostrum is instead digested in the abomasum [8]. Consequently, the expression level of HMGCS2 in the abomasum should be highest at birth and lowest in adulthood ( Figure 3D).
In addition, although target genes with significant expression changes can be identified using less stable reference genes, target genes with imperceptible expression changes can only be detected using the best reference genes [20,37]. Our results confirm that significant changes in the expression of HMGCS2 between birth and adulthood could be identified using either the three most stable CRGs (RPS15, MRPL39, and RPS23) or the three least stable CRGs (RPS9, UXT and YWHAZ). However, when the change in HMGCS2 expression is slight, errors may occur when using RPS9, UXT and YWHAZ for normalization. For example, in the rumen, there were no significant differences in HMGCS2 expression levels between 60 days and 15 months either based on the results of RNA-seq or using RPS15, MRPL39, and RPS23 for normalization, whereas its expression levels normalized using RPS9, UXT and YWHAZ had significant differences (p < 0.05) between 60 days and 15 months (Figure 4). These results imply that using suitable reference genes is essential for accurate normalization of target gene expression.

Conclusions
In this study, 15 CRGs were selected using transcriptome sequencing results and the previous literature, and their expression stability was evaluated using five algorithms. Therefore, we recommend the combination of the three most stable genes RPS15, MRPL39, and RPS23 as reference genes for the normalization of RT-qPCR data in the yak stomach throughout the growth cycle.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ani13050925/s1, Figure S1: Agarose gel electrophoresis for all tested RNA samples; Figure S2: Agarose gel electrophoresis for 15 candidate reference genes; Figure  S3: Melting curves of 15 candidate reference gen; Table S1: The purity and concentration of RNA samples used in this study; Table S2: The stably expressed genes identified in the yak stomach via RNA-sequencing; Table S3: Sequence alignment results in NCBI for 15 candidate reference genes.