Variation in Expression of Reference Genes across Life Stages of a Bee, Megachile rotundata

Simple Summary Reference genes are key to normalizing expression data across samples of organisms collected after different treatments are applied, but often, reference genes are not properly validated for this purpose. In this report, we screened several genes for a solitary bee, Megachile rotundata, and identified two (RPS18, and RPL8) with very stable expression levels across all life stages of the bee, and under a variety of environmental conditions, including during and after diapause. These genes should make good reference genes. We also identified other genes with stable expression, even if used only for a limited number of developmental stages. This information is important for future gene expression studies on these bees, but it also demonstrates the importance of validating reference genes in general. Abstract The alfalfa leafcutting bee, Megachile rotundata is widely used in the western United States as a pollinator for alfalfa seed production. Unfortunately, immatures experience high mortality in agriculturally managed populations. Quantified gene expression could be used to identify how this bee responds during different life stages to pathogens, environmental toxins, and other stresses, but stably expressed reference genes are needed to normalize transcription data. We evaluated twelve candidate genes for their transcription stability across different life stages, including during and after diapause. RPS18 and RPL8 were the two most stably expressed genes, followed by RPS5 and RPL27A. These genes were also very stable even during and after diapause, while the most variable genes being APN, PMIIM, NPC2, and Cr-PII had increased expression levels during larval growth and were also variable during and after diapause. The four reference genes we identified in M. rotundata may prove useful for transcriptomic studies in other bees as well, such as honey bees.


Introduction
In North America, the alfalfa leafcutting bee, Megachile rotundata, is a critical pollinator for alfalfa seed production [1,2]. However, bee health is frequently threatened by outside environmental elements, including exposure to pathogens, pesticides, and unfavorable weather conditions [3][4][5]. Gene expression analyses provide insights into the physiological mechanisms by which bees deal with these stresses [6,7]. Several immune response genes, detoxification genes and some signaling pathway systems have been identified for M. rotundata [8]; however, gene expression levels during different developmental stages are largely unknown.
Northern blots, microarrays, and real-time, quantitative, reverse transcription polymerase chain reactions (RT-qPCR) were all used to quantify gene expression in living organisms. These methods can be very sensitive, especially RT-qPCR, but they typically require internal gene expression standards if expression levels are compared across treatments and experiments [9,10]. Typically, the reference genes used as internal standards are associated with proteins involved in basic, ubiquitous cellular functions, as these are most likely to have consistent levels of expression under most conditions. Moderate transcript abundance and low variation in abundance in different individuals and life stages are important criteria for reference genes [11]. The 18S and 28S rRNAs were thought to be good reference genes for most animals [11,12], since they are abundant and have uniform expression levels in different tissues. However, a major drawback to using rRNA as a normalizer in RT-qPCR reactions is its lack of poly-A, which precludes its isolation in oligo(dT) based procedures [12]. Thus, we focused on genes transcribed by RNA polymerase II whose transcripts can be isolated using oligo(dT), such as the genes associated with glyceraldehyde 3-phosphate dehydrogenase (GAPDH), beta-actin (ACT), and ribosomal proteins (e.g., RPS5, RPS18, and RPL8). These genes are commonly used in insect studies [13][14][15], but their stability across different tissues, life stages, or environmental conditions is not consistent among species [16][17][18].
When it comes to bees, both RPS18 and RPL13A have been reported as suitable reference genes in the honey bee, Apis mellifera [15]. However, no reference genes have been validated for any bee as having uniform expression across different life stages. As with other insects, diapause is a special state where bees arrest development to adapt to extreme environmental conditions. It represents a distinct suite of developmental, behavioral, physiological, and biochemical attributes. During diapause, bee development is interrupted, and metabolic activity is suppressed; thus, diapause could result in dramatic changes in gene expression, as compared with other life stages of the bee [19][20][21]. Transcription analyses of the diapause to post-diapause transition identified 643 genes being upregulated and 242 genes being down-regulated in M. rotundata [19]. Other genes associated with metabolism, such as fatty acid biosynthesis, metal-binding juvenile hormone, innate immunity, and cold tolerance, were also substantially changed in diapausing Drosophila montana [21] and Culex pipiens [20].
Megachile rotundata overwinters as fifth (ultimate) instar larvae, but voltinism in this insect is variable. In any given location, and in any given year, part of the population will be univoltine, and part of the population will be bivoltine. The first eggs laid in a season complete development to the fifth instar mid-summer. After the fifth instar larvae have entirely consumed all the food provisions in their nest, they spin a cocoon. Larvae in the cocoon stage are referred to as prepupae. A portion of these will avert diapause, going on to pupate and complete development, making up the bivoltine segment of the population. The rest of the prepupae will go into diapause and overwinter, not pupating until the next spring. This life cycle provided an opportunity to compare gene expression for the same life stage, but in insects that went into diapause, and insects that averted diapause. Thus, we were able to compare the effects of having a diapause versus not having a diapause. We hypothesized that if genes are stably expressed before and after diapause as well as in diapausing and non-diapausing bees, they probably are also very stable across all other life stages. To evaluate the hypothesis, we selected 12 candidate reference genes utilizing information from a M. rotundata cDNA library (Table 1). Previously, we found expression of these genes was relatively stable for healthy and pathogen challenged bee larvae when the bees were exposed to different temperature conditions [6,8].

Source of M. rotundata Bees
We collected five individual M. rotundata from each of 11 life stages: eggs; 2nd, 3rd, 4th, and 5th instars; non-diapausing prepupae; diapausing prepupae; post-diapause pupae; pupae from non-diapause pre-pupae; adult males; and adult females. Some life stages were collected directly outdoors: eggs, all larval instars before cocoon formation, pupae from non-diapausing prepupae, adult males, and adult females. Those stages were all collected from leafcutting bee shelters in a farmer's alfalfa seed field near Tremonton, Utah, during July and August 2011. Eggs are very difficult to distinguish from first instars because the latter remain in the egg chorion until they molt to the next instar. Thus, all references to eggs in this study might include first instars. Prepupae collected in the summer could contain a mix of diapausing and non-diapausing insects because no studies have been conducted to determine when diapause first occurs. Thus, for simplicity, we refer to summer collected prepupae as "non-diapausing prepupae." We were able to distinguish pupae that originated from diapausing and non-diapausing prepupae. This is because pupae found in July and August originate from non-diapausing prepupae. Diapausing prepupae do not pupate in the summer, only in the spring, after at least three months exposure to cold temperature [22]. Diapausing prepupae were pur-chased in May as cocoons from a commercial bee supplier (JWM Leafcutter Inc., Nampa, ID, USA). The supplier collected these bees as cocoons the previous fall, then stored them at 4 • C for the winter (a standard beekeeping practice for M. rotundata). Non-diapausing prepupae would have died during cold storage. Thus, any live prepupae in this sample were in diapause. Post-diapause pupae were collected by incubating the diapausing prepupae at 39 • C for 7-10 days until they formed pupae.
Each bee sampled was submerged in RNAlater (Life Technologies, NY, USA), soaked overnight (at 4 • C) using approximately 5x the estimated volume of the bee, and then was stored at −80 • C, still in the RNAlater, until processed (about six months).

RNA Extraction
Total RNA was isolated from individual bees using a PureLink™ Micro-to-Midi Total RNA Purification System (Life Technologies/Ambion, Grand Island, NY, USA). We isolated total RNA from 5 individuals for each life stage. The adult bees were ground individually under liquid nitrogen using a mortar and pestle. The frozen tissue powder was transferred to a 2.0 mL tube precooled with liquid nitrogen, and 1.2 mL RNA lysis solution added to each sample before the tissue thawed. Eggs, larvae, and pupae were individually placed directly into chilled, 2 mL microcentrifuge tubes with RNA lysis solution (0.6 mL for eggs and small larvae, 1.2 mL for pupae and large larvae). Each bee was homogenized independently by crushing it with a small plastic pestle, vigorously shaking it using a vortex mixer, then passing through an 18 gauge needle several times using an RNase-free syringe. Then, an equal volume of 70% ethanol was mixed into the sample.
RNA was recovered from the lysate/ethanol mixture by filtering the sample through the RNA spin cartridge supplied with the kit. The flow-through was discarded, the cartridges were washed with buffer (as directed in the kit), and the total RNA was eluted by adding 30 µL of RNase-free water, incubating 1 min, and then centrifuging for 2 min at 13,000 rpm at room temperature. The elution steps were conducted three times for each spin cartridge, yielding approximately 90 µL of RNA solution. The concentration of total RNA in these samples was measured using a NanoDrop (Thermo Scientific, Inc. Wilmington, DE, USA), and the quality and integrity of total RNA were examined by electrophoresis. We only used samples where the ratios of A260/A280 were in the range of 2.0 to 2.2, and with typical two rRNA bands (28S and 18S), indicating that the total RNA had high quality and integrity [23]. The RNA samples were stored at −80 • C.

cDNA Synthesis
RNA expression levels were quantified using two-step RT-qPCR. First-strand cDNA was synthesized from the total RNA using a SuperScriptR VILO™ cDNA Synthesis Kit (Life Technologies/Invitrogen, Grand Island, NY, USA), as follows: 4 µL 5× VILO™ reaction mix and 2 µL 10× SuperScriptR enzyme were mixed with 2.5 µg total RNA in 14 µL DEPC-treated water, the samples incubated at 25 • C for 10 min followed by a second incubation at 42 • C for 60 min, after which the reactions were terminated by incubation at 85 • C for 5 min. The concentrations of the resulting cDNA samples were measured using a NanoDrop. The cDNA was stored at −20 • C until use.

Primer Design
The second step, amplification of the cDNA, was conducted using primer pairs designed for 12 potential reference genes ( Table 1). The gene sequences were obtained from our M. rotundata EST database (available in Genbank, http://www.ncbi.nlm.nih. gov/nucest/?term=Megachile+rotundata). The primers were designed according to the corresponding open reading frames for each of the conserved protein-coding domains within the candidate reference genes using a modified version of Primer3 software (http://fokker.wi.mit.edu/primer3/input.htm) [24]. Primers were designed using the following criteria: length (20 to 27-mer), melting temperature (60 to 64 • C), GC percentage (40 to 70%), an absence of significant secondary structures, and an amplicon length ranging from 64 to 108 bases (Table 1).

Quantitative Real-Time PCR
All RT-qPCR reactions were performed using Fast SYBR Green Master Mix kits (Life Technologies/AB Applied Biosystems, Grand Island, NY, USA) with 20 µL reactions consisting of 10 µL SYBR Green Master Mix, 0.4 µL of each primer (8 µM), 1 µL cDNA (400 ng/µL), and 8.2 µL water (Molecular Biology Grade, Thermo Scientific, Logan, UT, USA). Each primer pair was a separate reaction (no multiplex reactions). The PCR program was an initial 3 min denaturation at 95 • C, followed by 40 cycles of 15 s denaturation at 94 • C, 15 s annealing at 60 • C, and 15 s extension at 72 • C. Melting curves for the product were determined by raising the temperature from 55 to 90 • C in sequential steps of 1 • C for 2 s. Melting curves were evaluated to assure a single product was produced in each reaction. Any primer pairs that produced multiple peaks were discarded, and the results are not reported.
Ct values were standardized by subtracting the baseline fluorescence, and then adjusting the output Ct based on the calculated PCR efficiency for each target cDNA using LinRegPCR (version 2012.0, Ruijter) [25]. Two reactions were run for each insect, always being conducted on different dates, and the resulting standardized Ct-values were averaged, yielding our reported Cq values. The replicate was five individually sampled insects for each of the 132 gene and insect stage combinations, for a total of 660 Cq values. We also included a no template cDNA negative control for every primer pair, also with a replicate of 5 conducted similarly to the insects (yielding another 60 Cq values).

Data Analysis
Analysis of variance (ANOVA) was also used to evaluate whether the Cq values for RT-qPCR reactions with insect cDNA differed significantly from the blank controls for each primer pair and insect stage combination using Dunnett's test. Student's t-tests were carried out to compare gene transcripts of non-diapausing vs. diapausing prepupae, and pupae from non-diapausing prepupae vs. post-diapause pupae.
The expression stability across the different bee life stages was evaluated using both BestKeeper and NormFinder [26]. For the ANOVA method, the dependent variable was Cq values, and the main effects were the genes, life stages, and the interaction between these. This allowed for statistical comparison of the impact of the main effects on Cq values. We also ranked the stability based on the product of the average Cq and its standard deviation. Lower SD values in BestKeeper and lower SV values in NormFinder indicated lower variation in abundance in the different life stages, hence, greater stability of expression.

Transcript Abundance Profiles for Different Genes and Life Stages
Using the primer pairs in Table 1, RT-qPCR was used to evaluate gene expression levels of individual M. rotundata for each candidate reference gene. The bees sampled included different life stages collected at different times: eggs, larvae, diapausing and non-diapausing prepupae, and pupae from diapausing and non-diapausing prepupae, as well as female and male adults. Five bees were used for each life stage, and each RT-qPCR run was replicated twice for each bee. All Ct values were normalized to each other using LinRegPCR software to yield Cq values, as recommended by Ruijter et al. [25]. The highest level of transcript abundance among the candidate genes was observed for TarRNA, a nuclear gene encoding an antisense ribosomal RNA as it had the lowest Cq values, around 5-8 ( Figure 1). The other transcripts tested had lower abundances with Cq values ranging from 15-30 ( Figure 1).

Transcript Variation among Different Life Stages
Low variation of transcript level among different life stages is one of the mo portant criteria for determining the best reference gene. The Cq value for the Ta transcript was not significantly different among different larval and pupal stages (F d.f. = 7; p= 0.382) (Figure 2A and Table 2). The transcript abundance was uniform a the different immature life stages, except for the eggs and early instars, where tran abundance was lower ( Figure 2, Table 2). Transcript abundance for the RPL8, RPS18, RPL27A, GRX, GAPDH, and ACT genes showed little variation among th non-diapause pupae, post-diapause pupae, adult male, and adult female stages, bu variable during the larval life stages, as shown by the greater variation in Cq values also showed little variation in Cq values between diapausing prepupae and non-di ing prepupae except for GRX and GAPDH (Figure 2A,B, Table 2 Table 2).
We used two additional statistical analyses, BestKeeper and NormFinder, to f evaluate and compare expression variability for the genes. Using these analys RPS18 transcript had the most uniform abundance across life stages, among all th scripts we evaluated, followed by RPL8 and GAPDH, which had the second to thir est variability in both statistical analyses (Figure 2A, Table 3). The next most stabl the transcripts from the RPS5, GRX, RPL27A, TarRNA, and ACT genes, which r from 4th to 6th place (Figure 2A,B, Table 3). The transcripts from the other candidate APN, PMIIM, NPC2, and Cr-PII ranked as more variable ( Figure 2C, Table 3).

Transcript Variation among Different Life Stages
Low variation of transcript level among different life stages is one of the most important criteria for determining the best reference gene. The Cq value for the TarRNA transcript was not significantly different among different larval and pupal stages (F = 1.11; d.f. = 7; p = 0.382) (Figure 2A and Table 2). The transcript abundance was uniform among the different immature life stages, except for the eggs and early instars, where transcript abundance was lower (Figure 2, Table 2). Transcript abundance for the RPL8, RPS5, RPS18, RPL27A, GRX, GAPDH, and ACT genes showed little variation among the egg, non-diapause pupae, post-diapause pupae, adult male, and adult female stages, but were variable during the larval life stages, as shown by the greater variation in Cq values. They also showed little variation in Cq values between diapausing prepupae and non-diapausing prepupae except for GRX and GAPDH (Figure 2A,B, Table 2 Table 2).
We used two additional statistical analyses, BestKeeper and NormFinder, to further evaluate and compare expression variability for the genes. Using these analyses, the RPS18 transcript had the most uniform abundance across life stages, among all the transcripts we evaluated, followed by RPL8 and GAPDH, which had the second to third lowest variability in both statistical analyses (Figure 2A, Table 3). The next most stable were the transcripts from the RPS5, GRX, RPL27A, TarRNA, and ACT genes, which ranked from 4th to 6th place (Figure 2A,B, Table 3). The transcripts from the other candidate genes APN, PMIIM, NPC2, and Cr-PII ranked as more variable ( Figure 2C, Table 3).

Stability of Gene Expression in Diapausing and Non-Diapausing Bees
Diapause is a common strategy for bees to overcome adverse environmental conditions. Thus, it may involve various gene expression changes, primarily decreased expression of metabolism and energy related genes and increased expression of cold resistance genes [19,20]. To test our hypothesis that genes stably expressed before and after diapause will also be stably expressed in all other life stages, we compared the twelve candidate reference genes between pupae from diapausing and non-diapausing prepupae, as well as between diapausing prepupae and non-diapausing prepupae (Figure 3). Only the PMIIM gene had significantly different expression levels between post-diapause pupae and pupae from non-diapause prepupae (t = 2.424; d.f. = 8; p = 0.042) ( Figure 3A), while all other genes appeared to be stably expressed. For diapausing vs. non-diapausing prepupae, we found that expression varied significantly for APN (t = 3.504; d.f. = 8; p = 0.008), GAPDH (t = 2.523; d.f. = 8; p = 0.036), and GRX (t = 2.329; d.f. = 8; p = 0.048). In contrast, all other genes were expressed at similar levels in both diapausing prepupae and non-diapausing prepupae ( Figure 3B).

Discussion
We found that genes stably expressed during diapause are generally stable across most or all development stages in M. rotundata. Not surprisingly, it is easier to find stable expression among various larval stages, or among adult insects, than across all life stages. Thus, one must remain cautious when selecting reference genes if the study includes insects across different life stages.
Ribosomal protein genes were the least variable between diapausing and non-diapausing insects [21]. Therefore, these genes tend to be useful insect reference genes [27]. Among the four ribosomal protein genes we tested, RPL8, RPS5, and RPS18 proved to be better reference genes for M. rotundata than RPL27A, based on their lower variation across different life stages. We found RPS18 to be the best among all twelve of the candidate genes we tested. These results corroborate the use of RPS18 as a good reference gene for

Discussion
We found that genes stably expressed during diapause are generally stable across most or all development stages in M. rotundata. Not surprisingly, it is easier to find stable expression among various larval stages, or among adult insects, than across all life stages. Thus, one must remain cautious when selecting reference genes if the study includes insects across different life stages.
Ribosomal protein genes were the least variable between diapausing and non-diapausing insects [21]. Therefore, these genes tend to be useful insect reference genes [27]. Among the four ribosomal protein genes we tested, RPL8, RPS5, and RPS18 proved to be better reference genes for M. rotundata than RPL27A, based on their lower variation across different life stages. We found RPS18 to be the best among all twelve of the candidate genes we tested. These results corroborate the use of RPS18 as a good reference gene for the honey bee [15], the Colorado potato beetle, the red flour beetle [16], the planthopper Delphacodes kuscheli [14], and the bed bug Cimex lectularius [28].
TarRNA is the transcribed antisense to the rRNA genes in the nuclear rDNA repeat region and was first identified in the yeast Saccharomyces cerevisiae [29]. Of all the genes we tested in M. rotundata, TarRNA had the most abundant transcript and had a stable expression level between diapause and non-diapause, and it was also fairly stable among the different life stages. The gene was ranked in fifth place. It is an acceptable reference gene if one is looking for something with high transcript abundance. However, we do not recommend it as one of the best reference genes due to its high abundance. It may not be a good normalizer for genes with low expression. We also found ACT to be an acceptable reference gene for M. rotundata. Although the transcript abundance for ACT was lower than for TarRNA, it was more variable across the different life stages than for both TarRNA and the ribosomal genes we tested. ACT has been extensively used as a reference gene in other insects including the honey bee, but its usefulness varied [14,15,[30][31][32][33][34][35]. In particular, ACT was not a good reference gene in the silkworm or beet armyworm because transcript levels varied too greatly across different developmental stages [34].

Conclusions
We identified four genes for M. rotundata that are stably expressed across all life stages of the bee, and even during diapause. These genes may also prove useful as reference genes for other species of bees. Our top recommendations on reference genes for M. rotundata are RPS18 and RPL8, as these had the two most stable expression levels. RPS5 and RPL27A also are good reference genes, although expression levels are less stable than the top two genes. Although TarRNA was stable across most life stages, we do not recommend it because it had the most abundant transcript levels across all life stages. High abundance can make it difficult to normalize the baseline value in RT-qPCR data analysis accurately and may not be a good normalizer for weakly expressed genes. We found the abundance of the APN, PMIIM, NPC2, and Cr-PII gene transcripts were too variable among the life stages and do not recommend these be used if treatments include different life stages.
Author Contributions: All authors were involved in planning, data interpretation, and writing the manuscript. In addition, J.X. collected samples of different stage bees, isolated RNA, constructed the cDNA library, and performed RT-qPCR and data analysis; and R.R.J. conceived the idea and provided conceptual oversight. All authors have read and agreed to the published version of the manuscript.