Screening of Differentially Expressed Microsporidia Genes from Nosema ceranae Infected Honey Bees by Suppression Subtractive Hybridization

The microsporidium Nosema ceranae is a high prevalent parasite of the European honey bee (Apis mellifera). This parasite is spreading across the world into its novel host. The developmental process, and some mechanisms of N. ceranae-infected honey bees, has been studied thoroughly; however, few studies have been carried out in the mechanism of gene expression in N. ceranae during the infection process. We therefore performed the suppressive subtractive hybridization (SSH) approach to investigate the candidate genes of N. ceranae during its infection process. All 96 clones of infected (forward) and non-infected (reverse) library were dipped onto the membrane for hybridization. A total of 112 differentially expressed sequence tags (ESTs) had been sequenced. For the host responses, 20% of ESTs (13 ESTs, 10 genes, and 1 non-coding RNA) from the forward library and 93.6% of ESTs (44 ESTs, 28 genes) from the reverse library were identified as differentially expressed genes (DEGs) of the hosts. A high percentage of DEGs involved in catalytic activity and metabolic processes revealed that the host gene expression change after N. ceranae infection might lead to an unbalance of physiological mechanism. Among the ESTs from the forward library, 75.4% ESTs (49 ESTs belonged to 24 genes) were identified as N. ceranae genes. Out of 24 N. ceranae genes, nine DEGs were subject to real-time quantitative reverse transcription PCR (real-time qRT-PCR) for validation. The results indicated that these genes were highly expressed during N. ceranae infection. Among nine N. ceranae genes, one N. ceranae gene (AAJ76_1600052943) showed the highest expression level after infection. These identified differentially expressed genes from this SSH could provide information about the pathological effects of N. ceranae. Validation of nine up-regulated N. ceranae genes reveal high potential for the detection of early nosemosis in the field and provide insight for further applications.


Introduction
Honeybee (Apis mellifera) is essential pollinator of many crop plants in natural ecosystems and agricultural crops; it contributes more than $14 billion to agriculture annually [1][2][3][4]. However, the American apiculture industry experienced catastrophic losses of unknown origin since 2006. The decline and disappearance of bee species in the natural environment and the collapse of honeybee colonies were defined as colony collapse disorder (CCD) [1]. The consequences of this syndrome are

Experimental infection
The experimental infection was modified from Higes et al. (2007), briefly described as below, frames of sealed brood obtained from a healthy colony of A. mellifera (Nosema free confirmed by PCR, weakly) were kept in an incubator at 32 ± 2℃ and 95% relative humidity to provide newly emerged Nosema free honeybees. Three days after eclosion, the bees were starved for 2 h and 100 bees per group were each fed with 5 μl of 50% sucrose solution containing 1.25×10 5 spores of N. ceranae by a droplet with the spore solution at the tip of a micropipette until it had consumed the entire droplet [26]. Control honeybees were fed with 5 μl of 50% sucrose solution. The honey bees were reared in an incubator at 32 ± 2℃ and 95% relative humidity and the survival individuals were recorded daily for 21 days. The Kaplan-Meier survival curve was generated by SPSS software. For the sample collections, 10 bees from each cage were collected at 7, 14 and 21 days post infection (dpi) and their ventriculi processed for microscopic examination. Spores were counted and spore production was recorded every week until 3 weeks pi. Control bees were also sacrificed in day 7, 14 and 21 dpi and analyzed to confirm the absence of spores in ventriculus.

Detection of infection
DNA was extracted from the midgut tissues of control and infected honey bees by using the following procedure modified from Tsai et al., 2002 [27]. In brief, 10 midgut tissues of control and infected honey bees were homogenized in TE buffer (0.1 M Tris, 0.01 M EDTA, pH 9.0) with a pestle. The macerated solution was centrifuged and the supernatant was discarded. The pellets were incubated with RNase A at 37℃ for 1 hr and proteinase K at 56℃ overnight followed by DNA extraction with an equal volume of phenol-chloroform-isoamyl alcohol (25:24:1), and DNA precipitation with 1/10 volume of 3M sodium acetate and 2 times volume of pure ethanol. The DNA pellet was air dried and then dissolved in water. The SSU rDNA fragment of the microsporidium was amplified using primer set 18f /1537r (Supplementary Table 1). For the internal control, 18S primer set 143F/145R (Lo et al., 1994) was used (Supplementary Table 1). Each 50-μl PCR mix contained 5 μl 10× reaction buffer (Bioman), 4 μl 2.5 mM dNTPs, 0.5 μl 100 mM of each primer, 1μl 1.25 U HiFi Taq polymerase (RBC) and 1 l template DNA. PCR amplifications were performed as follows on an AG9600 Themal Station (Biotronics Corp.): thermal cycler was preheated at 95℃ for 5 min, 35 cycles at 94℃ for 30 sec, 50℃ for 1 min, and 72℃ for 2 min, followed by a 10 min final extension at 72℃ and storage at 20℃. The PCR product was cloned into T&A cloning vector (RBC Bioscience) and commercially sequenced (Genomics Biosci. & Tech. Company).

Construction of subtractive cDNA library
Total RNA was extracted using TRIzol® Reagent (ambion) according to the manufacturer's instructions from 10 of control and infected tissues at 7, 14 and 21 dpi and then pooled together equal quantities of three RNA samples per group to maximize detect differential expression of genes that may show variation between individuals in the exact timing of expression. Single-stranded and double-stranded cDNA were synthesized from the infected and control RNA pools with the SMART TM PCR cDNA Synthesis Kit (Clontech, USA) according to the manufacturer's protocol.
Two subtractive cDNA libraries (forward and reverse subtractive cDNA libraries) from the control and infected honeybees, were constructed by using the PCR-Select cDNA Subtraction kit (Clontech) according to the manufacturer's specifications. Briefly descript below, cDNA from the infected honeybees was used as the tester (forward subtractive cDNA library, fscl.)/driver (reverse subtractive cDNA library, rscl.), and cDNA from the control honeybees as the driver (fscl.)/tester (rscl.). Both tester and driver cDNAs were digested with RsaI to produce shorter bluntended fragments. After digestion with RsaI, the tester cDNA was divided into two portions, each of which was ligated with a different adapter (Adaptor-1 and Adaptor-2R) at 16℃ for overnight. After ligation, each tester cDNA was separately hybridized at 68℃ for 8 h with an excess of driver cDNA after denaturation at 98 ℃ for 90 sec. Then the two hybridized samples were mixed together and hybridized at 68℃ for overnight with excess of denatured driver cDNA. The resulting mixture was added with 200 ml dilution buffer and amplified by two rounds of suppression PCR. Before the primary PCR, the reaction mixture was incubated at 75℃ for 5 min to extend the adaptors. Primary PCR was performed at 94℃ for 30 sec, 66℃ for 30 sec, and 72℃ for 90 sec for 27 cycles in a reaction volume of 25μl. The PCR product was then diluted 10-fold, and 1 μl-diluted product was used as template in the next subsequently nested PCR. The subtracted secondary PCR products were purified with DNA Clean/Extraction Kit (GeneMark) and then ligated into T＆A cloning vector (RBC) followed by transformation into Escherichia coli, DH5α competent cells (RBC) to generate subtracted cDNA libraries. Each step of SSH was tested following the manufacturer's instructions. Total 192 white colonies from forward (96 colonies) and reverse libraries (96 colonies) were checked by colonies PCR. The PCR products were further subjected to dot-blotting method, which described from the manufacturer's protocol, and the potential colonies were identified and commercially sequenced (Genomics Bioscience & Techology). DNA sequencing from the 3' end and 5' end of the cDNA was conducted with M13 forward or reverse primers, respectively, on a high throughput automated sequencer (MJ Research BaseStation and ABI3730, USA) using standard protocols.

DNA sequences analysis
cDNA sequences from the each SSH libraries were sequenced. CodonCode Aligner was used for base-calling (Q > 13), trimming vector sequences, sequences assemblage. All the sequences were filtered for size (less than 100 bp). Putative functions of the unique sequences were discovered by using BLASTn and BLASTp to translate each nucleotide query sequence into all reading frames and then searching for matches in the NCBI non-redundant database. Significant hits (with E value < 10 -10 ) in the NCBI nr database were further submitted to the Gene Ontology (GO) enrichment analysis (http://geneontology.org/page/go-enrichment-analysis) for protein functions. The Gene GO annotations that classify proteins by molecular function, biological process and cellular component. Each unique sequence was tentatively assigned GO classification based on annotation of the single "best hit" match. These data were then used to classify the corresponding genes according to their GO functions.

RT-qPCR validations of microsporidia specific genes
To further validate the differential expressed microsporidia specific genes in N. cereana infected honey bees, microsporidia genes have 2 ≧ ESTs present in the forward library and dot-blotting difference ≧ 3-fold were selected and subjected to quantitative RT-qPCR analysis. Gene specific primer sets for RT-qPCR were designed by Primer Express v3.0 (Supplementary Table 1). Total RNA was extracted using TRIzol® Reagent (ambion) according to the manufacturer's instructions from 5 of control and infected midguts at 7, 14 and 21 dpi The extracted RNA (2μg) was treated with DNaseI (Roche Molecular Biochemicals), recovered by a phenol/chloroform / isoamylalcohol extraction and precipitated with ethanol. The RNA was then treated with DNase I (Invitrogen Life Technologies) following the manufacturer's instructions to reduce genomic DNA contamination before RT-qPCR. The DNase I-treated total RNA samples were conducted to the reverse-transcription by GScript RTase kit (GeneDireX, US) following the manufacturer's instructions. The reaction mixtures at 42°C for 1.5 h, were incubated at 42°C for 1.5 h and then the reaction was stopped at 70°C for 15 min. Realtime qPCR was performed by using Thermo Scientific Verso SYBR Green 1-step qRT-PCR ROX Mix kit (Thermo-Fisher) in a 96-well Bio-Rad CFX96 Real-Time PCR System (Bio-Rad). All samples were performed in triplicate. The relative gene expression levels were calculated by 2 -△△Ct method [28].

Microsporidian infection
The midgut tissues of control and infected honey bees were homogenized and mature spores in the macerated solution was counted under microscopy. The formation of mature spores was observed after 14 to 21 dpi, moreover, the spore number increased from 14 dpi. to 21 dpi ( Figure 1A and B). The DNA of honeybee midgets was extracted for confirming the infection of N. ceranae by PCR with 18f and 1537r primer sets (microsporidia degenerated primer set). The results of PCR revealed that only the infected honeybees were detected ( Figure 1B). The infection of N. ceranae was detected at 7 dpi though the mature spores were only observed at 14 dpi. (Figure 1). This result was similar to the previous report that different parasite stages of N. ceranae should present in the midgut epithelial cells after the experimental infection of N. ceranae with honey bees at 6-7 dpi [11], indicated the honey bees were under N. ceranae infection. It has been described that the pathological effects of N. ceranae infected A. mellifera resulted in the reduction of the lifespan [11,21]. From our survival rate analysis, adult honey bees infected with N. ceranae showed lower survival possibility and survival time than those of uninfected honey bees ( Figure 1D); this data proved a consistent negative effect of the N. ceranae to adult bees.

Screening of the subtracted cDNA libraries
There are two subtracted cDNA libraries was established, and total of 192 white colonies (each of 96 colonies) were first screened by colony PCR (Figure 2). The results of colony PCR showed that the positive rate was 97.9% for fscl. and 95.8% for rscl. The size of inserted cDNA fragments varied from 180 bp to 1 kb ( Figure 2). The positive PCR products were then subjected to dot blotting for screening the differential ESTs. A total of 118 differentially expressed sequence tags (ESTs) were identified in the subtracted libraries and sequenced. There are 65 and 53 ESTs were identified from infected (forward) and non-infected (reverse) subtracted libraries, respectively (Table 1; Figure 2B-C and E-F). These clones revealed the genes with highly differentially expression levels and were further sequenced.

Analysis of EST sequences
Total of 118 ESTs were sequenced and further analyzed. The average sequence length of ESTs in each library was 426 bp for the infected (forward) library and 375 bp for the non-infected (reverse) library (Table 1). According to the NCBI BLASTn analysis,75.4% ESTs (49 ESTs) were identified as microsporidia genes and 20% ESTs (13 ESTs) were belonged to Apis spp.in the fscl (Table 2; Figure  2B-C). From the rscl, there were 83% ESTs (44 ESTs) showed significant homology to insect host, especially Heminoptera insects (Apis mellifera and A. florea) ( Table 3; Figure 2E-F). This result indicated that the microsporidia genes were highly up-regulated during the infection process. There were only few host genes showed the dramatically responses of up-regulated during N. ceranae infection, while most of the up-regulated host genes were identified from the rscl, suggested the suppression of host gene expression by N. ceranae infection.
The result of Gene Ontology (GO) analysis showed total of 30 DEGs (78.9%) from host were identified and were further categorized into molecular function, biological process and cellular component ( Figure 3; Supplementary Table 2). Moreover, there were 16 GO terms belonged to three major categories ( Figure 3; Supplementary Table 2).
It has been reported that N. ceranae infection could changes in metabolism of host, besides, the N. ceranae infection also showed the negative effects on the immune system to honey bees; the honey bees infected with N. ceranae showed a down-regulation of some immune-related genes, such as abaecin, apidecin, defensin, hymenoptaecin, glucose dehydrogenase (GLD) and vitellogenin (Vg), suggesting that N. ceranae infection suppresses immune defense mechanisms in honey bees [29][30][31].
Based on the GO analysis, the genes involved in catalytic activity (GO:0003824) were highly expressed and followed by cellular process (GO:0009987) / membrane (GO:0016020) and binding  Table 2). As the matter of fact that the up-regulation of the α-glucosidase gene and genes, which involved in trehalose transport, and the down-regulation of the trehalase and the glucose-methanol-choline oxidoreductase three encoding genes during N. ceranae infection. The unbalance of the gene expressions results in the modifications of carbohydrate metabolism in the infected honey bees; suggested this changes in the gene expressions and carbohydrate metabolism were manipulative activity of the N. ceranae to get nutrients from host [21,29]. Based on our data, the genes involved in the metabolic process (GO:0008152) from the rscl showed differential expressed, revealed the host gene expression change after N. ceranae infection and lead to the unbalance of physiological mechanism.

Identification and validation of highly expressed N. cerana specific genes
From the forward (infected) library, there are 75.4% ESTs (49 ESTs belonged to 24 genes) were identified as N. ceranae genes ( Table 2). Of 24 N. ceranae genes, nine DEGs, which have more than 2 identified ESTs and >3-fold change by dot bolt analysis were selected for further validation by qRT-PCR, including four hypothetical proteins (AAJ76_1600052943, AAJ76_500092411, AAJ76_2000141845 and AAJ76_1900028347), 14-3-3 protein 1-like protein (AAJ76_300017093), Heat shock protein 90 (AAJ76_1170002375), Ubiquitin (AAJ76_1300017036), Forkhead hnf3 transcription factor (AAJ76_500091146) and Histone H3 (AAJ76_800037613) (Figure 4). Similar to the expectations, the results indicated that these genes showed highly up-regulated during N. ceranae infection. Moreover, the results also showed that all microsporidia specific gene expressions had the highest gene expression level at 7 dpi, and decreased at the 14 and 21 dpi ( Figure 5A and B). It should be noted that among these 9 N. ceranae specific genes, a hypothetical protein (AAJ76_1600052943) (the previous library clone no.= sr22 and provisional named sr22 gene) has the consistently the highest expression, which up-regulated to ~1.1× 10 4 fold at 7 dpi. to the mock infection 2340 and 1148 fold at 14 and 21 dpi, respectively ( Figure 5B). For the other eight genes, there are three genes, including histone H3, ubiquitin, and hypothetical protein (AAJ76_2000141845) showed higher expression levels, which ranged from ~2100 to 4400 fold to the mock infection at 7 dpi. and the hypothetical protein (AAJ76_2000141845) also showed consistently highly expressed at 14 to 21 dpi ( Figure 5A).
Ubiquitin is a small protein, which involved in the ubiquitin-proteasome system (UPS) The UPS is the crucial intracellular protein degradation system in eukaryotes. It has been reported that the infection of N. bombycis resulted in up-regulation of protein degradation-related enzyme in the host midgut, including the S-phase kinase-associated protein 1 and the ubiquitin-conjugating enzyme E2G. Both of which are involved in the ubiquitin-proteasome pathway [32]. From our data, the highly expression of ubiquitin was observed in N. cerana, indicated that the host defense mechanism to the parasite and therefore trigger the protein degradation-related enzyme in N. cerana.
Besides, the highly expression of histone H3 was detected in the N. ceranae infection process; histone proteins are essential for DNA packing, chromosome stabilization and gene expression in the nucleus of a eukaryotes cell. Each the nucleosome consists of two copies of each core histone H2A, H2B, H3 and H4 to be a octameric protein complex. Histones H3 and H4 are the most highly conserved histones, which play a regulatory role in chromatin formation such as retain the ability to impede transcription [33]. Therefore, it was assumed that highly expression of N. cerana histone H3 might enhance the regulation of the chromatin formation and facilitate the microsporidia DNA replication.
For the other up-regulated N. ceranae genes, they showed up-regulated expression profile during N. ceranae infection ( Figure 5A). The functions of several genes revealed the response of N. ceranae to the honey bee host and also showed the replication activity of N. ceranae during infection. For the 14-3-3 protein 1-like protein (AAJ76_300017093), it is a serine/threonine binding protein. The 14-3-3 protein 1-like protein inhibits apoptosis through sequestration of pro-apoptotic client proteins [34]. Interestingly, the highly expression levels of ubiquitin of N. cerana was detected at 7 dpi. and then decreased at 14 and 21 dpi, while the expression of 14-3-3 protein 1-like protein was higher than that of ubiquitin at 21 dpi; besides, the heat shock protein 90 (Hsp90) was also identified as highly expressed N. ceranae gene that facilitates metastable protein maturation, stabilization of aggregation-prone proteins, quality control of misfolded proteins and assists in keeping proteins in activation-competent conformations [35]; it seems that during the infection process, N. ceranae not only needs to against the 7 of 23 stress from the host, but also attempted to stabilize the physical pathway itself for the selfpropagation purpose. This hypothesis is worthy to be addressed in the future.
For the gene forkhead hnf3 transcription factor (AAJ76_500091146), it presented throughout the animal kingdom as well as in fungi and yeast (Mazet et al. 2003). The function of forkhead hnf3 transcription factor is characterized by a type of DNA-binding domain known as the forkhead box (FOX) [36]. It has also described that forkhead hnf3 transcription factor involved in a wide range of biological functions, including development, growth, stress resistance, apoptosis, cell cycle, immunity, metabolism, reproduction and ageing [37]. As the matter of fact that, in terms of these genes, they might play important roles for the replications of microsporidia, but it still need to be further evaluated. The genome of honeybee (A. mellifera) had been sequenced and it consisted of a little more than 10,000 genes, lower than other insects (D. melanogaster, 13,600 genes; A. gambiae, 14,000 genes; B. mori, 18,500 genes) [38]. In additionally, the draft assembly (7.86 MB) and highly compact (~1 gene/kb) genome of the N. ceranae genome has been fully sequenced in 2009 [39]. N. ceranae has a strongly AT-biased genome (74% A+T) and a diversity of repetitive elements, complicating the assembly. Total 2,614 predicted protein-coding sequences were predicted and 1,366 of these genes have homologs in the microsporidian Encephalitozoon cuniculi, the most closely related published genome sequence. These databases could be a reference for studying and clarifying the virulent of N. ceranae and the interaction between their host, A. mellifera in the near future. From our data, several N. ceranae genes were identified as differentially expressed genes through the SSH, which could provide information about the pathological effects of N. ceranae. In terms the validation of nine upregulated N. ceranae genes, it revealed highly potential for the detection of early nosemosis in the field and provide insight for further applications.

Conclusions
In this study, 28 honey bee genes and 24 genes N. ceranae were identified as DEGs after honey bees infected with N. ceranae via SSH approach. For the host DEGs, a high percentage of DEGs involved in catalytic activity and metabolic processes revealed the host gene expression change after N. ceranae infection might lead to the unbalance of physiological mechanism. Of 24 N. ceranae genes, nine DEGs were subjected to real-time quantitative reverse transcription PCR (real-time qRT-PCR) for validation and all of them showed high expression levels during different time post infections.
In terms the results of validation, these genes revealed highly potential for the detection of different stages of nosemosis in the field. It should be also noted that one N. ceranae gene (AAJ76_1600052943, sr22) showed the highest expression level after infection. This gene might be further developed as a biomarker for early nosemosis detection. The data from this study could provide information on the pathological effects of N. ceranae and new insight for further applications on honey bee pathogen detections.