Transcriptomic Analysis of the Early Strobilar Development of Echinococcus granulosus

Echinococcus granulosus has a complex life cycle involving two mammalian hosts. The transition from one host to another is accompanied by changes in gene expression, and the transcriptional events that underlie this transition have not yet been fully characterized. In this study, RNA-seq was used to compare the transcription profiles of samples from E. granulosus protoscoleces induced in vitro to strobilar development at three time points. We identified 818 differentially expressed genes, which were divided into eight expression clusters formed over the entire 24 h period. An enrichment of gene transcripts with molecular functions of signal transduction, enzymes, and protein modifications was observed upon induction and developmental progression. This transcriptomic study provides insights for understanding the complex life cycle of E. granulosus and contributes for searching for the key genes correlating with the strobilar development, which can be used to identify potential candidates for the development of anthelmintic drugs.


Introduction
Echinococcosis is a zoonotic parasitic infection caused by tapeworms of the genus Echinococcus and is considered as one of the 17 neglected tropical diseases prioritized by the World Health Organization [1,2]. The two most important forms of the disease are cystic echinococcosis (hydatidosis) and alveolar echinococcosis, caused by infection with Echinococcus granulosus and Echinococcus multilocularis, respectively. Echinococcus spp. have two-host life cycles with the larval stage growing in the tissues of an intermediate host (among a large variety of noncarnivorous species) and the adult stage living in the intestine of a definitive host (among few carnivore species) [3][4][5].
The E. granulosus larva, also called metacestode, is a fluid-filled vesicular cyst (the hydatid cyst) containing many infectious protoscoleces, the preadult forms of the parasite. In vitro studies demonstrated that protoscoleces have the unusual ability to differentiate in two different directions depending on the environmental stimulus. In the intermediate host, upon hydatid cyst rupture, protoscoleces released in the body cavity redifferentiate in a cystic direction, forming secondary hydatid cysts. In contrast, protoscoleces ingested by a carnivore, such as a dog, and exposed to the gut environment, differentiate into fully strobilated and sexually differentiated adult tapeworms [4,[6][7][8].
Strobilar development is directly influenced by the host-parasite relationships and configure a key point for the parasite's life cycle. Within the hydatid cyst, a protoscolex remains quiescent and invaginated. It only undergoes strobilation upon ingestion by a definitive host and exposure to the correct stimuli [8,9]. The nature of these stimuli is not fully known, but it is believed that chewing and proteolytic enzymes such as pepsin play considerable roles in differentiation induction, as well as temperature and the presence of bile salts. In addition, histological studies have already demonstrated that parasite contact or attachment to a substrate similar to that found on the surface of the canine gut also constitutes an important stimulus for strobilation [10][11][12]. These observations resulted in the elaboration of strategies for in vitro culture of Echinococcus protoscoleces in order to provide the necessary physiological conditions for parasite strobilar development [13].
Genome comparative studies carried out for E. granulosus [14][15][16] and E. multilocularis [15,17] provided evidence of considerable losses and gains of genes that may be associated with adaptations to parasitism. The 1149 megabase genome of E. granulosus comprises nine chromosomes, of which 10,231 genes have been identified [15]. Among them, however, crucial genes or even entire pathways of de novo synthesis of fatty acids, cholesterol, pyrimidines, purines, and most amino acids are absent. Thus, E. granulosus relies on the host for obtaining these nutrients. Regarding strobilar development, transcriptomic and proteomic studies have identified differentially expressed genes or proteins between larval and strobilated (adult) parasite life stages, either in the model cestode species Mesocestodes corti or in E. granulosus [18][19][20][21]. Specifically for E. granulosus, it was shown that bile acids have a crucial role in the differentiation of protoscoleces into adult worms [13], involving the expression of parasite bile acid receptors and transporters to stimulate the corresponding developmental pathways [15]. However, given the complexity of the strobilation process, many more molecular events and pathways are likely to play roles in the gradual morphological changes observed during the strobilar development of E. granulosus, but they remain essentially unknown.
In an attempt to find genes and provide evidence of molecular pathways involved with the gradual phenotypic changes triggered by strobilation stimuli in E. granulosus, we report here the transcriptomic profiling of the first 24 h after protoscolex in vitro induction to adult development. We performed a comparative analysis of the protoscolex transcriptomes of three time points within this 24 h window after strobilar induction, and provided an overview of early molecular events in strobilation. Our data reinforce the foundation required to elucidate the strobilar development in the context of host-parasite relationships, as well to improve new control strategies for echinococcosis and other cestodiases.

Summary of the RNA Sequencing Data
The RNA extracted from E. granulosus protoscoleces collected at different time points after induction to strobilar development ( Figure S1) was subjected to paired-end RNA-seq using Illumina technology. The analyzed samples comprise protoscoleces washed with phosphate-buffered saline (PBS), used as noninduced controls (Sample 1-PBS), protoscoleces washed with PBS and treated with pepsin (Sample 2-PEP), protoscoleces washed with PBS, treated with pepsin, and cultured in biphasic medium for 12 h (Sample 3-12 h) or 24 h (Sample 4-24 h). The RNA-seq resulted in a total of 30,821,916 reads for the four samples studied. The overall raw read mean quality score was high, with 98.4% of bases above Q30. After quality filtering, 30.8 million of paired-end reads (99.5%) were obtained, and 71.7% of the reads were mapped to the E. granulosus genome with known gene annotations. Table 1 shows the summary of the sequencing results.

Number of Identified Genes
A total of 9376 different genes were represent in our dataset. A total of 9019, 9029, 9051, and 9077 genes were found in PBS, PEP, 12 h, and 24 h samples, respectively. Most of the genes (8742 genes) were represented in the four analyzed samples, but 61, 72, 56, and 73 genes were exclusively detected in PBS, PEP, 12 h, and 24 h samples, respectively ( Figure 1). A complete list of the identified genes is provided in Table S1.

Number of Identified Genes
A total of 9376 different genes were represent in our dataset. A total of 9019, 9029, 9051, and 9077 genes were found in PBS, PEP, 12 h, and 24 h samples, respectively. Most of the genes (8742 genes) were represented in the four analyzed samples, but 61, 72, 56, and 73 genes were exclusively detected in PBS, PEP, 12 h, and 24 h samples, respectively ( Figure 1). A complete list of the identified genes is provided in Table S1.

Differentially Expressed Genes
We performed a pairwise comparison across different samples using GFOLD. A total of 818 genes were differentially expressed between any two samples ( Table S2). The number of differentially expressed (DE) genes (up-and downregulated) in each 2 × 2 sample comparison is shown in Figure  3. A predominance of downregulated genes was detected in all pairwise comparisons. PEP vs. 12 h showed the highest amount of DE genes (552 genes; 180 upregulated and 372 downregulated). Some important genes for host-parasite relationship and development are shown in Table S3, including genes such as dynein light chain, annexins, ankyrin, and tetraspanin. To better understand the dynamics of E. granulosus protoscolex gene expression upon strobilation induction, DE genes were clustered to search for similar patterns of expression between samples ( Figure 4). Using the sum of the RPKM values in the four samples (cutoff > 10), 744 DE genes were categorized into eight different clusters on account of their relative expression pattern (Table  S2). We computed a relative expression for each gene by dividing its expression at each time point by the sum of gene expression for all time points. In this analysis, clusters 1 and 2, including 446 genes, showed decreasing expression profile patterns throughout the treatment for protoscolex strobilation, while 277 genes in clusters 6, 7, and 8 showed increasing expression profiling patterns.

Differentially Expressed Genes
We performed a pairwise comparison across different samples using GFOLD. A total of 818 genes were differentially expressed between any two samples ( Table S2). The number of differentially expressed (DE) genes (up-and downregulated) in each 2 × 2 sample comparison is shown in Figure 3. A predominance of downregulated genes was detected in all pairwise comparisons. PEP vs. 12 h showed the highest amount of DE genes (552 genes; 180 upregulated and 372 downregulated). Some important genes for host-parasite relationship and development are shown in Table S3, including genes such as dynein light chain, annexins, ankyrin, and tetraspanin.

Differentially Expressed Genes
We performed a pairwise comparison across different samples using GFOLD. A total of 818 genes were differentially expressed between any two samples ( Table S2). The number of differentially expressed (DE) genes (up-and downregulated) in each 2 × 2 sample comparison is shown in Figure  3. A predominance of downregulated genes was detected in all pairwise comparisons. PEP vs. 12 h showed the highest amount of DE genes (552 genes; 180 upregulated and 372 downregulated). Some important genes for host-parasite relationship and development are shown in Table S3, including genes such as dynein light chain, annexins, ankyrin, and tetraspanin. To better understand the dynamics of E. granulosus protoscolex gene expression upon strobilation induction, DE genes were clustered to search for similar patterns of expression between samples ( Figure 4). Using the sum of the RPKM values in the four samples (cutoff > 10), 744 DE genes were categorized into eight different clusters on account of their relative expression pattern (Table  S2). We computed a relative expression for each gene by dividing its expression at each time point by the sum of gene expression for all time points. In this analysis, clusters 1 and 2, including 446 genes, showed decreasing expression profile patterns throughout the treatment for protoscolex strobilation, while 277 genes in clusters 6, 7, and 8 showed increasing expression profiling patterns. To better understand the dynamics of E. granulosus protoscolex gene expression upon strobilation induction, DE genes were clustered to search for similar patterns of expression between samples ( Figure 4). Using the sum of the RPKM values in the four samples (cutoff > 10), 744 DE genes were categorized into eight different clusters on account of their relative expression pattern (Table S2). We computed a relative expression for each gene by dividing its expression at each time point by the sum of gene expression for all time points. In this analysis, clusters 1 and 2, including 446 genes, showed decreasing expression profile patterns throughout the treatment for protoscolex strobilation, while 277 genes in clusters 6, 7, and 8 showed increasing expression profiling patterns.  A structural and functional annotation of the DE genes is summarized in Figure 5 and Table S2. The most representative domains found among the genes downregulated during early worm development (cluster 1 and 2) were related to "cell motility", "cell adhesion", "DNA-binding", "translation", and "DNA replication/repair". Among the genes upregulated during early worm development (cluster 6, 7, and 8), the most representative domains were "signal transduction", "other enzymes", and "protein modification". By EggNOG analysis, the most significant functions found in downregulated genes were "translation, ribosomal structure, and biogenesis", "intracellular trafficking, secretion, and vesicular transport", and "cytoskeleton". Among the upregulated genes, the most relevant functions were "inorganic ion transport and metabolism", "nucleotide transport and metabolism", and "amino acid transport and metabolism". A structural and functional annotation of the DE genes is summarized in Figure 5 and Table S2. The most representative domains found among the genes downregulated during early worm development (cluster 1 and 2) were related to "cell motility", "cell adhesion", "DNA-binding", "translation", and "DNA replication/repair". Among the genes upregulated during early worm development (cluster 6, 7, and 8), the most representative domains were "signal transduction", "other enzymes", and "protein modification". By EggNOG analysis, the most significant functions found in downregulated genes were "translation, ribosomal structure, and biogenesis", "intracellular trafficking, secretion, and vesicular transport", and "cytoskeleton". Among the upregulated genes, the most relevant functions were "inorganic ion transport and metabolism", "nucleotide transport and metabolism", and "amino acid transport and metabolism".
Uncharacterized proteins represent 270 of total DE genes, with 68 annotated as expressed conserved proteins, 62 expressed proteins and 140 hypothetical proteins (Table S2) Uncharacterized proteins represent 270 of total DE genes, with 68 annotated as expressed conserved proteins, 62 expressed proteins and 140 hypothetical proteins (Table S2). Of these, only 10 have conserved domains predicted by SUPERFAMILY and 13 have EggNOG functional categories (three of them present in both classifications).

Discussion
Important morphological and biochemical changes occur throughout the life cycle of parasitic organisms and are probably the result of regulated changes in gene expression in response to environmental stimuli, such as host change, and temperature and pH shifts [22][23][24]. These regulated responses contribute to the mechanisms related to the development of the parasite, including strobilation and sexual maturation, as well as the evasion of the host's immune response.
Based on the Jacob-Monod model, a hypothetical but logical model was proposed to explain how the gene expression regulation can be involved in Echinococcus development [8,25]. Although both the morphological characteristics of the strobilar development and the genome of E. granulosus are known, the correlation between these two pieces of information and the model previously

Discussion
Important morphological and biochemical changes occur throughout the life cycle of parasitic organisms and are probably the result of regulated changes in gene expression in response to environmental stimuli, such as host change, and temperature and pH shifts [22][23][24]. These regulated responses contribute to the mechanisms related to the development of the parasite, including strobilation and sexual maturation, as well as the evasion of the host's immune response.
Based on the Jacob-Monod model, a hypothetical but logical model was proposed to explain how the gene expression regulation can be involved in Echinococcus development [8,25]. Although both the morphological characteristics of the strobilar development and the genome of E. granulosus are known, the correlation between these two pieces of information and the model previously proposed is still poorly understood. In this work, we provided a transcriptional analysis of E. granulosus protoscoleces in vitro induced to strobilar development in attempt to find genes involved in this process.
The strobilar development in E. granulosus, as well as other cestodes, allows adult parasite adult to have larger numbers of reproductive organs, allowing an increase in fertility [26] and makes this process an important target for study. Based on the classic works of Smyth and collaborators [13], we have previously reported an in vitro culture of E. granulosus protoscolex strobilar development based on a biphasic medium containing the bile salt taurocholate [21]. In this work, we cultivated protoscoleces in biphasic medium for 12 or 24 h. Here, we subjected E. granulosus protoscoleces to induction of strobilar development and analyzed gene expression of parasite samples collected at different time points of this process, including untreated protoscoleces washed with PBS, protoscoleces treated with pepsin, and protoscoleces cultivated in biphasic medium for 12 or 24 h. By sample-to-sample correlation analysis, it was possible to observe that the most of the identified transcripts were shared between the different samples analyzed. PBS and PEP samples have a relatively high correlation coefficient. However, with the subsequent activation of the protoscoleces and cultivation biphasic medium, mimicking the developmental transition in the definitive host, a change in the identity of genes was observed.
We identified transcripts corresponding to 9376 genes (91.6%). This number represents almost 1000 genes more than the 8393 genes identified in a previous RNA-seq of protoscoleces extracted from a single porcine liver cyst [15]. Another transcriptomic analysis found 7471, 6976, 3811, and 7724 genes in the protoscoleces, cyst germinal cells and membranes, adult worms, and oncospheres, respectively [16]. Some of these genes probably are stage-specific and transiently expressed during the parasite life cycle, indicating a possible role in adaptation and strobilation of the parasite.
Among the most expressed transcripts found in our data, it was verified a remarkable presence of those corresponding to fatty acid binding protein (FABP) and the antigen B genes. These genes have already been described among those most highly expressed Echinococcus genes [15,27]. The importance of these genes lies in the fact that cestodes are unable to synthesize fatty acids and cholesterol de novo [15]. They depend essentially on the sequestration, uptake, and utilization of host lipids by proteins such as FABP and antigen B. In this study, FABP genes are downregulated during E. granulosus adult worm development, although it has already been described that egfabp2 (EgrG_000549800) is mainly transcribed in adult and cyst [16,28]. Antigen B has both up-and downregulated genes coding its subunits, in agreement with previous works [29,30]. This variation in the composition of antigen B may be related to different properties of each subunit in binding to lipids and evasion of the immune response.
Among genes downregulated during protoscolex strobilar development, we found several genes coding for dynein light chain, oxalate:formate antiporter, and annexins. Dynein is a family of cytoskeletal motor proteins involved in intracellular motility of vesicles and organelles along microtubules and are associated with transforming growth factor (TGF)-β signaling [15,31]. Previous studies showed the expansion of this family in E. granulosus and schistosomes when compared to nematodes [14,15]. Oxalate:formate antiporter is a subfamily of the major facilitator transporter family, responsible for the transport of small solutes [32], but its function is not fully understood in parasites. Annexins, in contrast, are considered to play critical roles in parasite process related to the maintenance of cell integrity and modulation of the host immune responses [33]. A similar pattern of expression was observed in M. corti, where several genes encoding annexins were more expressed in the tetrathyridium (larval) than in the strobilated worm stage [19]. Therefore, the decrease in the expression of the annexins may be related to the absence of contact with the host in the in vitro cultures.
On the other hand, we found ankyrin, tetraspanin, heat shock protein 70 (Hsp70), and sodium bile acid cotransporter among the upregulated DE genes. Ankyrins are involved in functions such as cell cycle regulation, transcriptional regulation, cytoskeleton interactions, signal transduction, development, and intracellular trafficking [34,35]. In parasites, tetraspanins are involved in the coordination of signal transduction, cell proliferation, adhesion, migration, cell fusion, and host-parasite interactions [36]. In E. granulosus, tetraspanins were mostly present in the tegument and could contribute to the parasite nutrition and differentiation [37,38]. Therefore, the result of coordinated functions performed by ankyrins and tetraspanins (upregulated) in contrast to dynein light chain and annexins (downregulated) can be important to cytoskeletal remodeling associated with evagination and elongation of protoscoleces. The Hsp70s are part of the group of the expanded domain families in E. granulosus and may have important roles in protein folding and in protecting cells from stress [15]. The expression of several Hsp70 genes may be particularly associated with the stressful conditions of strobilation induction, which involves an increase in protein synthesis [21,39]. In turn, sodium bile acid cotransporter is an integral membrane glycoprotein that, in humans, participate in the enterohepatic circulation of bile acids. Bile acids seem to play a key role in the differentiation of Echinococcus protoscoleces into adult worms, and the expression of bile acid receptors and transporters may be stimulated during strobilar development [13,16].
When we analyzed the molecular function of DE genes, we also found differences between clusters. In cluster 1, which presents a downregulated expression pattern, we observed the presence of more basal functions, such as translation (e.g., Eukaryotic translation initiation factor 5a), DNA replication, and cell motility. In contrast, the upregulated DE genes of cluster 8 are related to specialized functions such as signal transduction (e.g., tyrosine protein kinase and G protein coupled receptor), enzymes (e.g., hexokinase and phospholipase), and protein modifications (e.g., Hsp70), which might correlate with the increased morphological complexity of the adult tapeworm compared to the metacestode.
In our previous work, we identified proteins newly synthesized by E. granulosus protoscoleces upon the induction of strobilar development [21]. Although the samples analyzed by the proteomic approach and in the present transcriptomic analysis do not correspond to the same strobilar development stages, some results are similar between these two studies. We observed a predominance of upregulated genes/proteins related to the cytoskeleton, energy metabolism, and cellular communication functions. Specifically, the 2-amino-3-ketobutyrate coenzyme A ligase, whose transcript was identified here as upregulated during strobilar development, was previously identified as a protein synthesized in response to strobilation stimuli.
Our group has recently compared the genome content of 10 parasitic platyhelminth species with the aim of identify genes and proteins related to the strobilation process [40]. Among the genes that were considered strobilation-related with unknown function, three are differentially expressed in the transcriptome here analyzed. We find that EgrG_000105500 is upregulated in E. granulosus, in the same way as observed in E. multilocularis [15] and M. corti [19], in the comparison between larval (pre-strobilated) and adult (strobilated) stages. EgrG_000518800 (upregulated) and EgrG_000701500 (downregulated) also show similarity to the pattern observed in E. multilocularis [15]. It is important to note that a large number of genes (270 DE genes-33.0%; 2976 in total-31.7%) have so far not been characterized, which makes more accurate analyses difficult, and highlights the extent to which strobilation and other Echinococcus developmental processes are still poorly known.

Parasite Material and In Vitro Cultivation
E. granulosus protoscoleces (G1 genotype) were aseptically collected from one hydatid cyst recovered from a naturally infected liver of cattle routinely slaughtered at a commercial abattoir (São Leopoldo, RS, Brazil). The viability of protoscoleces was determined by trypan blue exclusion test and confirmed based on their motility characteristics under light microscopy [41]. Protoscoleces were washed three times with PBS, pH 7.4 and genotyped by one-step PCR and RFLP [42]. PSCs were cultured as previously described [21]. Briefly, after washing in PBS, they were used as noninduced controls (Sample 1-PBS), or treated for strobilation induction as follows. Protoscoleces were incubated for 15 min with pepsin (2 mg/mL), pH 2.0 (Sample 2-PEP), washed with PBS and transferred to a biphasic medium contained taurocholate for 12 (Sample 3-12 h) or 24 h (Sample 4-24 h). Protoscoleces in the control and induced conditions were further maintained in culture to confirm the characteristic morphological changes of early strobilar development.

RNA Extraction
Total RNA from each parasite sample was extracted using TRIzol reagent, according to the manufacturer's instructions, followed by treatment with RNase-free DNase I (Sigma, St. Louis, MO, USA) to remove DNA contaminants. The integrity of the extracted RNA was monitored using gel electrophoresis on a 1% agarose gel. RNA concentration was determined using Qubit (Thermo Fisher Scientific, Waltham, MA, USA).

cDNA Library Construction and Sequencing
For cDNA libraries, 4 µg of total RNA were used as start material. PBS, PEP, 12 h, and 24 h sample libraries were constructed, without replication, using the TruSeq Stranded mRNA LT Sample Preparation Kit (Illumina, San Diego, CA, USA) according to manufacturer's instructions. Library quality control was performed using the 2100 Bioanalyzer System with the Agilent High Sensitivity DNA Kit (Agilent, Santa Clara, CA, USA). The libraries were individually quantified via qPCR using a KAPA Library Quantification Kits for Illumina platforms (KAPA Biosystems, Wilmington, NC, USA). They were pooled together in equal amounts and sequenced in a MiSeq Sequencing System (Illumina). Paired-end reads (2 × 75 bp) were obtained using a MiSeq Reagents Kit v3 (150 cycles.)
The reads mapped to each transcript were used to calculate normalized transcript abundance and to perform differential gene expression analysis in GFOLD v1.1.4 [47], a software package specifically designed for unreplicated RNA-seq data. Genes with |GFOLD value| > 1 or |log2 (fold change)| > 2 were considered to be differentially expressed.

Conclusions
In this study, we have conducted RNA-Seq analyses of E. granulosus protoscoleces in the first 24 h of strobilar development. More importantly, this work provides information about differentially expressed genes and key molecular events activated upon E. granulosus strobilar development. In summary, we provide significant data that can be used to explore basic questions on the biology and evolution of cestodes, including the study of development and the host-parasite relationship. In addition, these data can be used for new echinococcosis control strategies, as well as other helminthiases.
Supplementary Materials: All sequence data (raw Illumina reads) are available on the NCBI Sequence Read Archive (SRA) database under the accession ID SRP131874. The following are available online at http://www.mdpi.com/2076-0817/9/6/465/s1, Table S1: Complete list of the identified genes, Table S2: Structural and functional annotation of the differentially expressed genes, Table S3: Selected differentially expressed genes among the samples analyzed of E. granulosus. Figure S1: Induction of strobilar development in E. granulosus protoscoleces.