Protein Discovery: Combined Transcriptomic and Proteomic Analyses of Venom from the Endoparasitoid Cotesia chilonis (Hymenoptera: Braconidae)

Many species of endoparasitoid wasps provide biological control services in agroecosystems. Although there is a great deal of information on the ecology and physiology of host/parasitoid interactions, relatively little is known about the protein composition of venom and how specific venom proteins influence physiological systems within host insects. This is a crucial gap in our knowledge because venom proteins act in modulating host physiology in ways that favor parasitoid development. Here, we identified 37 possible venom proteins from the polydnavirus-carrying endoparasitoid Cotesia chilonis by combining transcriptomic and proteomic analyses. The most abundant proteins were hydrolases, such as proteases, peptidases, esterases, glycosyl hydrolase, and endonucleases. Some components are classical parasitoid venom proteins with known functions, including extracellular superoxide dismutase 3, serine protease inhibitor and calreticulin. The venom contains novel proteins, not recorded from any other parasitoid species, including tolloid-like proteins, chitooligosaccharidolytic β-N-acetylglucosaminidase, FK506-binding protein 14, corticotropin-releasing factor-binding protein and vascular endothelial growth factor receptor 2. These new data generate hypotheses and provide a platform for functional analysis of venom components.


Introduction
Most hymenopteran parasitoids are very small insects that produce minute quantities of venom, which helps explain why research in this area is limited to about 70 of the approximately 300,000 venom-producing species, mainly stinging wasps, bees, and ants [1,2]. Beyond understanding the mechanisms of host/parasitoid relationships, research into parasitoid venom has the potential to uncover a wealth of biomolecules of value in agriculture and pharmacology [1,3,4].

Identification of Secreted Proteins
The venom proteins were separated on 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gels, with bands ranging from 10 to more than 250 kDa (apparent molecular mass, Figure 1), which we resolved into 922 proteins by LC-MS/MS. The peptide sequences of these proteins matched the VG and FAC unigene databases (Supplementary Table S2) Table S3). Some of these unigenes correspond to cellular proteins such as ribosomal proteins without predicted signal peptides (e.g., comp44503_c0, comp44417_c0, comp11049_c0, comp11512_c0, comp44398_c0, and comp44381_c0), which were likely contaminants from cell leakage during venom collection. Limiting the venom components in silico to the putative proteins with predicted signal peptides, or with venom homologs in other parasitoid species, totally resulted in 37 candidate venom proteins ( Figure 2 and Table 3). Abundant venom proteins are usually more important [1]. The LC-MS/MS system (LTQ-VELOS; Thermo Finnigan, San Jose, CA, USA) and Sequest search algorithm we used could not provide us quantitative information exactly, so here we divided the venom components into three relative transcript levels according to RPKM values: (1) low abundance, RPKM < 100; (2) medium abundance, RPKM from 100 to 1000; and (3) high abundance, RPKM > 1000 (listed in Table 3). Nine components belong to the high-abundance group, including a carboxypeptidase and three esterases implicating the important roles of enzymes played in the venom cocktail. Thirteen venom components fall into low-abundance group, including a mesencephalic astrocyte-derived neurotrophic factor (MANF), two DnaJ homolog subfamily members and two endoplasmic reticulum proteins (ERp). These proteins may be mainly expressed in secretory cells and regulate venom protein folding and secretion. Whether these proteins are actual venom components or tissue contamination is still unclear. We validated the venom proteins by determining the tissue specificity of these venom gene transcription because these venom genes would be expressed exclusively, or nearly so, in the venom gland. Although some possible venom genes, such as FKBP14 ( Figure 3E), IEP-2 ( Figure 3H), CRT ( Figure 3J) and CRF-BP ( Figure 3K) were not specifically expressed in the venom gland, we did not exclude their possible roles as virulence factors when expressed in the venom gland, even at low concentration [36,44]. These data indicate specific venom tissue expression of most of the venom genes ( Figure 3).
In the following sections we discuss the major groups of venom proteins.

Identification of Secreted Proteins
The venom proteins were separated on 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gels, with bands ranging from 10 to more than 250 kDa (apparent molecular mass, Figure 1), which we resolved into 922 proteins by LC-MS/MS. The peptide sequences of these proteins matched the VG and FAC unigene databases (Supplementary Table 2), among which 55 unigenes ( Figure 2) were included in the UVG database (Supplementary Table 3). Some of these unigenes correspond to cellular proteins such as ribosomal proteins without predicted signal peptides (e.g., comp44503_c0, comp44417_c0, comp11049_c0, comp11512_c0, comp44398_c0, and comp44381_c0), which were likely contaminants from cell leakage during venom collection. Limiting the venom components in silico to the putative proteins with predicted signal peptides, or with venom homologs in other parasitoid species, totally resulted in 37 candidate venom proteins ( Figure  2 and Table 3). Abundant venom proteins are usually more important [1]. The LC-MS/MS system (LTQ-VELOS; Thermo Finnigan, San Jose, CA, USA) and Sequest search algorithm we used could not provide us quantitative information exactly, so here we divided the venom components into three relative transcript levels according to RPKM values: (1) low abundance, RPKM < 100; (2) medium abundance, RPKM from 100 to 1000; and (3) high abundance, RPKM > 1000 (listed in Table 3). Nine components belong to the high-abundance group, including a carboxypeptidase and three esterases implicating the important roles of enzymes played in the venom cocktail. Thirteen venom components fall into low-abundance group, including a mesencephalic astrocyte-derived neurotrophic factor (MANF), two DnaJ homolog subfamily members and two endoplasmic reticulum proteins (ERp). These proteins may be mainly expressed in secretory cells and regulate venom protein folding and secretion. Whether these proteins are actual venom components or tissue contamination is still unclear. We validated the venom proteins by determining the tissue specificity of these venom gene transcription because these venom genes would be expressed exclusively, or nearly so, in the venom gland. Although some possible venom genes, such as FKBP14 ( Figure 3E), IEP-2 ( Figure 3H), CRT ( Figure 3J) and CRF-BP ( Figure 3K) were not specifically expressed in the venom gland, we did not exclude their possible roles as virulence factors when expressed in the venom gland, even at low concentration [36,44]. These data indicate specific venom tissue expression of most of the venom genes ( Figure 3).
In the following sections we discuss the major groups of venom proteins.    NC means that prediction of secretion could not be performed due to the incompleteness of the sequence and Y represents the deduced amino acid sequences with predicted signal peptides. L, M, and H represent low, medium and high abundance of unigene expression levels, respectively, according to the RPKM values.

Hydrolases
Proteases Serine protease (SP) Cc-Ven1 (comp32062_c0) shares high sequence similarity (BlastP, 50% identity, E-value = 2e −130 ) to SP homolog 42 isoform 1 precursor (GenBank: NP_001155078.1) identified from the venom of an ectoparasitoid wasp, Nasonia vitripennis (Hymenoptera: Pteromalidae). There were 15 members of the SP family screened from Nasonia vitripennis venom using bioinformatic and/or proteomic approaches. The SP family is the best represented of all Nasonia vitripennis venom constituents [46]. The functions of all Nasonia vitripennis venom SPs have not been thoroughly investigated. Bee venom SP kills insects via melanization and exhibits fibrin(ogen)olytic activity [47]. Serine protease homologs (SPH), formed by an amino acid substitution in the three catalytic triads (His-Asp-Ser), share similar amino acid sequences with active SPs, and function in several physiological processes of insects [9]. For instance, a venom SPH (Vn50) from Cotesia rubecula without proteolytic activity (because the conserved serine of the catalytic triad is glycine) significantly inhibited melanization of host hemolymph via suppressing prophenoloxidase activation. The deduced amino acid sequence of Cc-Ven1 is composed by two domains, including a CUB domain (cd00041, interval: 31-112, E-value: 1.93e −04 ) at N-terminus and a catalytic domain (cd00190, interval: 148-379, E-value: 2.20e −53 ) at C-terminus. The CUB domains are usually present in the proteins related to development [48]. This domain is commonly characterized by four conserved cysteine residues, forming two pairs of disulfide bonds ( Figure 4A). Compared to the CUB-domain SPs (CUBs), the identifications and functions of Clip-domain SPs (CLIPs) are studied very well. The sequences of CUBs and CLIPs are similar with each other in their catalytic domains. According to this point, we did the multiple sequence alignment of Cc-Ven1 with other arthropod SPs, including CLIPs and CUBs, based on their catalytic domains [49]. In the catalytic domain, Cc-Ven1 also contains the three catalytic triads and three disulfide linkages, like the condition in the catalytic domains of other SPs. We found that Cc-Ven1 is more similar to Group I SPs. A difference between Group I and II SPs is that the cleavage sites in all Group II members are followed with Lys or Arg, while the sites in most enzymes of Group I are followed by Leu or Ile. Additionally, the catalytic domains of Group II members contain a short insertion, including two specific Cys residues, present between the catalytic triads His and Asp ( Figure 4B) [49]. Group I enzyme contains coagulation factor B and proclotting enzyme of Tachypleus tridentatus (Xiphosurida Limulidae), which indicates to us it may act in regulating host hemolymph clotting. However, Cc-Ven1 differs from non-venom SPs by the replacement of the determinant Gly by Ala in the specificity pocket, thus indicating that Cc-Ven1 has a distinct protease target specificity.

Hydrolases
Proteases Serine protease (SP) Cc-Ven1 (comp32062_c0) shares high sequence similarity (BlastP, 50% identity, E-value = 2e −130 ) to SP homolog 42 isoform 1 precursor (GenBank: NP_001155078.1) identified from the venom of an ectoparasitoid wasp, Nasonia vitripennis (Hymenoptera: Pteromalidae). There were 15 members of the SP family screened from Nasonia vitripennis venom using bioinformatic and/or proteomic approaches. The SP family is the best represented of all Nasonia vitripennis venom constituents [46]. The functions of all Nasonia vitripennis venom SPs have not been thoroughly investigated. Bee venom SP kills insects via melanization and exhibits fibrin(ogen)olytic activity [47]. Serine protease homologs (SPH), formed by an amino acid substitution in the three catalytic triads (His-Asp-Ser), share similar amino acid sequences with active SPs, and function in several physiological processes of insects [9]. For instance, a venom SPH (Vn50) from Cotesia rubecula without proteolytic activity (because the conserved serine of the catalytic triad is glycine) significantly inhibited melanization of host hemolymph via suppressing prophenoloxidase activation. The deduced amino acid sequence of Cc-Ven1 is composed by two domains, including a CUB domain (cd00041, interval: 31-112, E-value: 1.93e −04 ) at N-terminus and a catalytic domain (cd00190, interval: 148-379, E-value: 2.20e −53 ) at C-terminus. The CUB domains are usually present in the proteins related to development [48]. This domain is commonly characterized by four conserved cysteine residues, forming two pairs of disulfide bonds ( Figure 4A). Compared to the CUB-domain SPs (CUBs), the identifications and functions of Clip-domain SPs (CLIPs) are studied very well. The sequences of CUBs and CLIPs are similar with each other in their catalytic domains. According to this point, we did the multiple sequence alignment of Cc-Ven1 with other arthropod SPs, including CLIPs and CUBs, based on their catalytic domains [49]. In the catalytic domain, Cc-Ven1 also contains the three catalytic triads and three disulfide linkages, like the condition in the catalytic domains of other SPs. We found that Cc-Ven1 is more similar to Group I SPs. A difference between Group I and II SPs is that the cleavage sites in all Group ΙΙ members are followed with Lys or Arg, while the sites in most enzymes of Group Ι are followed by Leu or Ile. Additionally, the catalytic domains of Group II members contain a short insertion, including two specific Cys residues, present between the catalytic triads His and Asp ( Figure 4B) [49]. Group Ι enzyme contains coagulation factor B and proclotting enzyme of Tachypleus tridentatus (Xiphosurida Limulidae), which indicates to us it may act in regulating host hemolymph clotting. However, Cc-Ven1 differs from non-venom SPs by the replacement of the determinant Gly by Ala in the specificity pocket, thus indicating that Cc-Ven1 has a distinct protease target specificity.  The two unique Cys residues in most group II proteases are shown in the green boxes. The amino acid prior to the proteolytic activation site (arrow) is included at the beginning of each sequence. The sequence analyses of catalytic domains were mainly based on [50]. Multiple alignment was performed using ClustalX2. Protein full names and sequence accession numbers are provided in Supplementary Table 4. For the phylogenetic analysis, we chose a total of 57 of CLIPs and 13 of CUBs, respectively according to recently reported literature [51][52][53][54]. We also chose venom CLIPs and CUBs from two parasitoid wasps: Nasonia vitripennis and Pteromalus puparum [46,55]. According to [52,56], CLIPs are The two unique Cys residues in most group II proteases are shown in the green boxes. The amino acid prior to the proteolytic activation site (arrow) is included at the beginning of each sequence. The sequence analyses of catalytic domains were mainly based on [50]. Multiple alignment was performed using ClustalX2. Protein full names and sequence accession numbers are provided in Supplementary Table S4. For the phylogenetic analysis, we chose a total of 57 of CLIPs and 13 of CUBs, respectively according to recently reported literature [51][52][53][54]. We also chose venom CLIPs and CUBs from two parasitoid wasps: Nasonia vitripennis and Pteromalus puparum [46,55]. According to [52,56], CLIPs are divided into four subfamilies. Subfamily A (CLIPA) is composed by SPHs solely, whereas subfamilies B (CLIPB), C (CLIPC), and D (CLIPD) comprise SPs, mainly. All of CUBs including the venom CUB, Cc-Ven1 (CUB-V-CcSP) are clustered into the same clade. Two venom CUBs from Nasonia vitripennis are clustered into the CUB clade. Three venom CLIPs from Nasonia vitripennis are clustered into CLIPB, while two venom CLIPs from Pteromalus puparum are clustered into CLIPB and CLIPD, respectively ( Figure 5). Four CLIPs may represent lineages of SP derived from ancient evolutionary events since these subfamilies already exist in Anopheles gambiae (Diptera: Culicidae), Drosophila melanogaster, and Helicoverpa armigera (Lepidoptera: Noctuidae) [51,52,56]. Moreover, expansion of individual groups occurred several times to account for the gene clusters observed in Tribolium castaneum (Coleoptera: Tenebrionidae), Anopheles gambiae, and Drosophila melanogaster genomes [51,56,57]. The three venom CLIPs from Nasonia vitripennis might originate from SPs in CLIPB, whereas two venom CLIPs of Pteromalus puparum might originate from two different subfamilies (CLIPB and CLIPD) and were recruited to venom respectively. Cc-ven1 and two venom CUBs are clustered together in CUB clade, which indicates a similar function they probably achieve. Our phylogenetic analysis implies that the evolutionary processes of these venom SPs from parasitoid wasps may be strongly diverse. However, the essential functions of Cc-ven1 and other venom SPs are still unknown. We infer that venom SPs probably play different roles in the successful parasitization, including increasing the efficiency of venom, suppressing host immunity, and regulating the host development.
Toxins 2017, 9,135 10 of 24 divided into four subfamilies. Subfamily A (CLIPA) is composed by SPHs solely, whereas subfamilies B (CLIPB), C (CLIPC), and D (CLIPD) comprise SPs, mainly. All of CUBs including the venom CUB, Cc-Ven1 (CUB-V-CcSP) are clustered into the same clade. Two venom CUBs from Nasonia vitripennis are clustered into the CUB clade. Three venom CLIPs from Nasonia vitripennis are clustered into CLIPB, while two venom CLIPs from Pteromalus puparum are clustered into CLIPB and CLIPD, respectively ( Figure 5). Four CLIPs may represent lineages of SP derived from ancient evolutionary events since these subfamilies already exist in Anopheles gambiae (Diptera: Culicidae), Drosophila melanogaster, and Helicoverpa armigera (Lepidoptera: Noctuidae) [51,52,56]. Moreover, expansion of individual groups occurred several times to account for the gene clusters observed in Tribolium castaneum (Coleoptera: Tenebrionidae), Anopheles gambiae, and Drosophila melanogaster genomes [51,56,57]. The three venom CLIPs from Nasonia vitripennis might originate from SPs in CLIPB, whereas two venom CLIPs of Pteromalus puparum might originate from two different subfamilies (CLIPB and CLIPD) and were recruited to venom respectively. Cc-ven1 and two venom CUBs are clustered together in CUB clade, which indicates a similar function they probably achieve. Our phylogenetic analysis implies that the evolutionary processes of these venom SPs from parasitoid wasps may be strongly diverse. However, the essential functions of Cc-ven1 and other venom SPs are still unknown. We infer that venom SPs probably play different roles in the successful parasitization, including increasing the efficiency of venom, suppressing host immunity, and regulating the host development.   Supplementary Table S4. We identified three metalloproteinases belonging to the metzincin protease superfamily [59]. Cc-Ven2 (comp35977_c0) displays 29% identity (BlastP, E-value = 1e −58 ) to a disintegrin and metalloproteinase with thrombospondin motifs (ADAMTS) (GenBank: EGI57486.1) from the leaf-cutting ant, Acromyrmex echinatior (Hymenoptera: Formicidae). ADAMTS proteins are structurally organized into a proteinase and an ancillary domain and they are found in all metazoans, acting in the structure and function of the extracellular matrix [59]. ADAMTS occur in venoms from snakes, social stinging wasps and endoparasitoids, contributing to tissue damage [50,60]. Cc-Ven2 contains an ADAM cysteine-rich domain (ACR) (smart00608, E-value = 4.83e −03 ), which regulates metalloprotease activity [61], but lacks an ancillary domain responsible for substrate-binding preferences [59]. Truncated single-domain proteins acting as virulence factors are for other venoms. Lacking selectivity domains could make them broadly toxic [62,63].
Cc-Ven3 and Cc-Ven4 share 42% and 44% identity (BlastP, E-value = 1.00e −94 and 7.00e −124 ), respectively, with a Tolloid-like protein 1 (TLP 1) from Apis florea (GenBank: XP_003695260.2). They act in activation of growth factors, degradation of polypeptides, serve as venom toxins of brown spiders, digest molecules and spread other toxins through host bodies [64][65][66]. A typical Tolloid structure contains an N-terminal Astacin domain and C-terminal CUB domains [65], which determine specificity.  [62]. Consistent with the previous results [62], two tolloid-like venom proteins of Nematostella vectensis (Actiniaria: Edwardsiidae), NEP-6 and NEP-14 without C-terminal CUB domains, which are different from typical Tolloid structure are clustered into the subgroup of Tolloid proteins, indicating evolved from a tolloid-like ancestor by loss of the CUB domains. By contrast, spider Astacin-like toxins are clustered out of the BMP1/Tolloid subgroup of the Astacin family, which indicates they are probably recruited into venom from other subgroups of Astacin family by losing other C-terminal domains, although the recruiting mechanism is not clear and need to be further studied in future. Here, the phylogenetic results show that Cc-Ven3 and Cc-Ven4 share a much closer evolutionary relationship with Cnidaria venom proteins than with spider venom proteins ( Figure 6B). Although the reports of Astacin-like proteins as toxins are limited, the phylogenetic results imply that there may be a great diversity of evolutionary origins in arthropod venom proteins.
NAGs have not been found in parasitoid venoms, but a 52-kDa chitinase was identified in the venom of the egg parasitoid Chelonus near curvimaculatus (Hymenoptera: Braconidae) containing all the conserved residues for chitinolytic activity. A role in dissociation and degradation of host tissue cells for endoparasitoid larval feeding has been proposed [72,73]. comp36742_c0 shows sequence similarity (BlastP, 88% identity, E-value = 0) to a chitinase domain-containing protein 1 from Microplitis demolitor (GenBank: XP 008553138.1) containing a Glyco_hydro_18 domain (pfam00704), corresponding to the glycosyl hydrolases family 18 (E-value = 3.17e −21 ).

Other Venom Proteins
Alongside the enzymatic proteins mentioned above, we identified other components with high identities to the venom components previously characterized from other parasitoid wasps. Those are discussed in the following sections.

Proteins Involved in the Folding and Export of Secreted Venom Proteins
Additionally, of the PDIs mentioned above, we also identified other proteins probably involved in the folding and export of secreted venom proteins. Cc-Ven27 (comp10689_c0), MANF is able to protect cells against the endoplasmic reticulum stress [91]. Cc-Ven28 (comp32038_c0) and Cc-Ven29 (comp45528_c0), annotated as DnaJ proteins, probably function as cofactors of heat shock proteins 70 kDa (Hsp70) and may modulate protein assembly, disassembly and translocation [92]. Cc-Ven30 (comp32086_c0), annotated as ERp29, is structurally similar to PDI and its function may be for protein folding and secretion [93], Cc-Ven31 (comp11316_c0), annotated as ERp44, probably functions as a multifunctional chaperone of the PDI family [94]. Cc-Ven32 (comp35875_c0) is annotated as a member of HSP70 family, and is one of the common chaperone of ubiquitous class. It is able to control the aspects of cellular proteostasis [95]. Hsp70 proteins were also identified in the venom of the parasitoids Pteromalus puparum [87] and two Psyttalia species [84], however, their functions are unknown. Actually, molecular chaperones were identified as venom proteins in some other parasitoid wasps [37,84]. For example, HSP70, calreticulin and PDI, were found in two Psyttalia species as venom components. It is speculated that they may play roles in the stabilization of other venom proteins, and/or their transport and targeting to the host cells [84]. Endoplasmins, another kind of molecular chaperones, have been identified from both Aphidius ervi and Psyttalia species venoms. It is reported that this protein is associated with the secretion of pancreatic lipases and their further internalization by intestinal cells in the rat [96]. It is speculated that it may help the secretion, stabilization, transport and host targeting of other venom proteins.

Venom Gland Collection and Total RNA Isolation
Mated female wasps aged 1-3 days were anaesthetized at −70 • C for 5 min, swabbed with 75% ethanol (v/v), dried and then dissected in sterile Pringle's phosphate-buffered saline (PBS) with 3 µL RNase inhibitor at 1 unit/µL (TOYOBO, Osaka, Japan) on an ice plate under a Leica MZ 16A stereomicroscope (Leica, Wetzlar, Germany). Venom glands and carcasses without venom apparatus were collected into TRIzol reagent (Invitrogen, Carlsbad, CA, USA). Total RNA was extracted using TRIzol reagent according to the manufacture's protocol. We used agarose gel electrophoresis, Qubit Fluorometer (Thermo Scientific, Wilmington, DE, USA), Agilent 2100 (Agilent Technologies, Santa Clara, CA USA), and nanodrop 2000 (Thermo Scientific, Wilmington, DE, USA) to determine the quality and quantity of the total RNA samples, respectively.

Construction and Sequencing of cDNA Library
The construction and sequencing of cDNA library were done by Novogene Bioinformatics Institute (Beijing, China). Briefly, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads. Under elevated temperature (94 • C), fragmentation was carried out using divalent cations in an Illumina proprietary fragmentation buffer. After first and second strand cDNA synthesis, the remaining overhangs were converted into blunt ends via exonuclease/polymerase and enzymes were removed. Illumina PE adapter oligonucleotides were ligated to prepare for hybridization following adenylation of 3 ends of DNA fragments. The library fragments of preferentially 200 bp with ligated adaptors on both ends were selectively enriched using Illumina PCR Primer Cocktail in a 10 cycle PCR reaction. Products were purified (AMPure XP system) and quantified using the Agilent high sensitivity DNA assay on an Agilent Bioanalyzer 2100 system.
After cluster generation using TruSeq PE Cluster Kit v3-cBot-HS (Illumina, San Diego, CA, USA), the library preparations were sequenced on an Illumina Hiseq 2000 platform and 90 bp paired-end reads were generated.

Data Analysis
Raw reads in fastq format were processed through an in-house Perl scripts and then deposited into the NCBI Short Read Archive (SRA) database [98] by Novogene Bioinformatics (Beijing, China). We obtained the clean reads by removing the reads containing adapters, reads including ploy-N and low quality reads from the raw data. The remaining clean reads were assembled using Trinity v2012-10-05, as described for de novo transcriptome assembly without reference genome using the parameter min_kmer_cov set to two by default, and all other parameters set to default [43]. After assembling, the longest transcript from each transcript cluster was chosen as the unigene, according to the proposal recommended by Trinity [43]. All unigenes were annotated by blastx search with a cutoff of 1e −5 against the following databases: NCBI non-redundant protein sequences (Nr), NCBI non-redundant nucleotide sequences (Nt), Protein family (Pfam), Clusters of Orthologous Groups of proteins (KOG/COG), Swiss-Prot (a manually annotated and reviewed protein sequence database), KEGG Ortholog database (KO), and Gene Ontology (GO).
Gene expression levels were estimated with the software RSEM (rsem-1.2.0) for each sample. Clean data were mapped back onto the assembled transcriptome, and the readcount for each gene was calculated from the mapping results. Prior to differential gene expression analysis, for each sequenced library, the read counts were adjusted by edgeR (3.0.8) through one scaling normalized factor [99]. Differential expression analysis of two samples, including venom glands and carcasses, was performed using the R package DEGSeq v1.2.2 [100]. The p-values were adjusted to q-values to account for multiple testing [101]. The q-value < 0.005 and |log2(foldchange)| > 1 was set as the threshold for significantly-differential expression. GO enrichment analysis of the differentially-expressed genes was implemented by GOseq R based Wallenius non-central hyper-geometric distribution [102], which adjusts for gene length bias in DEGs. GO terms at the second level was used to perform GO annotation.

SDS-PAGE of Venom and Protein Identification
Mated female wasps, aged 1-3 days, were dissected in sterile Pringle's PBS with 1 mM phenylmethanesulfonyl fluoride (PMSF) (Sigma, St. Louis, MO, USA) as described above. The venom reservoirs were isolated, washed three times, and transferred into 1.5 mL Eppendorf tubes. After centrifugation at 8000× g for 10 min at 4 • C, the supernatant was filtered with a 0.22 µm Millipore filter and stored at −70 • C until use. The concentration of venom protein was determined using the Modified Bradford Protein Assay Kit (Sangon Biotech, Shanghai, China).
The relative accumulations of mRNAs in each tissue was calculated using the comparative 2 −∆∆CT method [104], following the guidelines of Bustin et al. [105]. We took the lowest mRNA abundance level as the calibrator and assessed the relative mRNA abundances by comparing the abundances of each target gene in other tissues to the lowest one. The results are presented as mean mRNA abundances of three independent biological replicates. mRNA abundances among tissues were compared using one-way analysis of variance (ANOVA) and Tukey's test with statistical significance set at p < 0.05. All statistical analyses were performed using the Data Processing System (DPS) package (Version 9.5) [106].

Sequence Alignment and Phylogenetic Analysis
Prediction of signal peptide was performed by online software SignalP 4.1 [107]. Multiple sequence alignments of the amino acid sequences were performed with ClustalX2 [108] and edited with GeneDoc. For the phylogenetic analysis of CLIPs, we selected the sequences according to recent references [51][52][53], from different insect species including Tribolium castaneum, Drosophila melanogaster, Anopheles gambiae, Bombyx mori (Lepidoptera: Bombycidae), Manduca sexta, Aedes aegypti (Diptera: Culicidae), and Helicoverpa armigera. We selected CUBs from the same species described above. SPs of Drosophila melanogaster, Anopheles gambiae, Bombyx mori were obtained from NCBI RefSeq protein database, using Blastp with Cc-Ven1 as the query sequence. The sequences from database with the identities of E-value < e −5 were chosen. Additionally, we searched the Cc-Ven1 against the BeetleBase [109] to identify the SP of Tribolium castaneum (E-value < e −5 ). We used the library of Pfam HMM [110] to confirm the CUBs with a cutoff of 0.001. CUB of Manduca sexta we used was according to the reported literature [111]. Three venom CLIPs and one CUB (containing two isoforms) identified from Nasonia vitripennis [46], and two venom CLIPs from Pteromalus puparum [55] were also selected to construct the phylogenetic tree. For the phylogenetic analysis of proteins containing Astacin domains, the sequences we used were selected accordingly [62]. For the selected proteins used in analyses, not all proteins contained several domains (CUB, EGF, MAM, and Shk), but they all contained the common Astacin domain. The boundaries of catalytic domain (for SPs), proteinase-like domain (for SPHs), and Astacin domain were determined using Pfam. The multiple sequences containing the same domain were aligned using ClustalX2. For the catalytic and proteinase-like domains of SP and SPH, respectively, low quality alignment regions were removed by Gblocks Server [112]. PhyML with Smart Model Selection [58] was used to perform the phylogenetic analyses with 1000-fold bootstrap re-sampling. The full names of the proteins and accession numbers of the sequences used in this study are shown in Supplementary Table S4.

Conclusions
We offer several conclusions meant to provide a context for research into parasitoid venoms and their components. First, in their natural and in human-managed population dynamics, parasitoids provide a wide range of biological control services in agroecosystems. These services lead to reduced pest insect damage in cropping systems, reduced use of chemical insecticides, and they help slow the evolution of resistance in pest populations. It follows that research into all aspects of parasitoid biology, including biosystematics, ecology, physiology, biochemistry, and molecular biology contributes critically valuable new knowledge that can be applied to improve parasitoid efficacy. Recent research has led to new understandings and concepts in host-parasitoid relationships. Discovery of the immunosuppressive actions of parasitoid venoms and their active components guided the understanding that parasitoids have evolved specific mechanisms that operate to protect juvenile wasps from the dangerous host immunological reactions to overcome invaders. Looking forward, a large number of findings will drive research into the identities of specific venom components responsible for influencing various aspects of host biology.
Many parasitoid venom components, including proteins, have been reported [1]. Here, we report 37 putative venom proteins of Cotesia chilonis using the combination of transcriptomic and proteomic analyses. Our previous physiological research indicates that this parasitoid venom is able to partially inhibit host hemocyte-spreading behavior and encapsulation response to beads [42]. The results also imply that Cotesia chilonis uses a combination of passive and positive strategies to keep its offspring alive in its host larvae [42]. Consistent with the physiological results, we identified not only the proteins involved in active immune suppression, but also the proteins including IEP-2 and Crp32 in contribution to "passive" strategy as the previous descriptions. A passive strategy may include deposition of proteins to protect parasitoid eggs and young larvae. Three egg surface proteins have been identified, including IEP from Cotesia kariyai, Crp32 from Cotesia rubecula and an O-glycosylated protein called hemomucin from Macrocentrus cingulum (Hymenoptera: Braconidae), that protect the offspring from encapsulation during the early stages of growth [45,113,114]. Here, we report more evidence on the active and passive mechanisms used by Cotesia chilonis to avoid host immune responses.
Changing of selective binding sites of proteases probably alter the enzymes to be the toxins. The replacement of the determinant Gly by Ala in the specific pocket structure of SP (Cc-Ven1) and the loss of an ancillary domain responsible for substrate-binding specification of disintegrin and metalloproteinase (Cc-Ven2) could make them better candidates as virulence factors in the venom of Cotesia chilonis, as the condition in other noxious animal species [62]. However, for two Astacin-domain containing metalloproteinases (Cc-Ven3 and Cc-Ven4), the present CUB domains may improve their target specificity. This is probably why two TLPs from Cotesia chilonis venom possess relatively low similarity (only 49% of identities).
Without belaboring the point, the idea here is identifying potential biological functions of venom proteins almost immediately leads to testable hypotheses about specific mechanisms of host-parasitoid interactions. Various organisms, including insects, are sources of commercially-and/or biomedically useful proteins and chemistries. Workers in insect science grasp the economic significance of the soil bacterium, Bacillus thuringiensis, whose toxins and cognate genes have been highly developed into potent insect management technologies [115]. Commercially valued insect products include honey and pollination services from honey bees, Apis mellifera (Hymenoptera: Apidae), silk from the silkworm, Bombyx mori, shellac from the lac scale insect, Laccifer lacca (Homoptera: Coccidae), indelible iron gall ink from Oak-produced Aleppo galls triggered by the cynipid wasp, Cynips gallae-tinctoriae (Hymenoptera: Cynipidae), and the red cochineal dye from the scale Dactylopius coccus (Hemiptera: Dactylopiidae), formerly used as a fabric dye and now used in cosmetics [116]. The discovery of insect antimicrobial peptides in the early 1980s (>500 now known) has been developed into a research platform for development of medical antibiotics [117], a still on-going enterprise. Here, the point is that current and future research into parasitoid venoms will lead to discoveries of new chemistries, new proteins and new genes, some of which may become products useful in agriculture and many other areas.
Overall, we foresee research into parasitoid venom components will yield new understanding of parasitology, help generate new concepts and hypotheses and possibly lead to new valuable products.

Conflicts of Interest:
The authors declare no conflict of interest.