2.2. Most Abundant Proteins in the Venom Gland Transcriptome and in the Venom of C. salei
Venom gland proteins were grouped into three functional categories: (1) proteins involved in the protein- and peptide-processing machinery, (2) proteins possibly recruited and neofunctionalized in the venom gland, and (3) proteins with putative functions in the innate immune system of the spider. Eighteen of the nineteen identified protein groups exhibit a signal peptide, showing that these proteins are synthesized in the endoplasmic reticulum (ER) and either act as enzymes within the ER or are synthesized for the excretion process in the venom gland [
17]. Five of these proteins are enzymes, belonging to the protein-/peptide-processing machinery. These are a signal peptidase (SP), specific serine proteases (VSP), protein disulfide isomerases (PDI), carboxypeptidase (CPA), and peptidylglycine α-amidating monooxygenase (PAM), together representing 30.0% of all expressed venom gland-specific protein transcripts (
Figure 1). Several protein groups are thought to be recruited and neofunctionalized: amylases (α-AMY, 40.9%), cysteine-rich secretory proteins (CRISPs, 15.7%), angiotensin-converting enzyme (ACE, 3.4%), hyaluronidase (HYAL, 2.2%), cystatin (CST, 0.4%), thyroglobulin type-1-like protein (TT1LP, 0.3%), insulin-like growth factor-binding protein-related protein 1 (IGFBP-rP1, 0.1%) [
18], Kunitz domain-containing protein (KCP, 0.06%), and phospholipase (PLA2, 0.04%). Putative immune-relevant proteins are tachylectin 5A (TL5A, 1.2%) and leucine-rich repeat protein (5.7%) (
Table 1).
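The transcript shares listed above can be tallied per functional category as a quick consistency check. The following is only a minimal sketch: the percentages are copied from the text, whereas the dictionary layout and the function name are illustrative assumptions.

```python
# Percent shares quoted in Section 2.2 (values copied from the text).
# The five processing enzymes (SP, VSP, PDI, CPA, PAM) are only given as a sum.
PROCESSING_TOTAL = 30.0

RECRUITED = {
    "alpha-AMY": 40.9, "CRISPs": 15.7, "ACE": 3.4, "HYAL": 2.2, "CST": 0.4,
    "TT1LP": 0.3, "IGFBP-rP1": 0.1, "KCP": 0.06, "PLA2": 0.04,
}
IMMUNE = {"TL5A": 1.2, "LRR protein": 5.7}


def category_totals():
    """Return the summed transcript share (%) of each functional category."""
    return {
        "processing": PROCESSING_TOTAL,
        "recruited": round(sum(RECRUITED.values()), 2),
        "immune": round(sum(IMMUNE.values()), 2),
    }


totals = category_totals()
grand_total = round(sum(totals.values()), 2)
print(totals, grand_total)
```

Under these values, the recruited proteins account for roughly two thirds of the venom gland-specific protein transcripts, and the three categories together add up to the full transcript pool.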
2.6. Cysteine-Containing (Putative) Neurotoxins
To obtain comparable expression data for different neurotoxins, two approaches exist besides real-time PCR: (A) normalizing and counting sequencing reads that map to the contigs of a given neurotoxin, or (B) counting only reads that map to the mature peptide sequences within the contigs. For a given contig, we often observed a great imbalance in the number of normalized reads mapping to the signal peptide, pro-peptide, or mature peptide. For quantification, we therefore considered only reads mapping to the full mature peptide sequences of contigs (n(full reads) = 2420) related to venom neurotoxins. For two peptides (CsTx-39 and CsTx-20a, b) and 13 further isoforms of different peptides, no full reads were available; overlapping reads (n(no full reads) = 15) were therefore used, each counted as one read.
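Approach (B) can be sketched in a few lines. This is a hedged illustration, not the pipeline used here: we assume that a "full read" is one spanning the entire mature-peptide region of a contig, that each overlapping read counts as one read when no full read exists, and that reads are given as half-open coordinate intervals. All names and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) nucleotide coordinates on a contig


def count_mature_reads(reads: List[Interval], mature: Interval) -> int:
    """Count reads spanning the whole mature region; otherwise count overlapping reads."""
    m_start, m_end = mature
    full = [r for r in reads if r[0] <= m_start and r[1] >= m_end]
    if full:
        return len(full)
    # Fallback (assumption): each read overlapping the mature region counts as one read.
    overlapping = [r for r in reads if r[0] < m_end and r[1] > m_start]
    return len(overlapping)


def relative_expression(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert read counts into percent of all counted neurotoxin reads."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * c / total for name, c in counts.items()}


# Hypothetical toy data: (reads, mature-peptide region) per toxin.
toy = {
    "toxin_A": ([(0, 300), (50, 320), (0, 120)], (100, 300)),
    "toxin_B": ([(0, 150)], (100, 300)),
}
counts = {name: count_mature_reads(reads, region) for name, (reads, region) in toy.items()}
shares = relative_expression(counts)
print(counts, shares)
```

Restricting the count to the mature region avoids the read imbalance between signal peptide, pro-peptide, and mature peptide noted above, at the cost of discarding reads that map only to the precursor parts.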
Identified (putative) neurotoxins were classified into peptide families based on an updated version of HMMs [
79]. Peptides were named according to the present valid nomenclature for spider peptide toxins [
80]. Most of the toxins exhibit the inhibitor cystine knot (ICK) fold [
81]; the second most abundant fold is the colipase MIT1-like fold [
82]. Two sequences show unknown cystine-folding patterns (
Table 2). For a venom gland-specific defensin, the conserved cystine-stabilized α/β structural fold is assumed [
83].
To our astonishment, 93.7% of all expressed neurotoxin-like transcripts are classified in only two different peptide families. These are the SN_19 family (83.1%), with subgroups SN_19_06, 12, 13 and 14, and the SN_02 family (10.6%) with subgroups SN_02_03, 04, 07 and 16. In contrast, the lower-expressed (putative) neurotoxins (6.3%) belong to 13 different peptide families and several subgroups (
Figure 2). The transcripts of these peptides are all composed of a signal peptide, a pro-peptide with a C-terminal processing motif (PQM, in rare cases a dibasic “KR” motif), and the mature peptide. Some mature peptides feature a C-terminal glycine residue for PAM-mediated amidation [
24] (
Supplementary Table S1). Astonishingly, transcripts that belong to the SN_02 family were likewise identified in the transcriptome of the pseudoscorpion
Synsphyronus apimelus [
65]. The majority of
C. salei’s (putative) neurotoxins exhibit similarities to neurotoxins of other araneomorph spiders, but not mygalomorph spiders. The only toxin family found in
C. salei and in mygalomorph spiders, scorpions, and pseudoscorpions is the SN_32 family (MIT1-like AcTx family) [
86]. Besides the SN_32 family, transcripts of the related SN_20 family were identified in
C. salei and other araneomorph spiders. Peptides of these families are composed of a signal peptide, directly followed by the mature peptide and a stop signal. Although nothing is known about the target of these peptides, they might be, besides enzymes and protease inhibitors, among the phylogenetically earliest peptides recruited into spider venom glands.
In total, we identified 81 transcripts of (putative) neurotoxins, resulting in 66 different precursors and 54 mature variants. The majority of the mature peptide variants were confirmed by Edman sequencing and/or top-down/bottom-up proteomics. Two further peptides that we did not identify in the transcriptome (CsTx-11b and CsTx-18b) were previously determined by Edman degradation; thus, the total number of neurotoxins is 56.
2.9. SN_19 Family
The SN_19 family is the most abundant peptide family (83.6%) identified in the venom and venom gland transcriptome of
C. salei [
12]. This family combines neurotoxins of different structural motifs (
Table 2,
Figure 3). The main neurotoxin, CsTx-1 (26.4%), is characterized by an N-terminal ICK motif and a C-terminal α-helix. This α-helical part acts in a cytolytic manner [15], whereas the N-terminally located ICK fold seems to be responsible for the inhibition of L-type Ca²⁺ channels [
88]. The presence of two different motifs within CsTx-1 enhances insecticidal activity when compared with the C-terminally truncated form [
27]. Such a motif combination is also present in CsTx-10a, b, and 11a, b (1.7%), with a shorter C-terminal α-helical part. An ICK motif, but no C-terminal α-helix, is predicted for CsTx-9a, b, c (6.7%) as well as for CsTx-33a, b (0.4%).
CsTx-1 exhibits identities between 39.4% and 52.3% with peptides from two distant spider species within the RTA clade:
V. fasciatus (VIRFA_DN65866_c0_g1_i1_4) and
L. singoriensis (B6DCP0). A comparable peptide is identified in
Nephila pilipes (NEPPI_DN30656_c0_g6_i1_5, 42.9%), belonging to the Araneoidea. All these related peptides are composed of two structural motifs, an N-terminal ICK motif and a C-terminal α-helix. Of interest is the high identity of CsTx-1 with the C-terminal domain of δ-miturgitoxin-Cp2a (A0A059T2H4, 44.6%) from
Cheiracanthium punctorium, which exhibits N-terminally the ICK motif and C-terminally a short α-helical structure. So far, three different peptide groups have been mainly described from the venom of
C. punctorium and all belong to the SN_19 family [
89]. These peptides, in which two neurotoxins succeed each other, might be interpreted as a specific further development of the SN_19 toxin family. The N-terminal domain is characterized by an ICK motif followed by a short-extended strand, randomly coiled, or an α-helical region that connects to the C-terminal domain. This domain is again characterized first by an ICK motif, followed by differently pronounced α-helical tails comparable to CsTx-1, 10, and 11 (
Supplementary Figure S2.1/S2.2).
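Identity values such as those given above are derived from pairwise alignments. A minimal sketch of the calculation follows, assuming that identity is the number of identical residue pairs divided by the number of alignment columns (columns where both sequences carry a gap are ignored). Other denominators are in use, so the values reported here need not follow this exact convention. The aligned toy sequences are hypothetical and are not C. salei peptides.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 4 identical positions in 6 columns.
identity = percent_identity("ACDE-K", "ACDQWK")
print(f"{identity:.1f}%")
```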
Up to now, CsTx-1 has been the only known neurotoxin that exhibits such a long cationic C-terminal tail of 30 amino acid residues, and therein an α-helical region composed of 14 amino acid residues. However,
L. singoriensis expresses several two-motif neurotoxins in which the C-terminal tail is composed of 21 and 28 amino acid residues with an α-helix buildup of six to 12 amino acid residues [
90]. It is supposed that such C-terminal structures are acting as possible anchors, attracting negatively charged lipid rafts or glycoproteins on different membrane types [
15,
91]. Dependent on the length and charge of their C-terminal α-helices, the neurotoxic activity of such peptides is enhanced by the cytolytic activity towards different cell types. Missing a specific target (e.g., specific ion channels), the cytolytic-acting C-terminus can still harm a prey [
92].
The highest-expressed mature peptides belong to the two-chain peptides CsTx-8, 12, and 13 (48.3%) that exhibit, besides the N-terminal ICK motif, a C-terminal α-helical motif of 11 amino acid residues [
26]. The main difference from other peptides of the SN_19 family is a specific post-translational modification. Here, the PQM-protease, which is typically responsible for cutting pro-peptides from mature peptides, recognizes a PQM as well as an inverted PQM within the mature peptide chain. As a result, the loop defined by the disulfide bridge between C6 and C7 is opened by excision of a six-residue peptide, and two short peptide chains remain [25]. On their own, these heterodimeric peptides are less insecticidal than the main toxin CsTx-1, but in combination with monomeric peptides from the SN_19 family (e.g., CsTx-9 or CsTx-1), a synergistic increase in toxicity is observed [
26]. With this,
C. salei exhibits a strategy to enhance the insecticidal activity of the SN_19 peptide family by a so far unknown peptide-peptide interaction between two-chain peptides and single-chain peptides [
26]. From an evolutionary point of view, it can be assumed that the synergistic action, and in this respect the production of low-toxicity heterodimers (e.g., CsTx-13), provides greater benefit than higher production rates of a more toxic monomer (e.g., CsTx-1).
In-depth transcript analysis of the SN_19_00 family discloses a recombination and/or splicing process involving signal peptides, pro-peptides, and mature peptides between the different genes of the two-chain peptides CsTx-8, 12, and 13 and the single-chain peptides CsTx-10, 11a, and 9. All these neurotoxins exhibit the unique sequence “EVQR” as PQM. In multiple cases, heterodimeric toxins share identical coding sequences for signal and pro-peptide with monomeric toxins. This holds true for CsTx-13a, b, CsTx-10a, b, and CsTx-11a; for CsTx-8a and CsTx-9c; for CsTx-12b and CsTx-9a; and for CsTx-12a and CsTx-9b. It is thereby astonishing that some peptides with identical signal and pro-peptides show higher sequence differences in their mature chains (e.g., CsTx-8 and CsTx-9c) than other peptides with different signal and pro-peptides (e.g., heterodimers CsTx-8, 12, and 13) (
Figure 3,
Supplementary Figures S3.1–S3.3).
CsTx-23a,b precursors (0.08%) are characterized by an unusually short pro-peptide part composed of six amino acid residues including the PQM, followed by mature peptides. Compared to other low abundant precursors identified in
C. salei, these peptides show higher identities with peptides so far described in sparassids (36.0%–52.0%), lycosids (60.0%–62.0%), and viridasiids (46.0%–48.0%) (
Supplementary Figure S2.3). Interestingly, the length of the pro-peptide for the related precursors (
C. salei and
L. singoriensis) is conserved with four or six amino acid residues.
2.10. SN_02 Family
The SN_02 family includes the shortest neurotoxin-like peptides identified in the venom of
C. salei. The (putative) neurotoxins are characterized N-terminally by an ICK motif, a fourth disulfide bridge (C6–C7), and a C-terminal tail composed of five to 15 amino acid residues. CsTx-19 (8.0%) is the most abundant peptide of this family. Its signal and pro-peptide sequence exhibits a high identity (94.1%) with the corresponding sequence of CsTx-18a (1.4%). The mature peptides show a lower identity of 57.1% due to N- and C-terminal elongations of CsTx-18a. So far, similar peptides have not been reported in UniProtKB (BLAST search with an e-value cutoff of 10⁻⁵), suggesting that CsTx-18 and CsTx-19 might represent a specific development within the genus
Cupiennius (
Figure 4).
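An e-value cutoff of this kind is typically applied to tabular BLAST output. The sketch below assumes the standard 12-column tabular format of BLAST+ (`-outfmt 6`) and an inclusive threshold of 10⁻⁵. The two hit lines are invented for illustration and do not correspond to any real search result.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text: str, max_evalue: float = 1e-5):
    """Keep only hits whose e-value does not exceed the cutoff."""
    reader = csv.DictReader(io.StringIO(tabular_text.strip()), fieldnames=FIELDS, delimiter="\t")
    return [row for row in reader if float(row["evalue"]) <= max_evalue]


# Hypothetical example lines (not real hits).
example = (
    "query1\thitA\t57.1\t35\t15\t0\t1\t35\t1\t35\t2e-12\t60.1\n"
    "query1\thitB\t40.0\t30\t18\t0\t1\t30\t5\t34\t0.003\t30.2\n"
)
kept = filter_hits(example)
print([hit["sseqid"] for hit in kept])
```

A query for which the filtered list is empty has no reported similar sequence at the chosen cutoff, which is the situation described above for CsTx-18 and CsTx-19.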
Further peptide groups of the SN_02 family, which were identified in low abundance in the venom of
C. salei, are widespread among araneomorph spiders, especially within the RTA clade. CsTx-36 (0.37%) and CsTx-28 (0.08%) show identities with peptides from the ctenid
P. nigriventer (52.9%) and the lycosid
L. singoriensis (42.6%). For CsTx-17 (1.1%) and CsTx-31 (0.57%), related peptides were identified in
Agelenopsis aperta (identities of 39.5–53.7%) and
V. fasciatus (identities of 32.5–42.1%) (
Supplementary Figures S2.4 and S2.5).
A peptide similar to CsTx-34 (0.25%) was identified in
P. nigriventer (TX3-5_PHONI) showing 40.5% identity. This peptide inhibits L-type calcium channels [
93], produces paralysis in the posterior limbs and decreases movements after intracerebro-ventricular injection in mice [
94]. Moreover, an identity of 48.8% has been found between CsTx-34 and ω-lycotoxin-Gsp2671g from
A. marikovskyi [
95], which modulates P-type voltage-gated calcium channels in vertebrate cerebellar Purkinje cells [
96] (
Supplementary Figure S2.6). CsTx-25 (0.08%) shows identities between 47.4% and 53.7%, with a fragment identified in the transcriptome of
V. fasciatus and ω-agatoxin IVa, b, from the agelenid
A. aperta, a P-type calcium channel antagonist of insects and vertebrates [
96,
97], pointing to a broad appearance of this peptide type within the RTA clade. The identity towards CsTx-17 is very high (60.1%) (
Supplementary Figure S2.7).
In our in-house hemocyte transcriptome of
C. salei, one peptide fragment was identified that we classified into the SN_02 family. Could this peptide be an evolutionary precursor of the SN_02 family neurotoxins that was recruited into spider venom glands? The agatoxin-like peptide from hemocytes exhibits 35.9% identity to CsTx-25, but possesses a dibasic KR motif instead of the PQM as cleavage motif. Surprisingly, the agatoxin-like peptide shows identities between 44.2% and 90.7% to peptides identified in the genome or in transcriptomes of the honeybee
Apis mellifera [
98], the remipede crustacean
Xibalbanus tulumensis [
99], the tick
Ixodes ricinus (A0A147BFN0_IXORI), and the spider
Agelena orientalis [
100] (
Figure 5,
Supplementary Figure S2.7). Recently, the above-mentioned agatoxin-like peptide from the honeybee was shown to be located in the neuroendocrine tissue (glandular part of the corpora cardiaca) and might have a function as a neuropeptide and/or ion channel modulator [
98]. It may be possible that this widespread peptide from the neuronal tissue of several major arthropod groups was convergently recruited into the venom glands of different venomous arthropods [
98].
Recruitment of agatoxin-like tissue peptides occurs not only in the venom glands of spiders, but also in the venom glands of pseudoscorpions. In such a venom gland transcriptome of
Synsphyronus apimelus, 11 transcripts have been identified that exhibit high identities with precursors from different spiders [
65]. Based on HMMs [
79], we classified ten of these peptides as belonging to the SN_02 family. The precursors are composed of a signal peptide, a pro-peptide with the dibasic “KR” motif in place of a PQM as cleavage site, and the mature peptide. For one such peptide (Sapi_DN110686_c0_g1_i1), a high identity of 74.4% was found towards the hemocyte-derived agatoxin-like peptide of
C. salei and 38.5% towards CsTx-25 (
Supplementary Figure S2.7).
2.12. SN_04, SN_05, and SN_31 Family
Five different neurotoxin precursor fragments, belonging to the SN_04 family, are present in concentrations between 0.04% (CsTx-41) and 0.78% (CsTx-42a, b, c) in the venom gland of
C. salei. With seven disulfide bridges, these (putative) neurotoxins exhibit the highest number of cysteines among venom peptides so far identified in spider venoms. They show identities between 30.3% and 40.6% with neurotoxins from ctenids, such as µ-ctenitoxin-Pn1a (P17727|TXL1_PHONI), a state-dependent inhibitor of neuronal sodium channels [
101], and ω-ctenitoxin-Pn3a (P81790|TX34_PHONI) [
102,
103], an antagonist of calcium channels (Cav2.1 and Cav2.2). Interestingly, according to UniProtKB, the last two disulfide bridges connect C11–C12 and C13–C14 in ω-ctenitoxin-Pn3a, but C11–C13 and C12–C14 in µ-ctenitoxin-Pn1a. Identities over 42% were found for CsTx-41 and CsTx-42a, b, c with putative precursors published for the pisaurid
Dolomedes fimbriatus [
104] (
Figure 6,
Supplementary Figure S2.8).
CsTx-37 (0.08%) was classified into the SN_05 family [
79]. The peptide exhibits high identities with transcripts from pisaurids (67.7%) and sparassids (41.5%). Moreover, CsTx-37 exhibits an identity of 56.7% with ω-agatoxin-Aa3a from the agelenid
A. aperta, which acts as an antagonist of synaptosomal Ca²⁺ channels [
105] (
Figure 6,
Supplementary Figure S2.9). Besides CsTx-1, we identified another putative neurotoxin, CsTx-38a, b, c (0.41%), which exhibits a strikingly long C-terminal stretch of 23 amino acid residues mainly composed of five different amino acids: Pro, Leu, Gly, Asn and Arg. This stretch is at minimum twice as long as the longest C-terminal stretch identified in peptides of other spider families and might be an innovation of
C. salei to enhance toxic activity. Identities of 40.9%–45.5% (ctenids, pisaurids, and oxyopids), and 51.5%–53% (lycosids) were calculated for these (putative) neurotoxins (
Figure 6,
Supplementary Figure S2.10).
The SN_31 family includes CsTx-24a, b, c (0.21%) and CsTx-39 (0.04%), which are characterized by five disulfide bridges. So far, an InterPro scan could not identify any known protein family membership for either peptide group. CsTx-24a, b, and c show high identities with putative neurotoxins from other spiders, such as pisaurids (68.6%–76.5%), viridasiids (52.0%–54.0%), lycosids (40.0%–41.2%), and eresids (52.9%–56.9%), one of the oldest araneomorph spider groups after haplogynes. Therefore, it is tempting to speculate that this family of (putative) neurotoxins could be widespread at least among entelegyne spiders (
Supplementary Figure S2.11).
2.13. SN_34, SN_29, SN_33, SN_19, and SN_35 Family
The SN_34 family includes peptides that exhibit the ICK motif composed of three disulfide bridges without a fourth disulfide bridge for C6–C7. CsTx-29 (0.2%) is the only
C. salei venom peptide belonging to this family. The peptide shows identities between 61.8% and 69.4% with peptides of unknown function from viridasiids and pisaurids (
Figure 6,
Supplementary Figure S2.12). Peptide families SN_29 and SN_33 are characterized by the ICK fold as cysteine framework and feature a fourth disulfide bridge between C6 and C7. With CsTx-30 (0.16%), a peptide related to the P-type Ca²⁺ channel inhibitor ω-Lsp-IA [95] and to a putative neurotoxin from
Dolomedes fimbriatus was identified in
C. salei [
104] (
Supplementary Figure S2.13).
Peptides similar to CsTx-26 (SN_33_00, 0.66%) were identified in many spider families of the RTA clade (ctenids, lycosids, pisaurids, sparassids) with high sequence identities between 63.2% and 82.9%, pointing to a functionally highly conserved structure of the mature peptide. First insights into spider venom gland transcriptomes of other spider families support the wide distribution of the conserved peptide family SN_33 (Kuhn-Nentwig and Langenegger, personal communication). Wide distribution is further supported by the high amino acid sequence identity of 80% between CsTx-26 and purotoxin 1 (PT1). PT1, which shows antinociceptive activity by the inhibition of P2X3 receptors of rat dorsal root sensory neurons [
106], was first isolated from the lycosid
Alopecosa marikovskyi. In contrast to the conserved sequences of mature peptides, signal peptide and pro-peptide exhibit lower sequence identities, between 33.3% and 64.9%, and might be more spider family-specific (
Figure 6,
Supplementary Figure S2.14).
Precursors of CsTx-33 (SN_19_33, 0.4%) (
Supplementary Figure S2.15) and CsTx-35a, b (SN_35_00, 0.12%) (
Supplementary Table S1, Supplementary Figure S2.16) both exhibit the dibasic recognition motif “KR”, but only CsTx-33 exhibits an additional PQM between the end of the signal peptide and the first cysteine of the mature peptide. Dibasic motifs have been postulated to serve as pro-peptide cleavage sites in some neurotoxin precursors of mygalomorph spiders. Some peptides of
Trittame loki (W4VS08) [
107] and
Haplopelma hainanum (D2Y299) show the cleavage motif “KR” [
108,
109], and some of
Macrothele gigas (P83560) [
110] and
Atrax robustus (P83580) [
111] the cleavage motif “RR”. However, some toxins also feature a PQM downstream of the dibasic motif (Figure 7A). The presence of multiple known cleavage motifs at possible pro-peptide cleavage sites shows the importance of proteomic data for accurate determination of the actual cleavage site. Proteomic top-down analysis revealed that, in the case of CsTx-33, the PQM is used as the pro-peptide cleavage site. In contrast, CsTx-35 and some peptides of
H. hainanum are cleaved after the dibasic motif “KR” (
Figure 7A) as shown by mass-spectrometry and Edman degradation [
112], respectively. Further investigations are needed to explain the observed specificity in pro-peptide cleavage. However, we observed an evident similarity between the nucleotide sequences of the non-dibasic motif containing CsTx-9, -10, -11, and CsTx-33 in the region of the pro-peptide-mature peptide junction (
Figure 7B), possibly indicating an evolutionary relationship of these transcript parts. The only mutations within the first 21 N-terminal nucleotides of the mature peptides of CsTx-33, -10, and -11 are two point-mutations causing the dibasic motif in CsTx-33.
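The ambiguity described above, with several candidate cleavage motifs in one precursor, can be illustrated with a simple motif scan. This sketch rests on two assumptions that are not stated in this section: that a PQM is a quadruplet with Arg at position −1 and at least one Glu at positions −2 to −4 (as we understand the usual definition; “EVQR” fits it), and that cleavage occurs directly after the motif. The toy precursor is hypothetical.

```python
def find_cleavage_candidates(precursor: str):
    """List (cut_position, motif) pairs; cut_position = residues N-terminal of the cut."""
    hits = []
    # Dibasic motifs "KR" and "RR".
    for i in range(len(precursor) - 1):
        pair = precursor[i:i + 2]
        if pair in ("KR", "RR"):
            hits.append((i + 2, "dibasic " + pair))
    # PQM (assumed definition): Arg at -1, at least one Glu at -2, -3, or -4.
    for i in range(len(precursor) - 3):
        quad = precursor[i:i + 4]
        if quad[3] == "R" and "E" in quad[:3]:
            hits.append((i + 4, "PQM " + quad))
    return sorted(hits)


# Hypothetical precursor fragment with a dibasic motif followed by a PQM.
candidates = find_cleavage_candidates("MKLSAKRSEEVQRCIPK")
print(candidates)
```

A scan of this kind only enumerates candidate sites. Which of them is actually used, as for CsTx-33 versus CsTx-35, has to be settled by proteomic data.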
Top-down proteomics of CsTx-35 revealed another post-translational modification of the CsTx-35 precursor. The last twelve C-terminal amino acid residues are post-translationally removed. This post-translational modification is comparable to the processing of the precursors of CsTx-8, 12, and 13 by the PQM-protease and a so far unknown carboxypeptidase [
25] (
Figure 6). Remarkably, mature CsTx-35 showed 92% identity to LDTF-11, a putative neurotoxin from
Dolomedes fimbriatus [
104], whereas their signal peptides and pro-peptides showed only 71.4% identity. The mature chains of CsTx-35 and CsTx-26 are less variable than their signal and pro-peptides when compared with the corresponding peptides of other related spiders. These findings are in contrast to the present opinion that the predominant mutation sites should be in the mature peptides when comparing peptides within a peptide family of one species. However, Kozlov and coworkers showed for putative neurotoxin precursors of D. fimbriatus that the most variable region is the pro-peptide region, followed by the signal peptide and N-terminal parts of the mature peptides [
104].
2.14. SN_42 and SN_44 Family
A high identity of 70.3% was found between CsTx-40 (0.08%) and omega-agatoxin-1A (agelenids), a heterodimeric neurotoxin and selective L-type calcium channel blocker (Cav/CACNA1) [
113,
114]. The disulfide bridge pattern for the present 10-Cys-containing peptides has not yet been solved. Interestingly, this cysteine pattern is widespread within spiders of the RTA clade and can be found in pisaurids (73.0% identity), viridasiids (76.8%), thomisids (61.1%), and lycosids (80.0%) (
Supplementary Figure S2.17). Comparable to the two-chain neurotoxins CsTx-8, 12, 13 and omega-agatoxin-1A, CsTx-40 exhibits in its C-terminal sequence an inverted PQM as well as a PQM. The post-translational modification of this peptide by a PQM protease produces a heterodimeric structure as shown previously [
25]. This also holds true for the related sequences in the above-mentioned spider families. The resulting long chain comprises 10 amino acid residues C-terminal to the last Cys residue. This C-terminal part is about twice as long as the corresponding sequences of CsTx-8, -12, and -13. Such long chains might be highly flexible and may interact with other peptides, resulting in increased toxic activity, comparable to CsTx-8, CsTx-12, and CsTx-13.
CsTx-27 (0.7%) is structurally comparable to CsTx-40 but C-terminally misses two cysteine residues. Related precursors have been identified only in lycosids (54.0%–58.1% identity). As a result of top-down proteomics, post-translational modification has been identified for CsTx-27. Here again, the C-terminal Arg residue is removed by an unknown carboxypeptidase [
25] (
Supplementary Figure S2.18).
2.15. SN_20 and SN_32 Family
Precursors corresponding to these families are characterized by the missing pro-peptide region, which seems to be a “requisite” for most so far described (putative) neurotoxins of mygalomorph and araneomorph spiders [
115]. CsTx-20 (SN_20_01, 0.08%), CsTx-21a, b, c, d, e, f, g (SN_32_01, 0.62%), and CsTx-22a, b, c (SN_32_02, 0.16%) lack these pro-peptides and are present only in low abundances in the venom. In contrast to (putative) neurotoxins of other peptide families, the mature peptides of the SN_20 and SN_32 families are more anionic, with net charges only between −4 and +1. All these peptides possess five disulfide bridges (
Figure 8).
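Net charges of this kind can be approximated by counting charged side chains. The sketch below is a deliberately crude estimate: it assumes +1 for each Lys and Arg and −1 for each Asp and Glu, and it ignores His, the termini, C-terminal amidation, and pKa effects. It is not necessarily the method used for the values above, and the toy sequence is hypothetical.

```python
def net_charge(seq: str) -> int:
    """Approximate net charge: (Lys + Arg) minus (Asp + Glu)."""
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    return positive - negative


# Hypothetical toy peptide: 1 Lys + 1 Arg versus 2 Asp + 2 Glu.
charge = net_charge("ACDEEDKCGR")
print(charge)
```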
With 86 amino acid residues and a molecular mass of 9.9 kDa, CsTx-20 is the largest peptide that we have purified from the venom. Interpro sequence analysis showed no relationship to any protein family and no domain could be identified. Disulfide bridge connectivity was determined as C1–C4, C2–C5, C3–C7, C6–C9, and C8–C10 [
85], which corresponds to the disulfide pattern of black mamba intestinal toxin 1 (MIT1) [
116]. In contrast to MIT1 (only 23.7% identity), CsTx-20 lacks the N-terminal AVIT sequence, characteristic for a part of the prokineticin domain that is essential for biological activity, e.g., pain sensation and stimulation of smooth muscle contraction [
86,
117]. BLAST results show a broad distribution of CsTx-20 homologs in araneomorph spiders of the RTA clade (pisaurids, 89.5% identity; sparassids, 67.1%), araneids (67.1%), and eresids (68.2%), but also in scorpions (
Hadrurus spadix, 31–35.4%). Identifying similar peptides in spider and scorpion venoms points to a common ancient precursor or a convergent evolution in both arachnid orders. So far, no biological activity is described for these peptides isolated from spider and scorpion venom (
Supplementary Figure S2.19).
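The connectivity notation used above (C1–C4, C2–C5, C3–C7, C6–C9, C8–C10) refers to cysteines by their order in the sequence. A minimal sketch of how such a pattern maps onto residue positions is given below. The toy sequence is hypothetical and is not the CsTx-20 sequence; the function name is illustrative.

```python
# Disulfide connectivity reported for CsTx-20, as ordinal cysteine numbers.
CSTX20_PATTERN = [(1, 4), (2, 5), (3, 7), (6, 9), (8, 10)]


def map_disulfides(seq: str, pattern):
    """Translate ordinal cysteine pairs into 1-based residue positions."""
    cys_positions = [i + 1 for i, aa in enumerate(seq) if aa == "C"]
    used = sorted(n for pair in pattern for n in pair)
    if used != list(range(1, len(cys_positions) + 1)):
        raise ValueError("pattern must use every cysteine exactly once")
    return [(cys_positions[a - 1], cys_positions[b - 1]) for a, b in pattern]


# Hypothetical toy sequence with ten cysteines at odd positions 1..19.
bridges = map_disulfides("CACACACACACACACACAC", CSTX20_PATTERN)
print(bridges)
```

The check that every cysteine is used exactly once reflects the fact that ten cysteines form exactly five bridges.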
Interpro analysis shows that CsTx-22a, b, c comprise the prokineticin domain (IPR023569) nearly over the whole length of the peptides (amino acid residues 5–59, CsTx-22) but the crucial N-terminal AVIT sequence part, responsible for its biological activity, is lacking. The prokineticin domain is identified in several putative toxin precursors from different araneomorph and mygalomorph spiders, but also, surprisingly, from ticks. Sequence identities between CsTx-22a, b, c and such peptides are medium to high: for araneomorph spiders 43.9%–7.6%, for mygalomorph spiders 41.8%–49.2%, and for ticks 37.5%–39.1%. Sequence alignments even show 22.4%–33.4% identity to the prokineticin Bm8f from the toad
Bombina maxima and to MIT1 from the elapid black mamba (
Supplementary Figure S2.20).
CsTx-21a, b, c, d, e, f, g are classified as belonging to the atracotoxin family (IPR020202). This family classification is based on ACTX–Hvf17 [
118] and six more MIT1-like ACTX orthologs isolated from the venom of the mygalomorph funnel web spiders
Hadronyche versuta and
H. infensa [
86]. They share sequence homologies with the above-mentioned MIT1 and Bm8f, but no pharmacological activity or biological function in the venom is known. Mature CsTx-21 isoforms show amino acid sequence identities to peptides of other araneomorph spiders in the range of 51.6%–60.0% and to those of mygalomorph spiders in the range of 38.1%–45.3% (
Supplementary Figure S2.21).
Taking all arguments into account, it is most likely that CsTx-21a, b, c, d, e, f, g, and CsTx-22a, b, c can be classified as peptides that might exhibit the ancestral disulfide-directed beta-hairpin (DDH) domain as shown for the nontoxic atracotoxin-Hvf17 (ACTX–Hvf17) identified in the atracid
Hadronyche infensa. The corresponding amino acid consensus sequence is defined as CX₅₋₉CX₂[G or P]X₂CX₆₋₁₉C, which is in accordance with the amino acid consensus sequence CX₄₋₅CX₂[G]X₂CX₈C of isoforms of CsTx-21 and CsTx-22. Furthermore, loop 3 of this domain is highly conserved as C–GXGXC–C, comparable to loop 3 of MIT1-like ACTXs [
86]. Together with the determined disulfide bridge pattern of CsTx-20, it seems that CsTx-20, 21, and 22 are the only peptides in the venom of
C. salei that exhibit a DDH fold (Colipase MIT1-like fold), hypothesized to be the evolutionary precursor of the ICK motif [
82]. Identifying related peptides to CsTx-20, 21, and 22, not only in araneomorph and mygalomorph spider venoms [
107], but also in the venom of scorpions [
119], pseudoscorpions [
65], and in the salivary glands of ticks [
120], may give a clue that these peptides may be one of the first compounds recruited into venom and salivary glands. Unfortunately, their targets still need to be elucidated.
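The two consensus sequences quoted above translate directly into regular expressions, which makes it easy to test a sequence for the motif. In this sketch, X is assumed to stand for any residue (written as `[A-Z]`), and the toy sequences are hypothetical constructs rather than real peptides.

```python
import re

# General DDH consensus and the C. salei CsTx-21/22 consensus as quoted in the text.
DDH_CONSENSUS = re.compile(r"C[A-Z]{5,9}C[A-Z]{2}[GP][A-Z]{2}C[A-Z]{6,19}C")
CSTX_21_22_CONSENSUS = re.compile(r"C[A-Z]{4,5}C[A-Z]{2}G[A-Z]{2}C[A-Z]{8}C")


def has_motif(seq: str, pattern) -> bool:
    """True if the consensus pattern occurs anywhere in the sequence."""
    return pattern.search(seq) is not None


def toy_sequence(first_loop: int) -> str:
    """Hypothetical sequence with a first loop of the given length."""
    return "C" + "A" * first_loop + "C" + "AA" + "G" + "AA" + "C" + "A" * 8 + "C"


five = toy_sequence(5)
four = toy_sequence(4)
print(has_motif(five, DDH_CONSENSUS), has_motif(five, CSTX_21_22_CONSENSUS))
print(has_motif(four, DDH_CONSENSUS), has_motif(four, CSTX_21_22_CONSENSUS))
```

Read literally, a first loop of five residues satisfies both patterns, whereas a first loop of four residues satisfies only the C. salei pattern, because it lies just below the 5–9 range of the general consensus.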
2.16. Defensin-Like Peptide
We identified a defensin-like peptide in the venom gland, with a so far unknown function, which we named defensin-2. Transcripts coding for this peptide have not been identified in our
C. salei hemocyte transcriptome, indicating that defensin-2 is a venom gland-specific peptide. Defensin-2 shows 54% sequence identity to defensin-1, a peptide from
C. salei that was shown to be expressed in ovaries, subesophageal nerve mass, hepatopancreas, hemocytes, and muscle tissue. Neither reverse-transcriptase-PCR nor 454-sequencing showed any expression in the venom glands of the spider [
121]. Illumina sequencing, however, revealed defensin-1 and defensin-2 homolog transcripts in the venom glands of
Cupiennius getazi, a sister species of
C. salei. It is tempting to assume that this inconsistency is due to the higher read depth of Illumina sequencing compared to 454-sequencing, which allows the detection of very low-abundance transcripts that may originate from a few hemocytes present in dissected venom glands. The amino acid differences between the hemocyte defensin-1 peptides of both sister species are small (91.9% identity). The same holds true for the defensin-2 peptides from venom glands (97.7%) (
Figure 9).
Defensins so far identified in other arachnids show higher sequence identities to
C. salei defensin-1 than to the venom gland-specific defensin-2. BmKDfsin4 [
122], a classical defensin identified in the scorpion
Mesobuthus martensii, exhibits a conserved cystine-stabilized α/β structural fold (C1–C4, C2–C5, C3–C6), which can be likewise assigned to spider defensins. In fact, BmKDfsin4 shows inhibitory activity against Gram-positive bacteria, and potassium channel current-blocking activity. It is hypothesized that scorpion defensins and some scorpion neurotoxins originated from one precursor [
83,
122]. To the best of our knowledge, this is the first time that a venom gland-specific defensin has been identified in spider venom. Further investigations are necessary to elucidate the recruitment and possible neofunctionalization of defensins in terms of antimicrobial and potassium channel-blocking activities of these spider venom gland peptides.
2.18. The Dual Prey-Inactivation Strategy of Spiders
Previously published data on low molecular mass compounds [
92,
126] and cytolytic peptides (cupiennins) [
127,
128], together with the proteins and (putative) neurotoxins presented here, open a holistic view of the synergistic mode of action of
C. salei venom compounds after injection into a prey or aggressor. Analyzing all interacting compounds, we hypothesize a specific and an unspecific prey inactivation pathway, resulting in a dual prey-inactivation strategy (
Figure 10).
Compounds of the specific pathway are neurotoxins, low molecular mass compounds, a highly active hyaluronidase, phospholipase A2 and the cupiennins. The unspecific pathway includes α-amylase, CRISPs, angiotensin converting enzyme, cystatin and IGFBP-rP1. In the specific pathway, a great variety of neurotoxins act synergistically [
26], but also with low molecular mass compounds and cupiennins, all affecting ion channel targets in the nervous system and in muscle tissues, finally resulting in paralysis, convulsions, and death. The spreading of these toxins into the tissue is supported by hyaluronidase, phospholipase A2, and the cupiennins through destruction of negatively charged membrane types. The unspecific inactivation pathway is characterized by different enzymes, which play a central part by interfering with the regulation of important metabolic pathways, thus unbalancing the homeostasis of an organism. The main actors are α-amylases, CRISPs, and angiotensin-converting enzymes. Furthermore, some of the cupiennins inhibit the formation of nitric oxide by neuronal nitric oxide synthase, which dramatically disturbs numerous processes using nitric oxide as a neurotransmitter [
129]. The dual prey-inactivation strategy of spiders reduces the development of resistance against single venom compounds and the risk of losing prey due to escape.