Proteo-Transcriptomic Analysis Identifies Potential Novel Toxins Secreted by the Predatory, Prey-Piercing Ribbon Worm Amphiporus lactifloreus

Nemerteans (ribbon worms) employ toxins to subdue their prey, but research thus far has focused on the small-molecule components of mucus secretions and few protein toxins have been characterized. We carried out a preliminary proteotranscriptomic analysis of putative toxins produced by the hoplonemertean Amphiporus lactifloreus (Hoplonemertea, Amphiporidae). No variants were found of known nemertean-specific toxin proteins (neurotoxins, cytotoxins, parbolysins or nemertides) but several toxin-like transcripts were discovered, expressed strongly in the proboscis, including putative metalloproteinases and sequences resembling sea anemone actitoxins, crown-of-thorns sea star plancitoxins, and multiple classes of inhibitor cystine knot/knottin family proteins. Some of these products were also directly identified in the mucus proteome, supporting their preliminary identification as secreted toxin components. Two new nemertean-typical toxin candidates could be described and were named U-nemertotoxin-1 and U-nemertotoxin-2. Our findings provide insight into the largely overlooked venom system of nemerteans and support a hypothesis in which the nemertean proboscis evolved in several steps from a flesh-melting organ in scavenging nemerteans to a flesh-melting and toxin-secreting venom apparatus in hunting hoplonemerteans.


Introduction
Nemertea is a phylum of unsegmented worms (ribbon worms) featuring 1300 mostly marine species. It belongs to the disputed superphylum Lophotrochozoa, which contains one third of all marine animals, including sensu stricto polychaetes, mollusks, brachiopods and phoronids [1][2][3]. Nemertean toxins were first discovered in the epidermal mucus [2] and research has focused on small-molecule components, such as tetrodotoxin and other alkaloids. Few peptide or protein components have been isolated, expressed and characterized in terms of structure and activity [2]. Comprehensive sequence, structural and pharmacological data are only available for a handful of proteins from three species in the class Pilidiophora: Cerebratulus lacteus (cytotoxin A-III and neurotoxins B-II and B-IV), Parborlasia corrugatus (parbolysin), and Lineus longissimus (nemertides α-1, α-2 and β-1). These toxins have been discussed in a comprehensive review [2].
The limited understanding of nemertean toxins in part reflects the absence of a centralized venom system with a distinct venom gland and venom duct that can be milked or dissected easily. Instead, glandular cells that are thought to secrete toxins are distributed throughout the proboscis [15]. Furthermore, nemerteans produce a characteristic mucus layer (as with many marine organisms), and this also contains toxins, which are probably delivered passively as a poison or toxungen [16,17]. These mucosal toxins may be utilized simultaneously for defense and predation without a clear separation of roles.
Few studies have attempted to mine genome or transcriptome data to identify known nemertean toxins and potential novel variants. A whole-animal transcriptomic toxin-profiling study of nine nemertean species revealed several transcripts that match known toxin families from other venomous animals, especially sequences similar to plancitoxin from the crown-of-thorns sea star [18]. These toxins were identified in all nine species, representing all three nemertean classes: Pilidiophora (Cerebratulus marginatus, Lineus lacteus, L. longissimus, and Lineus ruber), Palaeonemertea (Cephalothrix hongkongiensis, Cephalothrix linearis, and Tubulanus polymorphus) and Hoplonemertea (Malacobdella grossa and Paranemertes peregrina). However, the only transcripts identified representing known nemertean-specific toxin candidates were those resembling cytotoxin A-III in C. marginatus, L. longissimus, L. ruber, and L. lacteus [18]. Interestingly, cytotoxin A-III is also the only nemertean-specific toxin class that was identified in the first nemertean whole genome sequence (Notospermus geniculatus), where the family appears to have undergone evolutionary expansion [3].
Predicting toxins based on genome or transcriptome data is unreliable without corresponding proteome data because transcriptome-only data in particular increase the number of false positives and therefore overestimate putative toxin matches [19,20]. However, integrated proteomic and genomic/transcriptomic data (proteogenomics and proteotranscriptomics) are not yet available for nemertean species. Here we present the first proteotranscriptomic analysis of putative toxins from the hoplonemertean A. lactifloreus, which inhabits lower shores (for example, under rocks or stones along European coasts) and was collected in the tidal zone of the German Wadden Sea on Sylt. Samples of proteins secreted from the proboscis (PrS), epidermis (EpS), and entire mucus layer (MuS) were compared to the proboscis-specific transcriptome (posterior and anterior proboscis, including the stylet apparatus) to assess the abundance of toxin-related transcripts and corresponding proteins, focusing on the proboscis as the main structure potentially used for toxin secretion.

No Nemertean-Specific Toxin Transcripts Are Present in the A. lactifloreus Proboscis Transcriptome
We analyzed putative toxin sequences in the A. lactifloreus proboscis transcriptome (Table 1). These were inferred from 41,377 open reading frames (ORFs) predicted with Transdecoder v5.0.2 [21] based on 96,851 transcripts that were assembled with Trinity v2.8.4 [22]. A BLASTP search (Basic local alignment search tool with a relaxed e-value of ≤0.001) was used to screen a manually curated database of all known nemertean-specific toxins (Additional File 1) against the A. lactifloreus proboscis transcripts. Additionally, the only published nemertean proboscis transcriptome (N. geniculatus [3]), which was assembled and processed in the same manner as the transcriptome of A. lactifloreus, was analyzed in parallel to test whether nemertean-specific toxins are generally expressed in the proboscis.
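The e-value screen described above can be illustrated with a minimal sketch. It assumes the BLASTP results were exported in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which the text does not state. The function name `best_hits`, the sequence identifiers and the example rows are hypothetical and serve only to show how a relaxed threshold of e ≤ 0.001 might be applied.

```python
import csv
import io

# Standard 12 columns of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-3):
    """Return, per query, the highest-bitscore hit with evalue <= max_evalue."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] > max_evalue:
            continue  # fails the relaxed e-value threshold
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Invented rows for illustration only (not data from the study).
example = "\n".join([
    "\t".join(["toxin_A", "orf_1", "45.0", "60", "30", "1",
               "1", "60", "5", "64", "2e-10", "58.2"]),
    "\t".join(["toxin_A", "orf_2", "38.0", "55", "32", "2",
               "3", "57", "9", "63", "5e-4", "35.0"]),
    "\t".join(["toxin_B", "orf_3", "30.0", "40", "26", "2",
               "1", "40", "2", "41", "0.5", "20.1"]),
])

hits = best_hits(io.StringIO(example))
print(sorted(hits))
```

Keeping only the best hit per query is a simplification made for this sketch; the full hit list may be preferable when several paralogous transcripts are expected.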
The A. lactifloreus proboscis transcriptome did not contain any transcripts with significant similarity to the known nemertean-specific toxin groups: nemertide α, neurotoxins B-II and B-IV, parbolysin or cytotoxin A-III. However, in the N. geniculatus proboscis transcriptome we found sequences similar to nemertides α-1 and α-2 and parbolysin (Additional Files 2 and 3) that were not identified in the original study on N. geniculatus [3]. Interestingly, cytotoxin A-III was not found in the N. geniculatus proboscis transcriptome, despite the presence of a corresponding gene in the recently published N. geniculatus genome assembly, where it appears to have expanded and is predominantly expressed in egg tissue [3]. We were unable to obtain sufficient protein for mass spectrometry from A. lactifloreus proboscis samples and it was, therefore, not possible to validate proboscis-expressed transcripts against the corresponding proteomic data.

Metalloproteinase M12 and Actitoxin-Like Transcripts Are the Most Abundant Putative Toxin Transcripts in the Proboscis of A. lactifloreus
In addition to nemertean-specific toxins, we mined the A. lactifloreus proboscis transcriptome for toxin proteins identified in other animals. To reduce false-positive matches, we compared the 1335 transcripts from the resulting BLASTP search against the ToxProt database of known toxins (e ≤ 0.001) versus normal physiological variants from SwissProt (e ≤ 0.001). Several hits against known toxins were identified as matches (Table S1). These were stringently filtered to exclude false positives [19,20]. We only accepted transcripts (1) with similar or higher e-values and bitscore values in their ToxProt annotation compared to the SwissProt matches and (2) that were validated in comparative alignments with the corresponding toxin and, if available, non-toxin sequences. We also applied a minimum expression threshold of two transcripts per million (TPM ≥ 2).
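The acceptance criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions. The text does not quantify "similar", so the `tolerance` parameter (here 0.9, meaning the ToxProt bitscore must reach at least 90% of the SwissProt bitscore) is hypothetical. The sketch compares bitscores only, and the manual alignment check is represented as a boolean flag because it cannot be automated here. All identifiers and values are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    orf_id: str
    toxprot_bitscore: float
    swissprot_bitscore: Optional[float]  # None if there is no SwissProt hit
    tpm: float
    alignment_ok: bool  # outcome of the manual alignment inspection


def passes_filter(c: Candidate, tolerance: float = 0.9,
                  min_tpm: float = 2.0) -> bool:
    """Apply the expression, alignment and bitscore criteria to one candidate."""
    if c.tpm < min_tpm:
        return False  # below the minimum expression level
    if not c.alignment_ok:
        return False  # rejected during comparative alignment
    if c.swissprot_bitscore is None:
        return True  # no competing non-toxin annotation
    # "Similar or higher": tolerance is a hypothetical way to encode "similar".
    return c.toxprot_bitscore >= tolerance * c.swissprot_bitscore


# Invented candidates for illustration only.
candidates = [
    Candidate("orf_a", 120.0, 110.0, 50.0, True),   # ToxProt hit is stronger
    Candidate("orf_b", 60.0, 200.0, 80.0, True),    # SwissProt hit clearly stronger
    Candidate("orf_c", 95.0, 100.0, 1.5, True),     # expression too low
    Candidate("orf_d", 95.0, 100.0, 10.0, True),    # similar bitscores
    Candidate("orf_e", 150.0, None, 30.0, False),   # failed alignment check
]

kept = [c.orf_id for c in candidates if passes_filter(c)]
print(kept)
```

Making the tolerance explicit means the choice can be reported and varied, which may help when judging how sensitive the final candidate list is to the definition of "similar".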
Four major toxin-related groups were present in the A. lactifloreus proboscis transcriptome (Table 1). The first and most abundant group comprised proteinases, including metalloproteinase M12 candidates with particularly high expression levels. The second group included actitoxin-like transcripts similar to short cnidarian actitoxins, which feature a cysteine-rich Kunitz-BPTI domain, and larger plancitoxin-like transcripts identified in other nemertean transcriptomes [2,18]. This group also included transcripts related to three clades of the inhibitor cystine knot (ICK)/knottin family, specifically conotoxin-like proteins featuring 4-C scaffolds (putative R-superfamily conotoxins [23] known from Villepin's cone, Conus villepinii), conotoxin-like proteins featuring 8-C scaffolds (similar to xibalbin-1 ICK sequences in remipede and several spider-like venoms [24,25]) and ICK-like calcium channel inhibitors from scorpions. The actitoxin-like transcripts were the most abundant, followed by the plancitoxin-like and ICK-like transcripts. The third group comprised non-protease enzymes, including galactose-specific lectin and calglandulin transcripts that were particularly abundant. Finally, the fourth group accounted for the remaining proteins, including several growth factor-like transcripts with low expression levels. A subset of protein families expressed in the proboscis transcriptome was also identified in the mucus proteome, which supports our findings concerning the nemertean-specific toxins of N. geniculatus (Table 1 and Figure 2).

Table 1. The inferred protein families of toxin candidates from the A. lactifloreus proboscis transcriptome. The original taxon in which the family was identified is named and, if known, its activity is described. The numbers of transcripts that survived our filter criteria are given (similar or higher e-values and bitscore values compared to SwissProt annotations and in alignments). Numbers in brackets indicate the overall number of transcripts annotated via ToxProt (Table S1). The expression levels in transcripts per million (TPM) are only provided for validated transcripts. Protein families in bold were also identified in the mucus proteome and asterisks (*) indicate families with at least one identical transcript in both the proboscis transcriptome and mucus proteome (Figure 2).
[Table body not recovered. Surviving column headers: ToxProt Annotation; Protein Family, "Actual" Scaffold; Original ...]

Proteinases Dominate the Skin and Mucus Secretions Accompanied by Mucins, Enzymes and Putative Toxins
The epidermal secreted proteome (EpS) contained mostly degraded proteins, with two exceptions, making further characterization difficult (Table S1, see Additional File 4 for complete Mascot output). Most of our protein data were therefore derived from the mucus secreted proteome (MuS). Some of the nemertean-specific toxins identified in the mucus were also clearly expressed in the proboscis. We therefore used the proboscis tissue as a specific database and SwissProt as a nonspecific database for the proteome search. Complementary A. lactifloreus body transcriptomes could not be prepared.
Recent studies have shown that the exclusive analysis of proteome-supported transcripts can avoid the over-interpretation of false-positive hits from transcriptome-only data [19,20]. We therefore restricted our downstream analysis to the transcripts also represented in the proteome data (Table S2). Furthermore, we only included transcripts identified in the EpS and MuS proteomic datasets with Mascot values ≥ 24 based on at least two matching peptides (Table S2; see Additional Files 4 and 5 for complete Mascot output). We therefore retrieved 314 transcripts with predicted ORFs that survived these stringent filtering criteria (Table S2). Full annotation tables and corresponding sequences for all predicted ORFs using BLASTP (e ≤ 0.001) against nemertean-specific toxins, SwissProt, ToxProt, antimicrobial peptides, and Interpro scan v5 [26] are provided in the additional data.
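The proteome-support criterion above (Mascot score ≥ 24 and at least two matching peptides) can be sketched as follows. The input layout, the function name `proteome_supported` and the peptide strings are hypothetical. Counting distinct peptide sequences rather than spectra is an assumption of this sketch, and no Mascot export format is implied.

```python
def proteome_supported(matches, min_score=24.0, min_peptides=2):
    """Return sorted ORF ids meeting the score and distinct-peptide thresholds.

    `matches` is an iterable of (orf_id, mascot_score, peptide_sequences).
    """
    kept = set()
    for orf_id, score, peptides in matches:
        # Count distinct sequences so repeated spectra are not counted twice.
        if score >= min_score and len(set(peptides)) >= min_peptides:
            kept.add(orf_id)
    return sorted(kept)


# Invented matches for illustration only.
example_matches = [
    ("orf_1", 57.0, ["AAK", "GLR", "TTK"]),
    ("orf_2", 31.0, ["VVK"]),            # only one peptide
    ("orf_3", 18.0, ["LLR", "MMK"]),     # score below threshold
    ("orf_4", 24.0, ["QQK", "NNR"]),     # exactly at the threshold
    ("orf_5", 40.0, ["SSK", "SSK"]),     # same peptide matched twice
]

supported = proteome_supported(example_matches)
print(supported)
```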
Many toxins arise from the duplication of genes encoding proteins and peptides with normal physiological functions, resulting in high similarities between the neofunctionalized toxin and its ancestral protein [17,27,28]. To reduce false-positive matches, we compared the BLASTP results of the 314 candidate transcripts (e ≤ 0.001) against nemertean-specific toxins, known toxins (ToxProt), and known proteins including normal physiological variants (SwissProt). We accepted the transcripts as putative toxins only if the bitscore values of the nemertean toxin or ToxProt-derived matches were similar to or higher than those of the SwissProt matches. The final 23 transcripts surviving this selection process represented 15 protein families in four major categories with different biological functions and expression levels (Figure 2).
The putative toxin cocktail defined by transcriptome data and refined by proteomic validation is dominated by a primary group of proteinase transcripts, particularly those encoding M12A and M12B metalloproteinases and serine proteinases. The second group of transcripts encodes other enzymes, including abundant chitinase and serpin transcripts (Figure 2). The third group encodes mucus-related proteins (mucins) that have also been found in amphibian skin secretions [29]. The fourth group encodes four toxin protein families: (1) IGFBP-like proteins (insulin-like growth factor binding proteins), which are found in the venoms of invertebrates, such as scorpions and remipedes [24,30]; (2) plancitoxins, previously identified in the crown-of-thorns starfish, a venomous echinoderm [31,32]; (3) actitoxins, previously identified in sea anemones [33]; (4) ShK-like proteins (Stichodactyla-like proteins) that are also major venom components in sea anemones [34]. The expression levels of each transcript are presented in relative values as a pie chart (Figure 2A) and also in absolute TPM values as a bar chart (Figure 2B). Interestingly, several representatives of protein families that were identified in the proboscis transcriptome also appear to be secreted into the mucus proteome. However, some of the more abundant transcripts from these same families are not present in the mucus proteome (Figure 2; Table 1 and Table S1).
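The relationship between the absolute TPM values and the relative values shown in a pie chart can be sketched as a simple normalization. The group names follow the four categories above, but the TPM numbers are invented placeholders and do not reproduce the values in Figure 2. The function name `relative_shares` is hypothetical.

```python
def relative_shares(tpm_by_group):
    """Convert absolute TPM totals per group into fractions of the summed TPM."""
    total = sum(tpm_by_group.values())
    if total <= 0:
        raise ValueError("total TPM must be positive")
    return {group: tpm / total for group, tpm in tpm_by_group.items()}


# Invented TPM totals for illustration only (not the values from Figure 2).
example_tpm = {
    "proteinases": 600.0,
    "other enzymes": 250.0,
    "mucins": 100.0,
    "toxin families": 50.0,
}

shares = relative_shares(example_tpm)
for group, share in shares.items():
    print(f"{group}: {share:.1%}")
```

Relative shares depend on which transcripts are included in the denominator, so they describe composition within the filtered set rather than expression across the whole transcriptome.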
Our results support a hypothesis in which the identified actitoxin-like and plancitoxin-like proteins represent two potential nemertean-typical families in the toxin arsenal of this phylum. Variants of both toxin families were also detected in the proboscis transcriptome of N. geniculatus and in the full-body transcriptome assemblies of seven other species from all three nemertean clades [2,18]. We name these new putative toxin classes U-nemertotoxin-1 and U-nemertotoxin-2, following the convention of Undheim and colleagues [35,36]. The N. geniculatus and A. lactifloreus U-nemertotoxin-1 variants are distinct, separating into different clades (Figure 3A). Furthermore, U-nemertotoxin-2 appears to have diversified into three different clades: clade I is similar to PI actitoxins from sea anemones and consists only of A. lactifloreus sequences; clade II consists only of N. geniculatus sequences; clade III features sequences from both nemerteans, most similar to U-actitoxins from the sea anemone Actinia viridis (Figure 3B).
Mar. Drugs 2020, 18, x FOR PEER REVIEW

Several Secreted Proteins Are Strongly Expressed but Remain Mostly Uncharacterized
Several of the proteins we identified were strongly expressed but only rudimentary annotations are available from InterPro scan (e.g., indicating whether or not a signal peptide or non-cytoplasmic domain is present). This group included the most abundant transcript in the proboscis transcriptome (DN187_c0_g1_i14.p1). None of these sequences generated hits in the NCBI non-redundant database. We do not discuss these matches further, given the lack of additional information, but it is possible that some of them represent novel toxin proteins (Table 2).

No Putative Antimicrobial Peptides Were Identified in the A. lactifloreus Proteotranscriptome
Marine organisms often produce strong antimicrobial secretions on the epidermis as protection against pathogens, so nemerteans offer a novel source of potential antimicrobial peptides [38,39]. Given that A. lactifloreus is a littoral species and probably more exposed to microbial organisms, we used BLASTP (e ≤ 0.001) to screen our proteome data against the Antimicrobial Peptide Database (APD3) [40]. We did not find any matches, which suggests that A. lactifloreus does not secrete any proteins or peptides that are similar to known antimicrobial peptides.

Are Known Nemertean Toxins Taxon-Specific?
Nemerteans (ribbon worms) use toxins for predation and defense, similar to annelids and mollusks (also representing the superphylum Lophotrochozoa) and more distant phyla, such as echinoderms and cnidarians (Figure 4). Research on nemertean toxins has focused on small-molecule compounds in the mucosal secretions, whereas little is known about larger toxin proteins secreted by the epidermis and proboscis. We addressed this issue by comparing the proboscis transcriptome and mucus proteome of A. lactifloreus in order to cross-validate the toxin-related transcripts we identified. Our proteotranscriptomic analysis revealed no variants of known nemertean-specific toxin proteins (neurotoxins, cytotoxins, parbolysins, or nemertides), which were discovered in three non-hoplonemertean species: C. lacteus, P. corrugatus, and L. longissimus (Figure 4). These proteins may therefore be restricted to particular species, genera, or other taxa, rather than representing typical nemertean toxins. Interestingly, expressed variants of two of the described toxins (parbolysin and both nemertides) are present in the proboscis transcriptome of N. geniculatus (class Pilidiophora, order Heteronemertea), whereas they are absent from the proboscis transcriptome of A. lactifloreus. Hoplonemerteans, such as A. lactifloreus, have probably recruited toxins that differ from those identified in the limited range of nemertean species sampled thus far. Potential new neurotoxin candidates include the three distinct groups of ICKs in the A. lactifloreus proboscis transcriptome, similar to those isolated from cone snails (C. villepinii), remipedes/spiders, and scorpions, respectively. The A. lactifloreus sequences are new variants that provide insight into the evolutionary diversity of ICKs among different invertebrate taxa. However, given the lack of complementary proteome data and the low abundance of transcripts, we do not discuss these candidates further.
Figure 4. A cladogram illustrating the relationships among venomous, marine animal lineages (emphasis on nemerteans), showing the distribution of toxin protein classes. A green x indicates a known characterized toxin, a red x indicates proteotranscriptomic matches from A. lactifloreus, and a black x indicates toxin candidates supported by transcriptome data alone. The phylogeny is based on a recent genomics analysis [3]. Some lineages have been pruned for simplicity.
We introduce here two new putative nemertean toxin classes, U-nemertotoxin-1 and U-nemertotoxin-2, noting that the naming of toxins in general has yet to be standardized. It is likely that future research will identify additional sequences of known nemertean toxins and of new (untested and putative) candidates based on extended taxon sampling, so the ultimate naming convention for nemertean toxins should include evolutionary/systematic and biochemical aspects to cover these candidates, better integrating the phylogenetic context.

The Putative Venom Cocktail of A. lactifloreus and Its Mode of Action
Predatory nemerteans attack with fast strikes and return later to consume their incapacitated prey, which is consistent with the delivery of known paralytic neurotoxins [6,7]. Although we identified none of the known nemertean toxin families in A. lactifloreus, it is likely that U-nemertotoxin-1 and U-nemertotoxin-2 act in a similar manner. The predator attacks, pierces the prey with the stylet, introduces the cocktail of toxins, and waits for the prey to become immobile before consumption. We propose that the mixture of proteinases and other enzymes supports the activity of the neurotoxins and also begins the process of pre-digestion, as reported for many other venomous taxa, including assassin bugs, robber flies, and remipedes [24,41,42]. Pre-digestion is necessary because A. lactifloreus belongs to the order Monostilifera (nemerteans with only one stylet), which feeds by sucking liquefied prey tissues through the joint opening of the mouth and proboscis pore.
Several of the proteins expressed in the proboscis of A. lactifloreus are also secreted in the mucus. We speculate that proteins secreted in the mucus facilitate predation and the paralysis of prey by exacerbating the wound so that neurotoxic components reach their target more rapidly, a mechanism deployed by some amphibians [43]. The nemertean proboscis evolved as a venom apparatus by recruiting toxic proteins from the epidermal glands into proboscis-specific cells, supporting a hypothesis in which the proboscis outer epithelium is developmentally analogous to an inverted part of the epidermis. The stylet apparatus (one stylet in A. lactifloreus but multiple stylets in some other nemerteans) enables hoplonemerteans to mechanically rupture the prey's skin and deliver the venom cocktail more efficiently. We therefore argue that hoplonemerteans possess a classic venom system and venom apparatus that allows them to penetrate their prey and inject a toxic cocktail [16,17].

General Use of Toxin Proteins and Their Mode of Action in Nemerteans
The toxins produced by nemerteans occupy a grey area in terms of definition, because some (predatory) species deploy them as venoms whereas others (scavengers) use them solely for passive defense and pre-digestion. One example of the latter is P. corrugatus, which uses its secretions to melt its way into prey flesh and gain access to internal tissues, facilitated by secreted proteinases and other enzymes (Figure 5A). In more recent evolutionary history, the proteins in the mucus (and possibly in the proboscis) have gained the ability to paralyze and kill prey enveloped by predatory nemertean species. In such cases, the secreted proteinases and other enzymes are initially used to rupture the prey and promote the activity of the toxins, at the same time gaining access to the body cavity for pre-digestion before ingestion, similar to the role in scavenging species (Figure 5B). The efficiency of these enzymes makes them attractive targets for applied research. Hoplonemerteans mastered this predatory strategy by evolving a piercing structure to deploy their venom cocktail more efficiently, which might be accompanied by different toxins that evolved in parallel with the stylet apparatus.

Conclusions
Nemertean secreted protein toxins have received little attention from the research community. We carried out the first proteotranscriptomic analysis of a species from the hoplonemertean lineage, leading to a model of toxin evolution coincident with the transition of nemerteans from scavengers to predators. A more comprehensive understanding of nemertean toxins will require the comparison of diverse species representing the entire phylum, studies that consider the role of nemertean mucus as a mechanically protective, toxin-carrying matrix, and an analysis of the proboscis as a structure for defense and predation. Of particular interest are spatial differences in the expression and secretion of proteins by the epidermis and proboscis. Hoplonemerteans provide an ideal model to characterize the transition from defensive toxins to predatory toxins in combination with the evolutionary adaptation of a distinct venom apparatus. The venom secreted by nemerteans also offers a potentially valuable source of novel enzymes and antimicrobial proteins.

Collection and Preparation of A. lactifloreus Specimens
Four A. lactifloreus specimens were collected in August 2019 from the tidal zone of the Wadden Sea near the Wadden Sea Station of the Alfred-Wegener Institute in List on the island of Sylt, Germany. For proteomics analysis, each specimen was briefly washed in sterile phosphate-buffered saline (PBS) and agitated in a Petri dish with sterile salt water to induce eversion of the proboscis. This was transferred to a second Petri dish and scraped with a modified pipette tip to collect secreted proteins (PrS). The epidermis of each specimen was also scraped (EpS). Three minutes later, the mucus was scraped from the entire body (MuS). Each sample of secreted protein was separately deposited in UltraPure distilled water (Thermo Fisher Scientific, Waltham, MA, USA) and the corresponding samples from all four individuals were pooled and stored at −80 °C for further analysis (Figure 6). The proboscis (including the anterior and posterior parts, and stylet apparatus) was dissected in sterile PBS and immediately macerated and stored in RNAlater (Thermo Fisher Scientific) at −20 °C for transcriptome sequencing.
[Figure caption, continued] … as A. lactifloreus, uses its venom apparatus to overpower an isopod crustacean with its prey-piercing stylet and toxins expressed in the proboscis.

RNA Isolation, Library Preparation and Illumina Sequencing
RNA extraction and sequencing were outsourced to Macrogen, Seoul, Korea. After total RNA extraction using a low-input protocol, RNA quality and integrity were tested using a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). The cDNA library was constructed using the TruSeq Stranded mRNA LT sample kit (Illumina, San Diego, CA, USA) for sequencing on an Illumina HiSeq

Transcriptome Assembly, ORF Prediction and Identification of Venom Proteins
The raw sequencing reads were processed using the Animal Venomics Group in-house assembly and annotation pipeline. The reads were inspected using FastQC v0.11.7 [44] and trimmed using Trimmomatic v0.38 [45] with standard settings, except for the parameters SLIDINGWINDOW:4:30 and MINLEN:70. Additionally, a manually curated adapter file, which includes all possible sequencing adapters, was used instead of the provided adapter sequences (Additional File 6). All trimmed reads were then assembled in Trinity v2.8.4 [21,22] with standard settings, except for a modified minimum contig length of 70 bp. Expression levels were quantified with Kallisto v0.46.1 [46] using standard settings. In parallel with the proboscis transcriptome of A. lactifloreus, the published proboscis transcriptome of N. geniculatus (SRR5811996) was processed using the same settings. The resulting assembly files are provided in Additional Files 7 and 8.
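The effect of the two non-default trimming parameters can be illustrated with a minimal Python sketch. It is a simplified reading of a sliding-window quality trim (window of 4 bases, mean Phred quality of at least 30) followed by a minimum-length filter of 70 bases; it is not Trimmomatic's actual implementation, and the function names and example quality values are hypothetical.

```python
def sliding_window_trim(quals, window=4, min_avg=30):
    """Return the number of bases kept from the 5' end of a read.

    Simplified illustration: scan windows from the 5' end and cut the read
    at the start of the first window whose mean quality is below min_avg.
    """
    n = len(quals)
    if n < window:
        # Assumption for this sketch: judge a very short read as one window.
        return n if n and sum(quals) / n >= min_avg else 0
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) < min_avg * window:
            return start
    return n


def passes_filter(quals, window=4, min_avg=30, min_len=70):
    """True if the trimmed read is still at least min_len bases long."""
    return sliding_window_trim(quals, window, min_avg) >= min_len


# Hypothetical reads: high-quality bases followed by a low-quality tail.
good_read = [40] * 80 + [10] * 20
short_read = [40] * 60 + [10] * 40
print(sliding_window_trim(good_read), passes_filter(good_read))
print(sliding_window_trim(short_read), passes_filter(short_read))
```

With a window of 4 and a threshold of 30, a read is cut as soon as two of four consecutive bases drop to low quality, and reads shortened below 70 bases are discarded.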
For all assembled transcripts, the ORFs were predicted with TransDecoder v5.0.2 [21] and annotated at the amino acid level by running InterProScan v5.27.6 [26] and BLASTX searches against the NCBI NR database (e-value ≤ 10⁻⁶). All ORFs predicted by TransDecoder (Additional Files 9 and 10) were analyzed to identify possible toxin peptides or proteins by screening against four specific protein databases using BLASTP (best match only, e-value ≤ 10⁻⁶): SwissProt with 561,690 entries, ToxProt with 7133 entries, NemerteanToxins (a manually curated, in-house nemertean-specific database) with 64 described toxins/venom proteins from ribbon worms, and the antimicrobial peptide database APD3 [40] with 2338 activity-tested antimicrobial peptides. All database files used for BLAST searches are provided as Additional Files 1 and 11-13 (all databases were accessed and downloaded for local searches on 15 March 2020). All BLASTP search results are provided in Additional Files 14-21.
The toxin candidate sequences identified using BLASTP were subsequently aligned, using the MAFFT L-INS-i algorithm [47], with known toxin sequences and, where available, with non-toxin variants to exclude false-positive matches. All alignments of putative toxin protein classes are provided in Additional Files 22 and 23. All relevant information concerning the transcriptome data is accessible via the NCBI BioProject PRJNA636261 and the SRA entry SRR11906528.
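The "best match only, e-value ≤ 10⁻⁶" screening criterion can be sketched as follows. The sketch assumes that BLASTP results are available in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, e-value in column 11, bit score in column 12); the text does not state which output format was used, and the sequence identifiers in the example are hypothetical.

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-6):
    """Keep, for each query, the highest-scoring hit with e-value <= max_evalue.

    Assumes 12-column tabular BLAST output; returns a dict mapping
    query ID -> (subject ID, e-value, bit score).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits for two predicted ORFs against a toxin database.
example = "\n".join([
    "orf_1\ttoxin_A\t45.0\t100\t55\t0\t1\t100\t1\t100\t1e-20\t95.0",
    "orf_1\ttoxin_B\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-35\t140.0",
    "orf_2\ttoxin_C\t30.0\t50\t35\t0\t1\t50\t1\t50\t2e-05\t40.0",
])
print(best_hits(example))
```

In this example, orf_1 retains only its stronger match (toxin_B), whereas orf_2 is dropped because its only hit exceeds the e-value cutoff.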

Peptide and Protein Identification
Protein (10 µg) was dissolved in 25 mM ammonium bicarbonate with 0.1 nM ProteaseMAX. Cysteine residues were reduced with 5 mM dithiothreitol for 30 min at 50 °C and then modified for 30 min at 24 °C with 10 mM iodoacetamide. The reaction was quenched with excess cysteine before adding 0.025 ng/µL trypsin in a total volume of 100 µL. After incubation at 37 °C for 16 h, the digestion was stopped by adding trifluoroacetic acid to a final concentration of 1%. The sample was purified with a C18-ZipTip (Merck-Millipore, Darmstadt, Germany), then dried under vacuum and re-dissolved in 10 µL 0.1% trifluoroacetic acid.
For analysis, 1 µg of the sample was loaded onto a 50 cm µPAC C18 column (PharmaFluidics, Ghent, Belgium) in 0.1% formic acid at 35 °C. Peptides were eluted in a 3-44% linear gradient of acetonitrile over 240 min, followed by a wash with 72% acetonitrile, at a constant flow rate of 300 nL/min using an UltiMate 3000 RSLCnano device (Thermo Fisher Scientific, Waltham, MA, USA). Eluted samples were injected via an Advion TriVersa NanoMate (Advion BioSciences, Harlow, UK) with a spray voltage of 1.5 kV and a source temperature of 250 °C into an Orbitrap Eclipse Tribrid MS (Thermo Fisher Scientific, Waltham, MA, USA) in positive ionization mode. Full mass spectrum scans were acquired every 3 s over a mass range of m/z 375-1500 with a resolution of 120,000 and automatic gain control set to standard with a maximum injection time of 50 ms, applying data-dependent acquisition mode. The most intense ions (charge state 2-7) above a threshold ion count of 50,000 were selected in each cycle with an isolation window of 1.6 m/z for higher-energy collisional dissociation at a normalized collision energy of 30%. Fragment ion spectra were acquired in the linear ion trap with the scan rate set to rapid, a normal mass range, and a maximum injection time of 100 ms. After fragmentation, selected precursor ions were excluded for 15 s. Data were acquired using Xcalibur v4.3.73.11 (Thermo Fisher Scientific, Waltham, MA, USA) and analyzed with Proteome Discoverer v2.4.0.305 (Thermo Fisher Scientific, Waltham, MA, USA).

Matching Proteome and Transcriptome Data
Mascot v2.6.2 was used to search against the proboscis transcriptome using a precursor ion mass tolerance of 10 ppm. Cysteine carbamidomethylation was considered as a fixed modification, the oxidation of methionine was considered as a variable modification, and one missed cleavage site was allowed. Fragment ion mass tolerance was set to 0.8 Da for the linear ion trap MS2 detection. The false discovery rate for peptide identification was limited to 0.01 using a decoy database. For subsequent analysis, only matches with a Mascot score > 24 and at least two verified peptides were included (Table S2). Candidates fitting these criteria were de-grouped to find all similar sequences or isoforms in the transcriptome that corresponded to groups within Mascot. The expression levels in TPM were combined for transcripts with several sequences representing one group (Table S2). The proteome raw data are available via the ProteomeXchange server with the accession numbers PXD019867 and PXD019872.
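The filtering and TPM-combination steps described above can be sketched in Python under stated assumptions: each record carries a Mascot group label, a transcript identifier, a Mascot score, the number of verified peptides and a TPM value, and the score and peptide thresholds are applied per record. The record layout, the identifiers and the numbers are hypothetical and do not reproduce Table S2.

```python
from dataclasses import dataclass


@dataclass
class Match:
    group: str        # Mascot protein group (hypothetical label)
    transcript: str   # transcript or isoform ID from the assembly
    score: float      # Mascot score
    peptides: int     # number of verified peptides
    tpm: float        # expression level of the transcript


def filter_and_combine(matches, min_score=24, min_peptides=2):
    """Keep matches with score > min_score and >= min_peptides peptides,
    then sum the TPM values of all retained transcripts in each group."""
    totals = {}
    for m in matches:
        if m.score > min_score and m.peptides >= min_peptides:
            totals[m.group] = totals.get(m.group, 0.0) + m.tpm
    return totals


# Hypothetical example: two isoforms of one group pass; the rest are rejected.
matches = [
    Match("group_1", "transcript_a", 85.0, 4, 120.5),
    Match("group_1", "transcript_b", 60.0, 2, 30.25),
    Match("group_2", "transcript_c", 24.0, 3, 500.0),   # score not above 24
    Match("group_3", "transcript_d", 90.0, 1, 75.0),    # only one peptide
]
print(filter_and_combine(matches))
```

Summing TPM values across isoforms gives a single expression estimate per protein group, which matches the combined values described in the text only to the extent that the stated assumptions hold.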