Investigation of Peptide Toxin Diversity in Ribbon Worms (Nemertea) Using a Transcriptomic Approach

Nemertea is a phylum of nonsegmented worms (supraphylum: Spiralia), also known as ribbon worms. The members of this phylum contain various toxins, including peptide toxins. Here, we provide a transcriptomic analysis of peptide toxins in 14 nemertean species, including Cephalothrix cf. simula, which was sequenced in the current study. The summarized data show that the number of toxin transcripts in the studied nemerteans varied from 12 to 82. The most represented groups of toxins were enzymes and ion channel inhibitors, which, in total, reached a proportion of 72% in some species, and the least represented were pore-forming toxins and neurotoxins, the total proportion of which did not exceed 18%. The study revealed that nemerteans possess a much greater variety of toxins than previously thought and showed that these animals are a promising object for the investigation of venom diversity and evolution, and in the search for new peptide toxins.


Introduction
Animal poisons and venoms are potential sources of new toxic proteins and peptides, representatives of which have already found applications in many areas as therapeutic agents and physiological tools [1][2][3][4]. A large number of these proteins and peptides have been found in marine organisms, and the majority of studies to date have been focused on toxins from cone snails, sea anemones, fish, jellyfish, sea stars, hydras, sea urchins, sea hares, etc. [5,6]. In recent decades, due to the increasing popularity of next-generation sequencing (NGS) technologies, more results from the genome/transcriptome sequencing of neglected animals have appeared in free databases, which makes it possible to use them as a source of new valuable biological information. This kind of study elucidates the ecological aspects of animals' adaptation and evolution. As a rule, toxic peptides are part of multicomponent mixtures, i.e., venoms or poisons, which have been studied in many animals such as snakes [7], scorpions [8], spiders [9], sea anemones [10], and jellyfishes [11]. However, the study of less popular animals is no less important and allows us to not only find new bioactive peptides, but also to study the evolution of poison and to better understand adaptation processes by conducting comparative analyses of the compositions of animal toxins [12].
One of these little-studied but promising sources of toxins is the marine worms of the Nemertea phylum consisting of more than 1300 species and subdivided into three phylogenetic groups: Palaeonemertea, Pilidiophora, and Hoplonemertea [13,14]. Nemerteans are a rich source of various toxic compounds [15]. Despite the absence of specialized glands for venom and poison secretion, they contain a multicomponent cocktail of toxins that provides them with protection and allows efficient predation [15]. To date, there have been only four works devoted to the screening of toxins using these approaches in nemerteans. The first toxin screening work was carried out in 2014 by Whelan and colleagues on the transcriptomes of nine nemertean species; the number of putative toxin genes found varied from three to seven in different species [16]. More recent works have demonstrated a greater number of toxin genes in nemerteans. A study of the genome of the heteronemertean species N. geniculatus revealed 32 putative toxin genes [17]. In 2020, a proteo-transcriptomic analysis of the hoplonemertean Am. lactifloreus was carried out by von Reumont and colleagues, and resulted in the identification of 26 peptides that potentially play a role in prey capture, immobilization, and predigestion [18]. The most recent study (carried out in 2022 by Verdes and colleagues), devoted to the proteo-transcriptomic analysis of A. valida venom, demonstrated the presence of 85 putative toxins, classified as potentially predatory, defensive, or having dual functions [19]. Based on the results of Luo and colleagues [17], von Reumont and colleagues [18], and Verdes and colleagues [19], it can be assumed that nemertean transcriptomes may contain a much greater variety of toxin-like transcripts than what was shown by Whelan and colleagues [16].
To date, most of the available studies provided information on toxins in a limited number of ribbon worms species and did not reveal the general principles of toxin content in different nemertean classes. The present research was devoted to revealing the toxin transcripts in 14 species of nemerteans belonging to three classes (Palaeonemertea, Pilidiophora, and Hoplonemertea) using the transcriptomic approach, including Cephalothrix cf. simula, which was sequenced in the current study. The data obtained allowed the discovery of general trends in the diversity of peptide toxins in nemerteans.

Transcriptomes Assembly
The transcriptomes of 12 nemertean species (A. lactifloreus, Malacobdella grossa, Paranemertes peregrina, Carinoma hamanako, Cephalothrix hongkongiensis, Tubulanus polymorphus, Baseodiscus unicolor, Hubrechtella ijimai, Lineus longissimus, Lineus ruber, Lineus sanguineus, and Riseriellus occultus) were assembled and annotated using reads, which were downloaded from the SRA. The previously published N. geniculatus transcriptome was annotated [17], and the assembly and annotation of the Ce. cf. simula sequences obtained in the current work were carried out. Table 1 presents data processing statistics. The quality of the final assembly was assessed by BUSCO and ranged from 63.9% (P. peregrina) to 93.7% (L. longissimus). The largest number of nonredundant annotated unique transcripts was obtained for the Ce. cf. simula transcriptome, which was assembled using a hybrid approach using reads from two platforms, Illumina and MinION Oxford Nanopore. This resulted in 25,895 open reading frames (ORFs) with unique BLAST hits, with an average number of 11,564 ORFs with unique BLAST hits in the remaining 13 nemertean species. ORFs with unique BLAST hits were obtained by removing redundant contigs with identical annotations according to the method proposed by Ono and colleagues [20]. This approach allowed reduction of transcriptome redundancy by 17.8% (M. grossa) to 67.6% (Ce. cf. simula).

Putative Toxin Transcripts in Transcriptomes
To identify toxins, transcripts were annotated using the Tox-Prot and SWISS-PROT/UniProt (E-value 10 −6 ) databases. Two of the resulting annotations were compared, and the annotation with the highest E-value was considered significant (Supplementary File S2).
A total of 14 nemerteans belonging to three classes (Palaeonemertea, Pilidiophora, and Hoplonemertea) were shown to possess from 12-16 (in P. peregrina and M. grossa, respectively) to 76-82 (in N. geniculatus and Ce. cf. simula, respectively) toxin transcripts ( Figure 1). In palaeonemertean species, on average, 49 transcripts were found, from 32 in Ca. hamanako to 82 in Ce. cf. simula. The average number of toxins in pilidiophorans was 47, ranging from 33 transcripts in B. unicolor and L. ruber to 76 transcripts in N. geniculatus ( Figure 1). The number of toxin transcripts in three annotated hoplonemerteans were 12, 16, and 30 in P. peregrina, M. grossa, and A. lactifloreus, respectively, whereas the average number was about 19 transcripts.

Putative Toxin Transcripts in Transcriptomes
To identify toxins, transcripts were annotated using the Tox-Prot and SWISS-PROT/UniProt (E-value 10 −6 ) databases. Two of the resulting annotations wer compared, and the annotation with the highest E-value was considered significant (Sup plementary File 2).
A total of 14 nemerteans belonging to three classes (Palaeonemertea, Pilidiophora and Hoplonemertea) were shown to possess from 12 -16 (in P. peregrina and M. grossa respectively) to 76-82 (in N. geniculatus and Ce. cf. simula, respectively) toxin transcript ( Figure 1). In palaeonemertean species, on average, 49 transcripts were found, from 32 in Ca. hamanako to 82 in Ce. cf. simula. The average number of toxins in pilidiophorans wa 47, ranging from 33 transcripts in B. unicolor and L. ruber to 76 transcripts in N. geniculatu ( Figure 1). The number of toxin transcripts in three annotated hoplonemerteans were 12 16, and 30 in P. peregrina, M. grossa, and A. lactifloreus, respectively, whereas the averag number was about 19 transcripts. Toxin transcripts were divided into seven groups and assigned to 86 families (Figur 2); up to 14 transcripts belonged to each family. The toxins families were determined using UniProtKB/SWISS-PROT [24] Family & Domains section that provides information about the sequence similarity with other proteins. The most represented group was en zymes, which included 32 families, accounting for 16.7-18.8% (P. peregrina and M. grossa respectively) to 36.5-47.8% (L. sanguineus and T. polymorphus, respectively) of all toxin transcripts. The enzyme group was followed by a group of ion channel inhibitors (1 families), where the toxin transcripts occupied from 15.2-16.3% (B. unicolor and R. occul tus, respectively) to 30.3-39.4% (L. ruber and Ca. hamanako, respectively) of all toxin transcripts. These two groups, as well as the group of other toxin candidates, were pre sent in the transcriptomes of all 14 species of ribbon worms. The least represented group was neurotoxins; their toxin transcripts were found in 6 out of 14 nemerteans and only in the Palaeonemertea and Pilidiphora classes. The largest number of neurotoxin familie (three families) was found in L. sanguineus; and five other species bore one family each Most of the studied species were shown to possess unique toxin families belonging to al toxin groups, except pore-forming toxins. The largest number of unique toxins, fou families, was found in L. longissimus, L. sanguineus, and N. geniculatus; no unique toxin were found in Ca. hamanako and M. grossa ( Figure 2). Toxin transcripts were divided into seven groups and assigned to 86 families ( Figure 2); up to 14 transcripts belonged to each family. The toxins families were determined using UniProtKB/SWISS-PROT [24] Family & Domains section that provides information about the sequence similarity with other proteins. The most represented group was enzymes, which included 32 families, accounting for 16.7-18.8% (P. peregrina and M. grossa, respectively) to 36.5-47.8% (L. sanguineus and T. polymorphus, respectively) of all toxin transcripts. The enzyme group was followed by a group of ion channel inhibitors (10 families), where the toxin transcripts occupied from 15.2-16.3% (B. unicolor and R. occultus, respectively) to 30.3-39.4% (L. ruber and Ca. hamanako, respectively) of all toxin transcripts. These two groups, as well as the group of other toxin candidates, were present in the transcriptomes of all 14 species of ribbon worms. The least represented group was neurotoxins; their toxin transcripts were found in 6 out of 14 nemerteans and only in the Palaeonemertea and Pilidiphora classes. The largest number of neurotoxin families (three families) was found in L. sanguineus; and five other species bore one family each. Most of the studied species were shown to possess unique toxin families belonging to all toxin groups, except pore-forming toxins. The largest number of unique toxins, four families, was found in L. longissimus, L. sanguineus, and N. geniculatus; no unique toxins were found in Ca. hamanako and M. grossa ( Figure 2).  The qualitative compositions of the toxins of the three nemertean classes-Palaeonemertea, Pilidiophora, and Hoplonemertea-had both similarities and differences. Figure 3 demonstrates a Venn diagram of the toxin families' overlaps between nemertean classes. All the classes had toxin families belonging to all the groups of toxins except neurotoxins, which were only found in palaeonemerteans (one family) and pilidiphorans (four families); toxin families of this group did not intersect between the two classes of nemerteans ( Table 2). The largest number of common toxin families between the two classes was found in Palaeonemertea and Pilidiophora species, most of which belonged to the enzyme group. Moreover, most of the unique toxin families of all three nemertean classes were also enzymes (from 37 to 50%). The qualitative compositions of the toxins of the three nemertean classes-Palaeonemertea, Pilidiophora, and Hoplonemertea-had both similarities and differences. Figure 3 demonstrates a Venn diagram of the toxin families' overlaps between nemertean classes. All the classes had toxin families belonging to all the groups of toxins except neurotoxins, which were only found in palaeonemerteans (one family) and pilidiphorans (four families); toxin families of this group did not intersect between the two classes of nemerteans ( Table 2). The largest number of common toxin families between the two classes was found in Palaeonemertea and Pilidiophora species, most of which belonged to the enzyme group. Moreover, most of the unique toxin families of all three nemertean classes were also enzymes (from 37 to 50%).

Assessment of Distance between Species Based on the Presence/Absence of Toxin Families
Based on the presence/absence of toxin families in 14 nemertean transcriptomes, principal coordinate analysis (PCoA) was performed using Jaccard distance, which expresses the distance between species (pseudo-F = 3.099, p = 0.0001, PERMANOVA, 10,000 permutations in each test) ( Figure 4). The results presented in the figure demonstrate that nemertean species were grouped according to the classes to which they belong.
In a recent study by von Reumont and colleagues [18], non-nemertean-specific to transcripts were also identified, including those that were plancitoxin-like, origina isolated from crown-of-thorns starfish, and those that were actitoxin-like, isolated fr sea anemones, named U-nemertotoxin-1 and U-nemertotoxin-2, respectively. Accord to their study, both toxins were typical for nemerteans representing Palaeonemertea, lidiophora, and Hoplonemertea. U-nemertotoxin-1 transcripts were found in the A lactifloreus and N. geniculatus proboscis transcriptomes and the full-body transcriptom of seven species from all three nemertean classes: Ce. hongkongiensis, Cephalothrix linea Cerebratulus marginatus, T. polymorphus, M. grossa, P. peregrina, L. lacteus, L. longissim and L. ruber. The transcripts of U-nemertotoxin-2 were found in the proboscis transcr tomes of Am. lactifloreus and N. geniculatus. In the current study, transcripts correspo ing to U-nemertotoxin-1 were found in most nemertean species, except for P. peregr and H. ijimai. Transcripts presumably related to U-nemertotoxin-2 were identified in out of 14 nemertean species, with the exceptions being L. sanguineus and H. ijimai.

Assessment of Distance between Species Based on the Presence/Absence of Toxin Families
Based on the presence/absence of toxin families in 14 nemertean transcriptom principal coordinate analysis (PCoA) was performed using Jaccard distance, which presses the distance between species (pseudo-F = 3.099, p = 0.0001, PERMANOVA, 10, permutations in each test) (Figure 4). The results presented in the figure demonstrate t nemertean species were grouped according to the classes to which they belong.

Preliminary Assessment of Toxin Transcripts Expression
The toxin transcripts abundance was quantified for preliminary assessment of to expression in all nemertean species using the Salmon tool (Tables S1-S14 in Supplem tary File 1) and expressed as a percentage; the transcripts with the highest expressi which accounted for 90% of the total expression of the species, were selected (Table For more accurate toxin gene expression levels, the same specimen-preparing conditi should be provided. In all the studied nemerteans, among the transcripts with the high expression, there were toxins from all seven groups, except for the hoplonemertean s cies, in which six groups were identified. For the palaeonemerteans, 90% of the total pression was from 10 to 22 toxin transcripts, and the average expression was the high in representatives of the neurotoxin group (up to 30.3% in Ce. hongkongiensis, from neurotoxin 20 family) and ion channel inhibitors (up to 33.4% in Ca. hamanako, from CRISP family) (Table 3). Between 5 and 23 toxin transcripts of pilidiophorans accoun

Preliminary Assessment of Toxin Transcripts Expression
The toxin transcripts abundance was quantified for preliminary assessment of toxin expression in all nemertean species using the Salmon tool (Tables S1-S14 in Supplementary File S1) and expressed as a percentage; the transcripts with the highest expression, which accounted for 90% of the total expression of the species, were selected (Table 3). For more accurate toxin gene expression levels, the same specimen-preparing conditions should be provided. In all the studied nemerteans, among the transcripts with the highest expression, there were toxins from all seven groups, except for the hoplonemertean species, in which six groups were identified. For the palaeonemerteans, 90% of the total expression was from 10 to 22 toxin transcripts, and the average expression was the highest in representatives of the neurotoxin group (up to 30.3% in Ce. hongkongiensis, from the neurotoxin 20 family) and ion channel inhibitors (up to 33.4% in Ca. hamanako, from the CRISP family) ( Table 3). Between 5 and 23 toxin transcripts of pilidiophorans accounted for 90% of the expression, and the highest average expression was shown by representatives of the neurotoxin group (up to 35.3% in L. longissimus, from the neurotoxin 02 (plectoxin) family. 02 (plectoxin) subfamily) and proteinase inhibitors (up to 40.2% in L. sanguineus, from the venom Kunitz-type family). Among the Hoplonemertea members, between five and eight transcripts from 90% of the most expressed toxins were identified. The most expressed were other toxin candidates (up to 47.8% in A. lactifloreus, from the TCTP family) and proteinase inhibitors (up to 56.3% in P. peregrina, from the venom Kunitz-type family sea anemone type 2 potassium channel toxin subfamily) ( Table 3). Despite the grouping of nemertean species according to their classes demonstrated by PCoA (Figure 4), the abundance of the toxin transcripts in closely related species varied (Table 4). This was also true for major transcripts, which made up 50% of the total toxin expression. Three species from the Lineus genus-L. longissimus, L. ruber, and L. sanguineus-with three, two, and two major toxins, respectively, showed more differences than similarities; the major toxins comprised the Kunitz-type family (proteinase inhibitor group and ion channel inhibitors) and did not contain pore-forming toxins, enzymes and hemostasis-impairing toxins. Two Cephalothrix species possessed three major toxins each and were similar only in terms of the presence of MACPF domain-containing toxins and the absence of hemostasis-impairing toxins (Table 4).   Insulin-like growth factor-binding protein-related protein 1 Insulin-like growth factor-binding protein-related protein 1 12.5

Discussion
Nemerteans possess various toxins with defensive and offensive functions. These include pyridine toxins (anabaseine, nemertelline, 2,3'-bipyridyl, 3-methyl-2,3'-bipyridyl), tetrodotoxin and its analogues (TTXs), and various peptide toxins. According to current data, pyridine toxins are characteristic of hoplonemerteans [28]. The highest concentration and greatest variety of TTXs are specific to palaeonemerteans [29][30][31][32][33], although trace concentrations have been found in pilidiophorans and hoplonemerteans [32]. Peptide nemerteanspecific toxins have been identified in Pilidiophora class representatives [34], and transcripts of non-nemertean-specific toxins have been found in all nemertean classes [16]. Recently, due to the fast development of NGS techniques, the transcriptomic approach has become popular, leading to the complete and efficient identification of peptides and their expression evaluation, which permits the comparison of the mechanisms by which toxins are used in animals. In the current study, we reassembled and annotated the transcriptomes of Whelan and colleagues [16] (M. grossa, P. peregrina, T. polymorphus, Ce. hongkongiensis, L. longissimus, L. ruber, and Am. lactifloreus [12]). The transcriptomes of Ca. hamanako, B. unicolor, H. ijimai, L. sanguineus, and R. occultus were assembled and annotated from the reads deposited in the SRA (NCBI). The previously published transcriptome of N. geniculatus [17] was processed and annotated. In addition, in the present study, the Ce. cf. simula transcriptome was sequenced, assembled, and annotated for the first time. For all of these transcriptomes, the content of toxin transcripts was evaluated; as a result, a total of 588 toxin transcripts were identified, which were divided into 86 families and assigned to seven groups of toxins according to the annotations from the UniProtKB/Swiss-Prot and Tox-Prot databases (Tables S1-S14 in Supplementary File S1). These groups were neurotoxins (5 families), pore-forming toxins (5 families), enzymes (31 families), proteinase inhibitors (10 families), ion channel inhibitors (9 families), hemostasis-impairing toxins (4 families), and other toxin candidates (22 families).
The function of animals' toxic cocktails is reflected by the composition of toxins and their mechanism of action. The mixtures used to deter predators consist predominantly of compounds that induce an immediate reaction and interfere with fast-acting physiological processes such as nerve transmission. Consequently, many defensive poisons contain toxins that quickly cause paralysis by blocking neuromuscular receptors or acting on pain receptors, causing instant and intense pain [35]. At the same time, venoms used to subdue prey are more diverse in the composition and physiological effects of their toxins [36]. In representatives of all nemertean classes, a mixture of toxins with different activities have been identified-Palaeonemertea and Pilidiophora species contained all seven groups of toxins, and Hoplonemertea species contained six groups; no neurotoxins were found in the latter. Presumably, some of them can play the role of a repellent agent to protect against predators, and others can be used during hunting as immobilizing agents or digestive enzymes. To specify the peptide toxins role for nemerteans, more detailed investigation is necessary to carry out. One of the directions is proteotranscriptomic differential gene expression analyses. To date, there are two articles, devoted to investigation of peptides in mucus, covered the nemertean body and proboscis, a specific weapon organ, demonstrating the characteristics of toxins function based on their expression patterns and proteomic distribution [18,19]. According to Verdes with colleagues, proteins of Antarctonemertes valida with insulin-like growth factor-binding domain (identified in the current research in R. occultus, A. lactifloreus, Ca. hamanako, Ce. hongkongiensis, Ce. cf. simula, T. polymorphus, B. unicolor, H. ijimai, L. longissimus, L. ruber, L. sanguineus, N. geniculatus) and galactose-binding domain (identified in the current research in R. occultus) (Supplementary File S1) were detected in proboscis and were suggested to be a response for predation; protein with Kunitz domain, identified in all nemerteans studied in the current research except for Ce. hongkongiensis (Supplementary File S1), was detected in whole specimen, mucus and proboscis, and presumably had dual functions-predatory and defensive [19]. The research of von Reumont with colleagues provided proteotranscriptomic analysis of Am. lactifloreus mucus and found out the secretion of protein toxins, also identified in the  (Supplementary File S1). Presumably, identified toxins secreted in the mucus could play a defensive role and contribute to predation and the paralysis of prey by facilitating the action of other components of a toxic cocktail [18]. Identified in An. valida and Am. Lactifloreus, toxic peptides with predatory and/or defensive functions were detected in up to 13 nemertean species, studied here, and the common role of peptides for these species may be assumed.
Based on the presence or absence of toxin families, we analyzed the distance between 14 nemertean species for the first time, demonstrating the grouping of species within classes ( Figure 4). The largest number of common toxin families between the two classes was found in Palaeonemertea and Pilidiophora species; the toxins families' profile of Hoplonemertea species was the most unique ( Figure 4, Table 2). The variety of toxin composition of nemertean classes may be caused by their different feeding ecology, including diet preferences and hunting strategies. Thus, it was revealed that palaeonemerteans and pilidiophorans within classes and species individually are characterized by a wide range of potential types of prey: palaeonemerteans prefer nematodes, oligochaetes, polychaetes, and other nemerteans, and pilidiophorans prefer all of the above and additionally bivalves and crustaceans [37][38][39][40][41]. On the contrary, for the Hoplonemertea class, small range of victims is typical, most species prefer one systematic group, mainly crustaceans, as prey, and reject others [38,40,42,43]. Therefore, it could be assumed that the diversity of peptide toxins depends on diversity of potential prey, and since diet of hoplonemerteans is limited to one type of victim, they do not require a great variety of toxic agents. In the case of palaeonemerteans and pilidiophorans, a wide range of potential victims may result in a wide range of toxins, while each type of animal can be affected by a specific toxin. An assumption about the relation between toxin composition and diet preferences was put forth by Verdes and colleagues, due to the revealing of toxin specificity for different nemertean classes [19]. The hunting strategy of hoplonemerteans also increases the discrepancy with other nemertean classes: their proboscis is armed with a stylet that pierces the victim and directly injects the venom cocktail, while Palaeonemertea and Pilidiophora representatives wrap the proboscis around the object without piercing [44,45]. This aspect can also be associated with qualitative composition of toxins and may explain the absence of peptide neurotoxins in the representatives of hoplonemerteans-since the prey capture is accompanied by body piercing, its immobilization is not required.
Despite the similar qualitative composition of toxins in representatives of one class according to PCoA (Figure 4), their quantitative compositions variates greatly. The preliminary expression of toxin transcripts was measured, and it was found that the abundance of the same toxin families differed significantly even in closely related species (Table 4). Thus, in three representatives of Lineus genius, two common toxin groups were identified within major toxins between L. longissimus and L. sanguineus (neurotoxins, proteinase inhibitors), and L. longissimus and L. ruber (ion channel inhibitors) ( Table 4). This may indicate differences in diet preferences of each species: Lineus species live in the same area; however, the victims are distinguishing [37,38,40,46]. The abundance of major toxins in another closely related species-Ce. hongkongiensis and Ce. cf. simula-was similar for pore-forming toxins and varied significantly for neurotoxins, enzymes (presented in Ce. hongkongiensis) and proteinase inhibitors, ion channel inhibitors (presented in Ce. cf. simula) ( Table 4). The victim types, preferred by this nemertean species, have not been studied, therefore, correlation between diet and toxins composition could not be established. However, the variability in expression levels for the same toxin families can also result from different RNA preparation conditions: Ce. cf. simula RNA was extracted from the middle of the nemertean body and did not contain proboscis, and Ce. hongkongiensis RNA sample was obtained from three individuals and the tissue or body parts were not mentioned, there was no information about proboscis presence ( Table 1). The same situation was observed within hoplonemerteans, where all transcriptomes were obtained from dissimilar tissues (Table 1), which correlates with the absence of major toxins overlapping between them (Table 4). Nemerteans toxins are thought to be secreted by glandular cells located in the epidermis of the integument for potential use against predators, and by cells located in the proboscis epidermis to contribute to prey capture [15]. Depending on toxins' functions, their expression levels in these organs can be different [19]. The provided toxin transcripts abundancy estimation was preliminary and needs to be evaluated using the same tissue type and library preparation conditions.

Conclusions
Resulting from the transcriptomic analysis, high diversity of toxins and general trends in the distribution of peptide toxins within Nemertea phylum were revealed. The principal coordinate analysis of the distance between 14 nemertean species based on the presence/absence of toxin families in transcriptomes demonstrated that nemertean species were grouped according to the classes to which they belong-Palaeonemertea, Pilidiophora, and Hoplonemertea. The qualitative comparison of the toxin composition of the three nemertean classes showed the toxin families' overlaps between nemertean classes; the largest number of common toxin families between the two classes was found in Palaeonemertea and Pilidiophora species. The correlation between number of common toxins with evolution distance between the classes is a question for further investigation. Palaeonemertea and Pilidiophora representatives, as the nemerteans with the largest number of toxin transcripts, may be the most promising objects for future studies. The results obtained point to the need for further study of the toxic composition of nemerteans, including proteo-transcriptomic analysis, in order to clarify the spectrum of toxins and study their expression and localization, as well as to search for new, unstudied toxic peptides. studied, therefore, correlation between diet and toxins composition could not be estab lished. However, the variability in expression levels for the same toxin families can also result from different RNA preparation conditions: Ce. cf. simula RNA was extracted from the middle of the nemertean body and did not contain proboscis, and Ce. hongkongiensi RNA sample was obtained from three individuals and the tissue or body parts were no mentioned, there was no information about proboscis presence ( Table 1). The same situ ation was observed within hoplonemerteans, where all transcriptomes were obtained from dissimilar tissues (Table 1), which correlates with the absence of major toxins over lapping between them (Table 4). Nemerteans toxins are thought to be secreted by glan dular cells located in the epidermis of the integument for potential use against predators and by cells located in the proboscis epidermis to contribute to prey capture [15]. De pending on toxins' functions, their expression levels in these organs can be different [19] The provided toxin transcripts abundancy estimation was preliminary and needs to be evaluated using the same tissue type and library preparation conditions.

Conclusions
Resulting from the transcriptomic analysis, high diversity of toxins and genera trends in the distribution of peptide toxins within Nemertea phylum were revealed. The principal coordinate analysis of the distance between 14 nemertean species based on the presence/absence of toxin families in transcriptomes demonstrated that nemertean spe cies were grouped according to the classes to which they belong-Palaeonemertea, Pi lidiophora, and Hoplonemertea. The qualitative comparison of the toxin composition o the three nemertean classes showed the toxin families' overlaps between nemertean classes; the largest number of common toxin families between the two classes was found in Palaeonemertea and Pilidiophora species. The correlation between number of common toxins with evolution distance between the classes is a question for further investigation Palaeonemertea and Pilidiophora representatives, as the nemerteans with the larges number of toxin transcripts, may be the most promising objects for future studies. The results obtained point to the need for further study of the toxic composition of nemerte ans, including proteo-transcriptomic analysis, in order to clarify the spectrum of toxin and study their expression and localization, as well as to search for new, unstudied toxi peptides.

Animal Collection
Specimens of the nemertean species Ce. cf. simula were collected in October 2019 and August 2020 from rhizoids of Saccharina sp. in the Spokoynaya Bay, Peter the Great Gulf and the Sea of Japan (42.7090° N, 133.1809° E) ( Figure 5). The species were kindly identi fied by Dr. Alexey V. Chernyshev from the Far Eastern Branch of the Russian Academy of Sciences, A.V. Zhirmunsky National Scientific Center of Marine Biology (Vladivostok Russia). Before RNA isolation, animals were kept in aerated aquaria at 17 °C.

RNA Isolation, Library Preparation, and Sequencing
The total RNA of Ce. cf. simula middle of body was isolated using TRIzol Reagent (Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer's protocol. RNA concentration and quality were assessed using a BioSpec-nano analyzer (Shimadzu, Kyoto, Japan) at 260/280 nm and 260/230 nm. The length of the fragments was estimated by electrophoresis in 1.2% agarose gel in TAE buffer stained with ethidium bromide with an RNA length marker, i.e., the RiboRuler High Range RNA Ladder (Thermo Fisher Scientific). mRNA was isolated from total RNA using the NEBNext Poly(A) mRNA Magnetic Isolation Module kit (New England Biolabs, Ipswich, MA, USA), followed by double-stranded cDNA synthesis using the Mint2 kit (Eurogen, Moscow, Russia). The result was evaluated by electrophoresis in 1.2% agarose gel in TAE buffer stained with ethidium bromide using the 1 kb DNA ladder DNA length marker (Thermo Fisher Scientific). The double-stranded cDNA was isolated from the reaction mixture with the Bioline ISOLATE II PCR and Gel Kit (Meridian Bioscience Inc., Cincinnati, OH, USA), following the manufacturer's protocol. The concentration of isolated cDNA was assessed using a Qubit fluorometer (Thermo Fisher Scientific). For sample enrichment with low-represented sequences, cDNA was normalized using the Trimmer-2 kit (Evrogen).
Normalized and double-stranded cDNA sequencing was performed on a MinION Mk1B Oxford Nanopore platform (Oxford Nanopore Technologies, Oxford, UK) using the Direct cDNA Sequencing Kit SQK-DCS109 (Oxford Nanopore Technologies), following the manufacturer's protocol.
To prepare the cDNA library for the Illumina platform (San Diego, CA, USA), normalized and non-normalized cDNA samples were amplified. The library for Illumina was prepared by the NEBNext Ultra II FS DNA Library Prep Kit for Illumina (New England Biolabs), using the protocol for >100 ng cDNA samples. Sequencing was outsourced to JSC "TsGRM "GENETIKO" (Moscow, Russia).
The Ce. cf. simula transcriptome assembly was carried out using a pipeline, developed in the current study, based on a hybrid method combining data obtained on the MinION Oxford Nanopore and Illumina platforms. The pipeline included the following steps. First, adapters and chimeric sequences were removed from Oxford Nanopore long reads using Porechop (v. 0.2.4) (https://github.com/rrwick/Porechop, accessed on 23 January 2022). Then, Illumina short reads were corrected using prepared Oxford Nanopore reads by minimap2 (v. 2.24-r1122) (ten cycles) [50] and racon (v. 1.4.13) [51]. Unigenes were obtained using the CAP3 program [52]. Illumina short reads were used separately for de novo transcriptome assembly in SPAdes v.3.15.3. Then, assembled from Illumina short reads, the transcriptome was combined with the corrected Oxford Nanopore long reads using minimap2 (v. 2.24-r1122) and racon (v. 1.4.13).
PCoA was performed using the QIIME 2 software package [59]. Initially, a matrix was formed, in which 86 families of toxins were correlated (0: absence; 1: presence) with the studied nemertean species. Based on this matrix, the Jaccard distance was calculated and the graph of beta diversity was plotted.