Proteomic Analysis of the Venom of Jellyfishes Rhopilema esculentum and Sanderia malayensis

Venomics, the study of biological venoms, could potentially provide a new source of therapeutic compounds, yet information on the venoms from marine organisms, including cnidarians (sea anemones, corals, and jellyfish), is limited. This study identified the putative toxins of two species of jellyfish: the edible jellyfish Rhopilema esculentum Kishinouye, 1891, also known as flame jellyfish, and the Amuska jellyfish Sanderia malayensis Goette, 1886. Utilizing nano-flow liquid chromatography tandem mass spectrometry (nLC–MS/MS), more than 3000 proteins were identified from the nematocysts of each of the two species. Forty and fifty-one putative toxins were identified in R. esculentum and S. malayensis, respectively, and were further classified into eight toxin families according to their predicted functions. Among the identified putative toxins, hemostasis-impairing toxins and proteases were the most dominant members (>60%). This study reports the first nematocyst proteomes of two jellyfish species of economic and environmental importance and expands the foundation for understanding cnidarian toxins.


Introduction
Phylum Cnidaria Hatschek, 1888, is one of the most ancient phyla and can be traced back to the Cambrian [1]. This phylum is divided into five classes: Anthozoa Ehrenberg, 1834 (corals and sea anemones), Cubozoa Werner, 1973 (box jellyfish), Hydrozoa Owen, 1843, Scyphozoa Götte, 1887 (true jellyfish), and Staurozoa Marques & Collins, 2004 (stalked jellyfish) [2]. Approximately 12,000 extant species are known from freshwater and marine habitats worldwide, from shallow coastal waters to the deep seas [3,4]. Cnidarians are characterized by the presence of nematocytes, a specialized cell type that assists in prey capture and predator deterrence. Each nematocyte houses a unique organelle called the nematocyst. This Golgi apparatus-derived organelle consists of a capsule containing an inverted tubule immersed in a mixture of venomous substances [5]. Upon mechanical or chemical stimulation, the tubule everts and injects the venomous mixture into the prey or predator. Jellyfish possess nematocytes mainly on the tentacles [6]. The composition of the venom inside the nematocyst varies between jellyfish species and may be a mixture of proteinaceous and non-proteinaceous toxins with hemolytic, neurotoxic, cytotoxic, and dermonecrotic activities [6]. Depending on the composition of the venom, the symptoms of jellyfish envenomation vary from species to species, ranging from mild local symptoms such as pruritus, pain,

Transcriptome and Protein Database Construction
Next-generation sequencing (NGS) was used to construct the R. esculentum appendage and S. malayensis tentacle transcriptomes, followed by gene model prediction using funannotate [15]. Based on the results of the transcriptomic analysis, the R. esculentum and S. malayensis protein databases were generated with 18,923 and 26,914 protein sequences, respectively. Gene Ontology (GO) analysis was performed with eggNOG-mapper [16], and annotations were assigned to the three primary GO domains: biological process (BP), cellular component (CC), and molecular function (MF). In total, 8786 (46.43%) R. esculentum proteins and 9138 (33.95%) S. malayensis proteins were successfully annotated with 143,350 and 153,009 GO terms, respectively (Table 1 and Figure 1A,B). In addition, 4187 and 4485 enzymes were identified in R. esculentum and S. malayensis, respectively, and classified according to their Enzyme Commission (EC) numbers. The proportional distributions of the enzymes were similar in both species and were dominated by transferases and hydrolases (Table 1 and Figure 2). Furthermore, to identify and annotate the putative toxins, the generated protein sequences were run against the UniProt animal venom proteins and toxins database (Tox-Prot) using BLASTp [17]. Protein sequences matching database entries with an e-value of <1.0 × 10⁻⁵ were used as input for ToxClassifier to exclude non-toxic homologs [18]. A total of 190 and 186 putative toxins were found in R. esculentum and S. malayensis, respectively. The toxin profiles of both species were similar, and the toxins could be classified into eight toxin families: hemostasis-impairing toxins, proteases, phospholipases, neurotoxins, cysteine-rich proteins, protease inhibitors, pore-forming toxins, and other toxins (Tables S1 and S2).

Identification of R. esculentum and S. malayensis Nematocyst Proteins by nano-LC-ESI MS/MS
The nematocysts from R. esculentum and S. malayensis were purified and their protein profiles were revealed by proteomic analysis. A total of 3083 and 3559 proteins were identified in R. esculentum and S. malayensis, respectively (Tables S3 and S4), including 40 R. esculentum and 51 S. malayensis putative toxins. According to their predicted biological function, these toxins were classified into the eight toxin families (Table 2, Tables S5 and S6). The proportional distributions of these toxin families were also similar between the two species (Figure 3). Hemostasis-impairing toxins comprised the most abundant class of identified toxins, representing 32.5% and 39.2% of the R. esculentum and S. malayensis toxins, respectively, most of which were homologous to ryncolin, a family of proteinaceous toxins originally described from Cerberus rynchops. This family also includes a variety of C-type lectins (i.e., C-type lectin, C-type lectin lectoxin-Lio2, and galactose-specific lectin nattectin). In addition, other toxins such as prothrombin activator, coagulation factor V, coagulation factor X, and snaclec bothrojaracin subunit homologs were also identified in the proteomes of both species.
Proteases comprised the second most predominant toxin family in both the R. esculentum and S. malayensis proteomes, accounting for 27.5% and 21.5% of the R. esculentum and S. malayensis putative toxins, respectively. Within this family, metalloproteinases were the most abundant. In the R. esculentum toxin proteome, three of the seven metalloproteinases found were homologous to zinc metalloproteinase-disintegrin proteins, two were neprilysin-1 homologs, and the remaining two were homologs of astacin-like metalloproteases. In the S. malayensis toxins, nine metalloproteinases were found, four of which were zinc metalloproteinase-disintegrin proteins. Four astacin-like metalloproteases and one neprilysin-1 were also detected.
Besides these two major classes of toxins, the R. esculentum and S. malayensis venoms also exhibited similar proportional distributions of other toxins. However, L-amino-acid oxidase, acetylcholinesterase, and venom acid phosphatase were found only in S. malayensis venom, and U-actitoxin-Avd3j and calglandulin were detected only in R. esculentum venom.

Functional Analysis of the Putative Toxins
A total of 282 and 408 GO terms were assigned to 20 (50%) R. esculentum and 27 (54.9%) S. malayensis putative toxins, respectively (Tables S3 and S4). The 10 most represented GO terms in the three domains of biological process (BP), cellular component (CC), and molecular function (MF) are shown in Figure 4. Furthermore, the presence of signal peptides was predicted by SignalP, showing that 52.5% and 29.4% of the R. esculentum and S. malayensis putative toxins, respectively, contain secretory signal peptides. Moreover, DeepLoc analysis indicated that 52.5% and 54.9% of the putative toxins, respectively, were located in the extracellular region. Taken together, 75% of the putative toxins in R. esculentum and 62.7% in S. malayensis were predicted to be extracellular proteins (by SignalP and/or DeepLoc). Additionally, 16 and 20 enzymes were identified among the R. esculentum and S. malayensis putative toxins, respectively. In both species, hydrolases (EC 3) were the predominant enzyme class (13 and 16 of the R. esculentum and S. malayensis putative toxins, respectively; Figure 5A,C), most of which were esterases (EC 3.1) and peptidases (EC 3.4) (Figure 5B,D).
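The combined extracellular figure above is a union: a putative toxin counts as extracellular if SignalP and/or DeepLoc flags it, so the combined share can exceed either individual share. A minimal Python sketch of that tally follows. The toxin identifiers and prediction sets are hypothetical toy values, not the study's data, and the function name is an assumption made for illustration.

```python
def extracellular_fraction(all_ids, signalp_hits, deeploc_hits):
    """Fraction of toxins flagged as extracellular by SignalP and/or DeepLoc."""
    all_ids = set(all_ids)
    # Union of the two predictors, restricted to the toxins under study.
    flagged = (set(signalp_hits) | set(deeploc_hits)) & all_ids
    return len(flagged) / len(all_ids)


# Hypothetical toy data: eight toxins, four flagged by each tool, two by both.
toxins = ["tox1", "tox2", "tox3", "tox4", "tox5", "tox6", "tox7", "tox8"]
signalp = {"tox1", "tox2", "tox3", "tox4"}   # has a predicted signal peptide
deeploc = {"tox3", "tox4", "tox5", "tox6"}   # predicted extracellular

combined = extracellular_fraction(toxins, signalp, deeploc)
print(f"{combined:.1%} predicted extracellular")
```

In this toy case each tool alone flags half of the toxins, while the union reaches three quarters, which mirrors how two 52.5% shares can combine to 75% when the predictions only partly overlap.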

InterProScan Analysis
The protein domains of the putative toxins were annotated by InterProScan. A total of 47 and 55 protein domains were assigned to the R. esculentum and S. malayensis putative toxins, respectively (Tables S7 and S8). The top ten most represented domains are shown in Figure 6. In general, the results agreed with the BLASTp-ToxClassifier results. Two domains related to hemostasis were found, namely, the fibrinogen alpha/beta/gamma chain C-terminal globular domain and the coagulation factor 5/8 C-terminal domain, the former being highly represented in both species. Although no single protease-related domain predominated, a variety of protease-related domains (peptidase M12B, peptidase M12A, astacin-like metallopeptidase domain, serine proteases/trypsin domain, and disintegrin domain) were found. Furthermore, the domains related to cysteine-rich proteins (CAP domain and SCP domain), the C-type lectin-like domain, and the phospholipase A2 domain were detected in both species. In addition, the ShKT domain and the Kunitz domain, two protein domains with high therapeutic potential, were also detected in the putative toxins of both species. A total of three and six ShKT domain-containing proteins were identified in the R. esculentum and S. malayensis putative toxins, respectively, most (seven out of nine) of which were protease-type toxins. In these ShKT domain-containing proteases, the ShKT domain is associated with either the trypsin domain or the peptidase_M12A domain (a metalloprotease domain). Although the sequence of the ShKT domain found in trypsins was slightly different from that of metalloproteases, both were characterized by the typical pattern of six cysteines, except the second ShKT domain of Sma_022066-T1 (Sma_022066-T1_M12_ShKT_2), which lacked the fifth cysteine residue of the motif (Figure 7A,B).
Furthermore, a total of five Kunitz domains were detected in four protease inhibitor-type putative toxins, with Sma_015170-T1 containing two domains. Most of the detected Kunitz domains displayed the conserved motif of C-8X-C-15X-C-4X-YGGC-12X-C-3X-C. The only exception was the Kunitz domain of Sma_021821-T1, which deviated from the conserved architecture by a Y31F substitution (Figure 7C).
Mar. Drugs 2020, 18, x FOR PEER REVIEW 12 of 19

Discussion
Animal venoms comprise a mixture of bioactive molecules that include different types of toxic proteins. Most of these proteinaceous toxins arose through duplication of genes encoding proteins with non-toxic physiological functions [19,20]. Because toxins therefore resemble their non-toxic homologs in sequence, it is difficult to distinguish between the two using BLAST alone. Consequently, most toxin-identification studies apply various manual filters to remove the non-toxic homologs, which can make the results difficult to verify. In this study, the protein sequences obtained from proteomic analysis were compared against the UniProt animal venom proteins and toxins database (Tox-Prot) using BLASTp. ToxClassifier, a machine learning model, was then used to differentiate toxins from other proteins having non-toxic physiological functions [18]. Using this approach, 190 and 186 putative toxins were predicted from the R. esculentum and S. malayensis transcriptomic data, respectively (Table 1). Furthermore, 40 R. esculentum and 51 S. malayensis putative toxins were identified at the protein level (Table 2).
Hemostasis-impairing toxins were the predominant toxin family in both the R. esculentum and S. malayensis venom proteomes. Most of the toxins in this family share sequence similarity with ryncolins, a group of hemostasis-impairing toxins originally described from C. rynchops [21]. Ryncolin genes [22], transcripts, and proteins [23][24][25][26] have also been found in other species of jellyfish, but their functions are not well characterized. C-type lectins, another major group of hemostasis-impairing toxins, are commonly found in the venoms of a wide variety of animals [27][28][29][30], including several jellyfish species: Pacific sea nettle Chrysaora fuscescens [31], lion's mane jellyfish Cyanea capillata, Nomura's jellyfish N. nomurai [26], and cannonball jellyfish Stomolophus meleagris [23]. C-type lectins are calcium-dependent carbohydrate-binding proteins that exhibit pro/anticoagulant and pro/antithrombotic activities. This type of toxin is also involved in pain and itch sensitization through Toll-like receptors [32]. Therefore, the C-type lectins in R. esculentum and S. malayensis venom might explain the pruritus and pain caused by the sting.
The eggNOG-mapper predictions indicated that proteases were the most abundant enzymes in the venoms of both species (Figure 5). Consistent with this prediction, 11 proteases were identified as putative toxins in each of the R. esculentum and S. malayensis venoms, making proteases the second most abundant class of toxins. In the venoms of both species, metalloproteases were the predominant proteases. Most were homologous to zinc metalloproteinase-disintegrin-like proteins and astacin-like metalloproteases, which have also been identified in the venom proteomes of other cnidarians, including lion's mane jellyfish C. capillata [26], sea wasp Chironex fleckeri [33], Pacific sea nettle C. fuscescens [31], ghost jellyfish Cyanea nozakii [34], Cyanea sp. [25], Nomura's jellyfish N. nomurai [24,26], cannonball jellyfish S. meleagris [23], and starlet sea anemone Nematostella vectensis [35]. It has been suggested that metalloproteases induce protease-mediated tissue damage by degrading the extracellular matrix, which eventually results in necrosis, edema, and hemorrhage [36]. Correspondingly, the skin and tissue necrosis caused by S. malayensis envenomation [9] is most likely associated with the highly represented metalloproteases in its venom. In addition, our results agree with another study, which demonstrated significant metalloprotease activity in R. esculentum venom [36].
Other than metalloproteases, several serine proteases and serine carboxypeptidases were also identified in the venom proteomes. The presence of proteases has also been reported in the venom of several jellyfish species [24][25][26]31,33]. The functional role of proteases in jellyfish venom is not well understood; however, based on observations in other venomous animals, we suggest that the proteases in the venom may be responsible for promoting the spreading and activation of other toxins [37,38].
In addition to these two predominant toxin families, other toxins were also identified. Phospholipases, for example, were identified in the venom proteomes of both species, all of which were phospholipases A2 (PLA2s), with two and four copies in R. esculentum and S. malayensis venom, respectively. PLA2s have also been found in the venom of various other jellyfish species [23][24][25][26]31,33,34,39,40], and PLA2 activity has been detected in the oral arm of R. esculentum [41]. PLA2s are considered hemolysins in jellyfish venom [42]; thus, we suspect that the presence of PLA2s in R. esculentum venom might be associated with its previously reported hemolytic activities [43]. More experiments are required to investigate whether S. malayensis venom also exhibits PLA2-induced hemolytic activity. In addition, homologs of reticulocalbin, a mediator of PLA2 toxins [44,45], were detected in the venom of both species. This finding implies that PLA2s play an important role in R. esculentum and S. malayensis envenomation.
Protease inhibitors, another group of toxins commonly found in jellyfish venom [24][25][26]31,34,46], were also found in the proteomes of R. esculentum and S. malayensis venom. In our proteomic study, four Kunitz-type serine protease inhibitors were found, two in R. esculentum and two in S. malayensis venom, characterized by the conserved motif of C-8X-C-15X-C-4X-YGGC-12X-C-3X-C (Figure 7C). Kunitz-type serine protease inhibitors inhibit proteases, which causes inflammation and interferes with blood coagulation [47]. It has also been suggested that protease inhibitors play an auxiliary role in envenomation by maintaining the integrity of the proteinaceous toxins [42]. Moreover, in sea anemones and scorpions, Kunitz-type serine protease inhibitors exhibit an ion channel-blocking function, which might result in paralysis [47]. Further investigation is required to clarify the functions of the protease inhibitors in jellyfish venom.
Other toxins rarely reported in jellyfish venom were also identified in this study. A venom acid phosphatase was identified in the S. malayensis venom proteome. This type of toxin, commonly found in honeybees and weever fish, can induce allergic reactions through the induction of histamine release [48]. To the best of our knowledge, venom acid phosphatase has not previously been identified in a jellyfish venom proteome. Furthermore, some lethal pore-forming toxins were found in the venom of both species. Three stonustoxins (SNTXs) were identified: one SNTX subunit beta homolog (a hemolysin from estuarine stonefish) in R. esculentum venom and two neoverrucotoxin subunit beta homologs (a hemolysin from reef stonefish) in S. malayensis venom. SNTXs show hemolytic and hypotensive activity in stonefish venom [49,50]; however, the function of these toxins in jellyfish venom remains to be tested.
In terms of the discovery of novel therapeutic compounds, our data suggest that R. esculentum and S. malayensis venoms might be a potential source for drug screening. Our InterProScan results identified two protein domains with potential therapeutic value, namely, the ShKT domain and the Kunitz domain. The ShKT domain is one of the best-studied protein domains with high therapeutic potential. ShKT is a potent potassium ion channel blocker with high affinity for KV1.3 channels [51]; this channel is essential for effector memory T (TEM) cell activation, which is a hallmark of autoimmune diseases [52,53]. Therefore, KV1.3 is considered a promising target for autoimmune disease treatment. Several modified ShKT peptides with increased selectivity for KV1.3 have been synthesized as potential drugs for treating autoimmune diseases [54][55][56][57][58]. Among them, dalazatide has completed a phase 1b clinical trial [52]. Moreover, Kunitz domain-containing protease inhibitors were also found in the venoms of both species and could be considered potential sources of therapeutic compounds. A previous study demonstrated that the potassium ion channel-blocking function of Kunitz domain-containing peptides gives them potential application as neuroprotective drugs [59]. In addition, Kunitz domain-containing proteins also show therapeutic potential in the treatment of cancer [60] and hereditary angioedema [61], attributed to their protease inhibition ability.
The putative toxins identified in this study were predicted based on the toxins recorded in the Tox-Prot database. Further investigations will be needed to confirm the toxicities of these putative toxins and to elucidate their role in envenomation. Regardless, this work revealed that R. esculentum and S. malayensis nematocysts contain various putative toxins with a high diversity of biological functions. Moreover, our study provides valuable information for the screening of novel therapeutic compounds and an enriched jellyfish toxin database.

Jellyfish Collection
Specimens of R. esculentum and S. malayensis were wild-caught and obtained from a local supplier in Hong Kong. S. malayensis was also provided by Ocean Park Hong Kong. Medusae of both species were cultured in circulating artificial seawater (salinity 30 ppt) at room temperature at the Chinese University of Hong Kong. Individuals of R. esculentum were fed once per week with newly hatched Artemia and were starved for at least two days prior to sampling. Individuals of S. malayensis were not fed for several days between arrival in the laboratory and sampling.

Sample Preparation for Proteomic Analysis
Nematocysts were isolated according to methods described in previous studies [26,31]. Protein was extracted from the cleaned nematocysts by dissolving them in lysis buffer (6 M urea, 2 M thiourea, and 1 mM dithiothreitol (DTT)). After removing the insoluble impurities by centrifugation at 21,000× g, the protein samples were alkylated with 5 mM iodoacetamide for 30 min in the dark at room temperature and digested with sequencing-grade trypsin (Promega) at a 1:20 ratio overnight at 37 °C. The digested peptides were fractionated into four fractions with increasing acetonitrile (ACN) concentrations (7.5%, 12.5%, 17.5%, and 50%) using a high-pH reversed-phase fractionation kit (Thermo Fisher Scientific, Waltham, MA, USA). The fractions were dried in a SpeedVac and resuspended in 5% formic acid (v/v) and 5% ACN (v/v). Then, 1 µg of peptides was subjected to nano-flow liquid chromatography separation using a Dionex UltiMate 3000 RSLC nano-system. The sample was separated using a 25-cm-long, 75 µm internal diameter C18 column. The peptides were eluted from the column at a constant flow rate of 0.3 µL/min with a linear gradient from 2% to 35% ACN over 120 min. The eluted peptides were analyzed by an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific). MS and MS/MS scans were acquired in the Orbitrap with a mass resolution of 60,000 and 15,000, respectively. The MS scan range was 375 to 1500 m/z with an automatic gain control (AGC) target of 4e5 and a maximum injection time of 50 ms. The AGC target and the maximum injection time for MS/MS were 5e4 and 250 ms, respectively. Higher-energy collisional dissociation (HCD) was used as the fragmentation mode with 30% collision energy. The precursor isolation window was set to 1.6 m/z.
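For readers who want to relate retention time to solvent composition, the linear gradient described above (2% to 35% ACN over 120 min at 0.3 µL/min) can be written as a simple interpolation. The Python sketch below is an idealized model: it ignores system dwell volume and any wash or re-equilibration steps, which the text does not describe, and the function names are assumptions made for illustration.

```python
def acn_percent(t_min, start=2.0, end=35.0, duration_min=120.0):
    """Nominal %ACN delivered at time t_min for an ideal linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient window")
    # Straight-line interpolation between the start and end compositions.
    return start + (end - start) * t_min / duration_min


def eluted_volume_ul(t_min, flow_ul_per_min=0.3):
    """Mobile-phase volume (µL) delivered after t_min at a constant flow rate."""
    return flow_ul_per_min * t_min


for t in (0, 60, 120):
    print(f"t = {t:3d} min: {acn_percent(t):.1f}% ACN, "
          f"{eluted_volume_ul(t):.1f} µL delivered")
```

Under these assumptions the gradient slope is 0.275% ACN per minute, and the whole run delivers 36 µL of mobile phase.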

Spectral Searches and Bioinformatics Analysis
Data were analyzed by Proteome Discoverer version 2.3 with SEQUEST as the search engine. The search parameters were as follows: oxidation of methionine (+15.9949 Da) and carbamidomethylation of cysteine (+57.0215 Da) were set as dynamic modifications; precursor ion mass tolerance, 10 ppm; fragment ion mass tolerance, 0.02 Da. The data were searched against the translated protein sequences from our previously constructed transcriptome database [15]. The protein-level false discovery rate was estimated by Percolator at an experimental q-value (exp. q-value) threshold of 0.05. To identify the putative toxins, the protein sequences were run against the UniProt animal toxin and venom database using BLASTp (e-value of <1.0 × 10⁻⁵) [17]. The BLASTp annotation was then validated by ToxClassifier to exclude proteins with non-toxic physiological functions [18]. GO annotations were performed by eggNOG-mapper using the default settings [16]. Signal peptides and subcellular localization were predicted by SignalP-5.0 [62] and DeepLoc-1.0 [63], respectively. Protein domains were screened by InterProScan (5.47-82.0) [46].
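The e-value cut-off used in the BLASTp step can be applied to tabular BLAST output with a few lines of code. The Python sketch below assumes the standard 12-column tabular format (`-outfmt 6`), in which the e-value is the eleventh field. The query and subject identifiers are hypothetical, and keeping only the best hit per query is a choice made here for illustration, not a step stated in the text. The subsequent ToxClassifier step is not reproduced.

```python
E_VALUE_CUTOFF = 1.0e-5  # threshold stated in the text (strictly less than)


def best_toxin_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue)} for the best hit below the cut-off.

    Expects BLAST tabular lines with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # hit too weak to keep
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical toy hits; identifiers and statistics are made up.
toy_hits = [
    "query_1\ttox_B\t38.0\t150\t90\t2\t1\t150\t10\t158\t1e-8\t55.0",
    "query_1\ttox_A\t45.0\t200\t110\t0\t1\t200\t5\t204\t3e-20\t95.1",
    "query_2\ttox_C\t25.0\t60\t45\t1\t3\t62\t8\t66\t0.002\t30.2",
    "query_3\ttox_D\t30.0\t80\t56\t0\t1\t80\t1\t80\t1e-5\t40.0",
]

for query, (subject, evalue) in best_toxin_hits(toy_hits).items():
    print(query, subject, evalue)
```

In this toy input only `query_1` survives: `query_2` has too high an e-value, and `query_3` sits exactly at the threshold, which the strict "<" comparison excludes.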