Insights into Common Octopus (Octopus vulgaris) Ink Proteome and Bioactive Peptides Using Proteomic Approaches

The common octopus (Octopus vulgaris) is nowadays the most demanded cephalopod species for human consumption. This species was also postulated for aquaculture diversification to supply its increasing demand in the market worldwide, which only relies on continuously declining field captures. In addition, they serve as model species for biomedical and behavioral studies. Body parts of marine species are usually removed before reaching the final consumer as by-products in order to improve preservation, reduce shipping weight, and increase product quality. These by-products have recently attracted increasing attention due to the discovery of several relevant bioactive compounds. Particularly, the common octopus ink has been described as having antimicrobial and antioxidant properties, among others. In this study, the advanced proteomics discipline was applied to generate a common octopus reference proteome to screen potential bioactive peptides from fishing discards and by-products such as ink. A shotgun proteomics approach by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) using an Orbitrap Elite instrument was used to create a reference dataset from octopus ink. A total of 1432 different peptides belonging to 361 non-redundant annotated proteins were identified. The final proteome compilation was investigated by integrated in silico studies, including gene ontology (GO) term enrichment, pathways, and network studies. Different immune functioning proteins involved in the innate immune system, such as ferritin, catalase, proteasome, Cu/Zn superoxide dismutase, calreticulin, disulfide isomerase, heat shock protein, etc., were found in ink protein networks. Additionally, the potential of bioactive peptides from octopus ink was addressed. 
These bioactive peptides can exert beneficial health properties such as antimicrobial, antioxidant, antihypertensive, and antitumoral properties and are therefore considered lead compounds for developing pharmaceuticals, functional foods, or nutraceuticals.


Introduction
The common octopus, Octopus vulgaris, belongs to the coleoid cephalopods group, together with cuttlefish and squids, which are well known for their inking behavior, one of their most distinctive and defining characteristics [1]. In 1797, Cuvier first described O. vulgaris, which belongs to the family Octopodidae, as a benthic, neritic species that can be found in a variety of habitats, including rocks, coral reefs, and grass, from the shore to the outer edge of the continental shelf at depths ranging from 0 to 200 m [2]. Octopuses have the ability to learn, play, and regenerate their damaged tissues, and they can also exhibit predatory and exploratory behavior [2,3]. In case of danger, they can squirt water at intruders to scare them away or cover themselves with ink for camouflage [4,5]. Reproductive behavior is evidenced by the copulatory activity of the males; when females are ready to deposit the spawn, they hide in the dens and place the clusters of eggs on the walls [6]. Afterward, females take care of the eggs alone (the brooding period lasts 25-65 days) until many of them die when the eggs hatch. They have a short life cycle of
Mar. Drugs 2023, 21, 206
The main aim of this study was to apply the advanced proteomics discipline to generate an octopus ink reference proteome for the screening of potential bioactive peptides from fishing and aquaculture discards and by-products.

Octopus Ink Samples
Due to varying sampling methods and ink sac sizes, we collected different amounts of ink samples, which directly influenced the recovery of ink proteins. Table 1 summarizes all the sample collection methods and the recovery of ink proteins.
Additionally, to visualize the protein profile of the octopus ink samples, the three biological ink samples were separated by 10% SDS-PAGE (Figure 1). This gel illustrates that all extracts show a similar protein profile.

Octopus Ink Proteome
A reference octopus ink proteome was created by merging 2693 identified spectra (peptide spectrum matches, PSMs) from 1432 different peptides obtained from three different octopus ink samples (Supplementary File S1). In total, 361 non-redundant annotated proteins were identified from these peptides (Supplementary File S2). This discovery stage was based on the LC-MS/MS analysis and SEQUEST-HT search of the tryptic digests of the total protein extracts from each octopus ink sample, compiled into a single dataset. Raw data and analysis outputs are publicly available in the MassIVE data repository under accession number MSV000089896 and in the ProteomeXchange database under accession number PXD035359. One of the major limitations of working with non-model organisms is the scarcity of public protein and gene databases. For this reason, protein identification was conducted with Proteome Discoverer 2.4 (Thermo Fisher Scientific, San Jose, CA, USA) using two databases: a global database selected by phylogenetic similarity for the class "Cephalopoda", with about 125,800 protein entries including canonical and isoform sequences from the UniProtKB protein database, and the UniGene transcriptomic database of octopus paralarvae translated into proteins [3], containing 77,838 protein sequences, which considerably increased the number of protein identifications. Supplementary File S2 lists the proteins together with their respective gene names, gene homologs, PSMs, unique peptides, and percentage of protein coverage. Of the 361 proteins detected, 208 were assigned to the species O. vulgaris. A total of 37 uncharacterized proteins were observed, 17 of them related to O. vulgaris.
The final global dataset of the octopus ink proteome was subsequently analyzed by protein-based bioinformatics, such as gene ontologies, pathways, and network analyses, and by the prediction of potential bioactive peptides to gather more functional insights of the octopus ink.
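The percentage of protein coverage listed in Supplementary File S2 can be illustrated with a minimal sketch: the share of residues in a protein sequence that fall inside at least one identified peptide. The function name, the toy sequence, and the toy peptides are assumptions made for illustration only; they are not taken from the dataset, and the search software may differ in its details.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the percentage of residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical 22-residue sequence and two hypothetical peptides.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"
toy_peptides = ["TAYIAK", "QISFVK"]
coverage = sequence_coverage(toy_protein, toy_peptides)
print(f"coverage: {coverage:.1f}%")
```

Overlapping peptides are counted once per residue, so coverage never exceeds 100%.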

Label-Free Quantification (LFQ) of O. vulgaris Ink Samples
Relative label-free quantification of each O. vulgaris ink sample (OVI1, OVI2, OVI3) was also performed to determine the protein abundance of each sample. Supplementary File S1 contains these results. High-abundance proteins for each sample were analyzed and compared. Figure 2 shows the distribution of high-abundance proteins detected for each O. vulgaris ink sample, while Figure 3 (Venn diagram) shows the distribution and overlapping of high-abundance proteins for all O. vulgaris ink samples. The determination of proteins in these samples was directly influenced by ink sac sizes, sampling techniques, and protein precipitation. As was demonstrated in Figures 2 and 3, the majority of the high-abundance proteins were detected in the OVI2 sample. In our study, the syringe method (OVI1) retrieved the fewest proteins, but the milking method after euthanasia generated the opposite outcome (Table 1, Figures 2 and 3). Moreover, protein precipitation resulted in a minor loss of proteins for OVI1 and OVI3, whereas no purification led to the identification of significant protein abundance for OVI2. These hybrid collection methods offer a high coverage of the octopus ink proteome.
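The overlap summarized in a three-set Venn diagram can be sketched with plain set operations. The protein identifiers below are hypothetical placeholders, not the accessions from Supplementary File S1, and the function name is an assumption made for this example.

```python
def venn_regions(a: set[str], b: set[str], c: set[str]) -> dict[str, set[str]]:
    """Split three sets into the seven disjoint regions of a Venn diagram."""
    return {
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "all_three": a & b & c,
    }


# Hypothetical high-abundance protein lists for the three ink samples.
ovi1 = {"P1", "P2", "P3"}
ovi2 = {"P2", "P3", "P4", "P5"}
ovi3 = {"P3", "P5", "P6"}

regions = venn_regions(ovi1, ovi2, ovi3)
for name, members in regions.items():
    print(name, sorted(members))
```

Because the seven regions are disjoint, their sizes add up to the size of the union of the three samples, which is a quick consistency check for any Venn diagram.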

Functional Analysis: Gene Ontologies and Pathways Analysis
PANTHER analysis of the octopus ink proteome using homologous genes (O. bimaculoides; Homo sapiens; Drosophila melanogaster) of O. vulgaris revealed the presence of 26 different protein classes (Figure 4). Apart from protein class identification, PANTHER was used to categorize the ink proteome based on molecular function and biological process (Supplementary Figures S1 and S2). For the prediction of the different protein classes and functions, PANTHER analysis used the number of genes, the percentage of genes, and functional hits against total genes. All of these corresponding data up to gene level 2 were compiled in Supplementary File S3.
Figure 4 shows that oxidoreductase (19.8%), transferase (11.4%), hydrolase (11.1%), protein modifying enzyme (9.8%), and cytoskeletal protein (6.9%) were the most prominent protein classes. Protease (7.45%) was the most common protein modifying enzyme, followed by protein phosphatase, tyrosine protein kinase, serine/threonine protein kinase, and ubiquitin-protein ligase (Supplementary File S3).
A significant proportion of the ink proteins was involved in catalytic activity (GO:0003824) and binding (GO:0005488), as revealed by molecular function analysis (Supplementary Figure S1), where hydrolase (20.91%) and oxidoreductase (15.82%) activities were the most common catalytic activities (Supplementary File S3). Other molecular function activities of ink proteins, such as transporter activity (GO:0005215), molecular function regulator (GO:0098772), ATP-dependent activity (GO:0140657), and structural molecule activity (GO:0005198), were also found.
Moreover, ink proteome analysis by PANTHER identified proteins implicated in 16 different biological processes (Supplementary Figure S2). Most of the proteins were involved in the cellular and metabolic process followed by a response to stimulus, biological regulation, localization, and signaling. It also revealed that the octopus ink proteome was involved in the immune system process by performing similar activities such as leukocyte activation (GO:0045321), leukocyte migration (GO:0050900), immune system development (GO:0002520), immune effector process (GO:0002252), and immune response (GO:0006955) (Supplementary File S3). Among these processes, leukocyte-like activation (32.60%), immune response (25.80%), and effector (17.20%) were the most prominent activities.
Additionally, the PANTHER pathway analysis of homologous genes of common octopus ink proteomes identified 70 different pathways based on their functional hits (Supplementary File S3). Among all of these pathways, angiogenesis (P00005), apoptosis signaling pathway (P00006), toll receptor signaling pathway (P00054), serine glycine biosynthesis (P02776), FGF signaling pathway (P00021), and EGF receptor signaling pathway (P00018) were highlighted as some important immune signaling pathways in the octopus ink proteome.
The KEGG pathway analysis of the identified proteins was carried out with the DAVID program (version 6.8) to compare the input data against the background of the O. bimaculoides genome, which is the phylogenetically closest cephalopod species available in the DAVID software. The KEGG pathway search identified 21 different pathways, and the majority of the proteins were found to be involved in metabolic pathways, amino acid biosynthesis, or xenobiotics and drug metabolism (Supplementary File S4).
DAVID software was also used to identify the functional domain of ink proteins. In this case, the InterPro motifs platform was selected for domain searching, categorizing a list of proteins based on protein functional domain (Supplementary File S5).
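Pathway enrichment against a background genome is commonly scored with an over-representation test. The sketch below computes a plain hypergeometric upper-tail probability as an illustration of the idea. It is an assumption made for this example and not DAVID's own implementation, which reports a more conservative modified Fisher exact score (EASE score). The function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(pop_size: int, pop_hits: int,
                         list_size: int, list_hits: int) -> float:
    """P(X >= list_hits) when drawing list_size genes without replacement
    from a background of pop_size genes, pop_hits of which are in the pathway."""
    total = comb(pop_size, list_size)
    max_hits = min(pop_hits, list_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


# Toy numbers: background of 10 genes, 3 in the pathway,
# input list of 4 genes, 2 of which are in the pathway.
p_toy = hypergeom_upper_tail(pop_size=10, pop_hits=3, list_size=4, list_hits=2)
print(f"p = {p_toy:.4f}")
```

A small p-value indicates that the pathway is represented in the input list more often than expected by chance given the chosen background, which is why the choice of background genome matters.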

Network Analysis
A comprehensive protein network encompassing both functional and physical protein interactions was constructed by combining all of the proteins identified for the octopus ink proteome using the STRING software version 11.5. As the genome of O. vulgaris is not available in the STRING software, Octopus spp. was selected, providing a protein-protein interaction (PPI) enrichment p-value of less than 1.0 × 10^−16, and a total of 147 nodes (proteins) and 277 interactions (edges) were discovered.
A total of 15 subgroups were obtained from the 147 nodes, and all the disconnected nodes were hidden from the final network (Figure 5; Supplementary File S6). Among these subgroups, all the significant pathways with at least three nodes are highlighted in Figure 5. In the octopus ink proteome, metabolic pathways (red), with 30 nodes and 99 interactions, made up one of the major pathways, along with ribosome and proteasome pathways with 66 protein-protein interactions (salmon pink; nodes: 18). Several proteins of the innate immune system, such as ferritin, catalase, proteasome, and Cu/Zn superoxide dismutase, were found in these pathways.
Subgroups of xenobiotics and drug metabolism by cytochrome P450 (gold; 7 nodes) and immune functioning proteins in the endoplasmic reticulum (dark green; 4 nodes), such as calreticulin, the disulfide isomerase family, and the heat shock protein 70 family, were also identified in the ink proteome. Cytoskeletal protein interactions (yellow; 3 nodes) were primarily formed by the actin and myosin proteins, while signal transduction regulation was mediated by small GTPases (blue; 3 nodes). All the proteins involved in these processes were also identified through molecular functional studies (Supplementary Files S3-S5).
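The reported network size (147 nodes, 277 edges) allows two simple summary statistics to be worked out. The sketch below treats the network as a simple undirected graph, which is an assumption made for this example; the function names are hypothetical, and the values are derived here rather than quoted from the STRING output.

```python
def average_degree(n_nodes: int, n_edges: int) -> float:
    """Mean number of interactions per protein; each edge touches two nodes."""
    return 2 * n_edges / n_nodes


def density(n_nodes: int, n_edges: int) -> float:
    """Fraction of all possible undirected node pairs that are connected."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


NODES = 147  # proteins in the network
EDGES = 277  # protein-protein interactions

avg_deg = average_degree(NODES, EDGES)
dens = density(NODES, EDGES)
print(f"average degree: {avg_deg:.2f}, density: {dens:.3f}")
```

Under these assumptions each protein has on average fewer than four interaction partners, so the network is sparse and its structure is dominated by a few dense clusters such as the metabolic and ribosome/proteasome subgroups.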

Putative Bioactive Peptides
In this study, the octopus ink proteome (n = 361) was used to predict potential bioactive peptides. To predict the active peptide sequences, in silico protein hydrolysis with trypsin and pepsin was performed using the MS-Digest computational program. No missed cleavages and a minimum of six residues per peptide were selected as parameters. All the peptides predicted after each enzymatic digestion (pepsin and trypsin) are presented in Supplementary File S7.
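The digestion parameters above can be illustrated with a minimal sketch of an in silico tryptic digestion. It assumes the commonly used rule that trypsin cleaves after Lys (K) or Arg (R) unless the next residue is Pro (P), allows no missed cleavages, and keeps peptides of at least six residues. This is not the MS-Digest implementation; the function name and the toy sequence are hypothetical.

```python
def tryptic_digest(sequence: str, min_length: int = 6) -> list[str]:
    """Cleave after K or R (not before P), with no missed cleavages,
    and keep only peptides with at least min_length residues."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_length]


# Toy sequence invented for illustration; "RP" is not cleaved.
toy_sequence = "MKAAAAAARPGGGGGKCCCCCCR"
toy_peptides = tryptic_digest(toy_sequence)
print(toy_peptides)
```

The two-residue fragment "MK" is produced by the cleavage rule but discarded by the length filter, which mirrors the effect of the minimum of six residues chosen in the study.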
Trypsin digestion released more than 10,000 different peptides (6-44 amino acid residues), which were evaluated for potential bioactivity with PeptideRanker (PR). A total of 111 non-redundant peptides that scored higher than 0.90 using the N-to-1 neural network probability were selected (Table 2). The majority of the bioactive peptides from tryptic digestion corresponded to prominin, tetraspanin, hemocyanin, peroxidases, mucin, and some uncharacterized proteins, most of which belong to O. vulgaris and O. bimaculoides.
Similarly, the second in silico digestion of octopus ink proteins with pepsin yielded more than 7000 peptides, with 6 to 44 amino acid residues (Supplementary File S7). Among them, 15 bioactive peptides (score > 0.9) were identified by PeptideRanker together with their parent proteins (Table 3). Pepsin-digested bioactive peptides mostly belong to prominin, retinal dehydrogenase, hemocyanin subunit, and heat shock proteins.
All the bioactive peptides from trypsin and pepsin digestion (n = 126) were further evaluated for their antimicrobial potential using CAMPR3 (Collection of Antimicrobial Peptides) integrated in the BIOPEP-UWM database. In addition, properties of the peptides, e.g., allergenicity and toxicity, were also evaluated with the widely used computational platforms AllerTop and ToxinPred (Tables 2 and 3, respectively). The majority of bioactive peptides from trypsin digestion (n = 111) belong to the non-allergen and non-toxin peptide groups. Among them, 39 peptides showed antimicrobial potential, of which 10 peptides scored more than 0.90 in the discriminant analysis classifier. Additionally, 15 antimicrobial peptides (AMPs) from tryptic digestion were predicted to be both non-allergenic and non-toxic. These peptides derived from mucin-5ac-like (MUC5AC), filamin-a, hemocyanin, inter-alpha-trypsin inhibitor heavy chain, S-formylglutathione hydrolase, tetraspanin, glyoxylate reductase, DNAH, prostaglandin reductase, myosin heavy chain (MYH), H(+) transporting two-sector ATPase, thyroglobulin, and S-(hydroxymethyl)glutathione dehydrogenase proteins.
In the case of pepsin digestion, a total of 10 peptides showed potential antimicrobial capability with a high discriminant analysis classifier score. Among these antimicrobial peptides, the majority were predicted to be both non-allergenic and non-toxic by computational analysis. These peptides derived from prominin, hemocyanin, heat shock protein, retinal dehydrogenase, or acid ceramidase.
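The screening logic described in this section (a bioactivity score above 0.90, an antimicrobial classifier score, and allergenicity and toxicity predictions) can be sketched as a simple filter over prediction records. The record fields, the function name, the thresholds as defaults, and all sequences and scores below are hypothetical; in practice the values would be exported from the respective prediction tools, and none of those tools' interfaces are reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptidePrediction:
    sequence: str
    bioactivity_score: float  # e.g., a probability between 0 and 1
    amp_score: float          # antimicrobial classifier score between 0 and 1
    is_allergen: bool
    is_toxin: bool


def select_candidates(predictions: list[PeptidePrediction],
                      min_bioactivity: float = 0.90,
                      min_amp: float = 0.90) -> list[PeptidePrediction]:
    """Keep peptides above both score thresholds that are predicted
    to be neither allergenic nor toxic."""
    return [
        p for p in predictions
        if p.bioactivity_score > min_bioactivity
        and p.amp_score > min_amp
        and not p.is_allergen
        and not p.is_toxin
    ]


# Invented records for illustration only.
toy_predictions = [
    PeptidePrediction("PEPTIDEA", 0.95, 0.93, False, False),
    PeptidePrediction("PEPTIDEC", 0.97, 0.40, False, False),
    PeptidePrediction("PEPTIDEF", 0.92, 0.96, True, False),
    PeptidePrediction("PEPTIDEG", 0.85, 0.99, False, False),
]
candidates = select_candidates(toy_predictions)
print([p.sequence for p in candidates])
```

Keeping the thresholds as parameters makes it easy to see how the size of the candidate list depends on the chosen cut-offs.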

Discussion
In this study, a common octopus, O. vulgaris, ink proteome was generated for the first time by using shotgun proteomics, and a total of 361 non-redundant proteins were identified from the complex mixture of ink samples. A shotgun bottom-up proteomics approach is a widely used protocol to create a reference dataset of proteomes for selected marine by-products whereby enzymatically digested peptides from complex samples are used to identify proteins [24,25].
Extracted protein samples for OVI1 and OVI3 were found to have a blackish color that could interfere with concentration measurements. Thus, these protein samples were further purified and quantified. We found that there was a small loss of proteins during protein precipitation, whereas a large amount of protein was detected in OVI2, which had not been purified. Moreover, the number of protein identifications also varied depending on the extraction technique used. Prior research has demonstrated that the syringe or milking technique has an effect on the recovery of chemical components and proteins from cephalopod ink [1,26]. Hence, the final merging of these samples offers a more precise depiction of the ink proteome. The octopus ink proteome is available in a public repository and could be highly advantageous for future marine by-product research and industrial applications.
Subsequent computational analysis through PANTHER identified 26 active protein classes in the octopus ink proteome. Among these classes, oxidoreductase was the most relevant protein class. In octopus ink, oxidoreductases are primarily involved in the melanogenesis process, catalyzing the polymerization of eumelanin, and in the antimicrobial defense system [1,27]. A peroxidase enzyme found in the cephalopod ink sac and associated with the melanin synthesis process [28,29] was also recovered from the octopus ink proteome in a large proportion. Additionally, the percentage of protein coverage for peroxidase was high, with excellent peptide spectrum matching, as was also the case for hemocyanin and CD109 antigen. The enzymes hemocyanin, tyrosinase, and phenoloxidase share similar active sites. Molluscan hemocyanins are responsible for both oxygen transport and an effective innate immunological response, while tyrosinases initiate the synthesis of melanin [30-32]. Phenoloxidase also plays an important role in the initial immune defense of invertebrates as a part of the prophenoloxidase-activating system [33]. Fan et al. purified phenoloxidase from ink sacs of O. ocellatus, which is involved in melanin production as well as in host defense via melanization, as in crustaceans [34]. Cephalopod ink is composed of secretions from two glands, the ink gland and the funnel organ, a mucus-producing gland, both irrigated by blood vessels. The presence of hemocyanin (a protein that transports oxygen, which is synthesized in cephalopods mainly in the branchial hearts and released into the bloodstream) could be attributed to discharges or the rupturing of vessels in the ink sac, but it could also be a genuine component of the ink.
Since no proteomic studies have previously been performed on common octopus ink, and taking into account that hemocyanin has been identified in different organs, including the mucus coating different epithelia, it is possible that hemocyanin, like other phenoloxidases, is part of the mucus secreted by the ink gland or funnel organ. Further studies are needed to clarify this aspect. Cell surface antigen CD109 is a member of the thioester-containing proteins, which form part of the innate immune system involved in host-microbe interactions and have been reported to recognize, bind, and phagocytose bacteria and other parasites [35-37].
The KEGG pathway and network analysis of the octopus ink proteome by DAVID (version 6.8) identified 21 different biological pathways, where most of the proteins were involved in metabolic pathways, amino acid biosynthesis, or xenobiotics and drug metabolism. Similar functional and physical protein interactions for all identified ink proteins were found by STRING. A total of 147 proteins and 277 interactions were discovered through interaction analysis, which covered all the KEGG pathways identified. MCL cluster analysis categorized 15 subgroups from the 147 nodes, where metabolic pathways (red; nodes: 30), ribosome and proteasome pathways (salmon pink; nodes: 18), and xenobiotics and drug metabolism by cytochrome P450 (gold; nodes: 7) were identified as the major pathways. Glycolysis, the TCA cycle, the biosynthesis of nucleotide sugars, oxocarboxylic acid metabolism, and oxidative phosphorylation were the major metabolic pathways, and comparable results were obtained from the ink gene ontology analyses using DAVID and PANTHER. These metabolic networks were also identified through transcriptomics analysis in some previous studies [38-40]. Ferritin, another significant protein that stores iron in a soluble, non-toxic form and is involved in the immune system and homeostasis [41], was found in the metabolic networks of octopus ink. Catalase, also found in the ink metabolic network, scavenges free radicals to curtail their damaging effects on the host, and it is a crucial enzyme in antioxidant defense and the innate immune system [42].
Moreover, the ubiquitin-proteasome pathway is directly involved in cellular apoptosis, and in some cases, the proteasome can impact other cellular pathways, which may lead to apoptosis [43]. In this study, ubiquitin-activating enzyme E1, E3 ubiquitin-protein ligase, 26S proteasome subunit, proteasome subunit alpha/beta, and proteasome A-type subunit were identified from the common octopus ink. We also identified cytochromes P450 (CYPs), which are a superfamily of enzymes catalyzing xenobiotics in marine invertebrates [44]. All of these identified pathway and immune molecule activities have been described as an essential part of the common cephalopod immune system [45,46]. Another important immune protein, Cu/Zn superoxide dismutase, which is a part of antioxidant defense pathways, clusters in the ink proteasome and ribosome pathways [47,48].
Four immune functioning proteins in the endoplasmic reticulum such as calreticulin, the disulfide isomerase family, the heat shock protein 70 family, and carboxypeptidase were identified from the ink proteome. Calreticulin, a highly conserved endoplasmic reticulum (ER) luminal resident protein, is involved in innate immunity and Ca2+ homeostasis [49]. Huang et al. reported that two ER proteins, calnexin and calreticulin, were involved in antibacterial immunity in Eriocheir sinensis [50]. The disulfide isomerase family functions as molecular chaperones and disulfide oxidoreductase. Through a variety of cellular processes, including redox-sensitive attachment, antigen presentation in the ER, connection with phagosomes, and ROS production by NADPH oxidase, protein disulfide isomerase promoted host-pathogen interactions in viral, bacterial, and parasitic infections [51]. By using cDNA cloning and mRNA expression of heat shock protein 70 (HSP70) gene, Song et al. showed that HSP70 plays a key role in mediating the environmental stress and immune response in bay scallops [52]. Carboxypeptidase, which belongs to the S10 peptidase family, was also purified from Illex illecebrosus [53,54].
Bioactive peptides have been defined as specific protein fragments that have a positive impact on body functions or conditions and may ultimately influence health. Peptides are inactive within the sequence of the parent protein but become active when released due to the action of different enzymes [55,56]. Peptides from different cephalopod extracts showed antibacterial activities, which is an important function of the innate immune system. Positively charged amino acids of peptides interact with the negatively charged membranes of microorganisms to permeate the cell and finally exert their antimicrobial effects [57]. Cephalopod ink is widely used in traditional Chinese medicine due to its antitumor, immunomodulatory, and hemostatic effects [58]. In addition, cephalopod ink secondary metabolites promoting immune function in vertebrates also showed different bioactive potentials, such as antibacterial, antimutagenic, and antitumoral activity [10]. Other studies evidenced that octopus ink extracts exhibited joint immunomodulatory and antiproliferative effects due to the presence of different bioactive compounds without being cytotoxic to human cancer cell lines [12]. Limited data are available regarding the activity of the bioactive peptides found in O. vulgaris ink; thus, this study may offer a new approach to identifying potential lead bioactive peptides from O. vulgaris ink.
Trypsin and pepsin in silico enzymatic digestions of the LC-MS/MS reference octopus ink proteome produced more than 17,000 peptides. Trypsin preferentially cleaves proteins at Lys and Arg residues in position P1, except when Pro is found in position P1′, whereas pepsin cleaves proteins at Phe, Tyr, Trp, and Leu residues in positions P1 and P1′ [59]. Previous studies showed that the trypsin fractioning of isolated peptidoglycans of S. maindroni ink released a polysaccharide with strong antimutagenic activity [58] and that trypsin was also used to hydrolyze oligopeptides to produce a proapoptotic tripeptide [18]. Similarly, trypsin, α-chymotrypsin, or pepsin hydrolysates of giant squid tunic gelatin exhibited antioxidant activity [60].
The bioactivity of peptides was predicted by PeptideRanker, which calculates scores ranging from 0 to 1, assigning higher values to those peptides considered more bioactive [61]. A total of 111 non-redundant bioactive peptides from tryptic digestion and 15 bioactive peptides from pepsin digestion were selected using PeptideRanker. The proteins hemocyanin, prominin-1-a isoform, retinal dehydrogenase, and acid ceramidase released bioactive peptides after in silico digestion with both trypsin and pepsin enzymes.
Hemocyanins are invertebrate metalloproteins found in cephalopods and are mainly known for their role in oxygen transport. Coates and Nairn mentioned that hemocyanins act as precursors of antimicrobial and antiviral peptides [62]. These proteins play important immune-related roles, such as antimicrobial, antiviral, agglutinative, antifungal, and antiproliferative activity against cancer cells [63-65]. In fact, the hemocyanin of marine mollusks has shown significant interactions with T cell monocytes, macrophages, and polymorphonuclear lymphocytes to improve the host immune response [66,67]. Although no previous studies related to octopus hemocyanin are available, the potential pepsin- and trypsin-digested bioactive peptides (SDPMRPF, CGVCPKCHF, IPCLFAIVFAFWLCGHIAEGNLIR, KKPMMPF, VFGGFWLHGIK, MFAGFLLK, SPWLLGATILCIISIFVPVITNGK, ILCLFAFVFAFWLSGQSAEGNLIR, YACCLHGMPVFPHWHR, VFAGFLFMGIK, VFVGFLLHGFGSSAYATFDICNDAGECR, LNHLPLLCLAVILTLWMSGSNTVNGNLVR, TSFLFLAFVATSWFVYAVTASK) from hemocyanin proteins could be used in the future in antimicrobial, antiviral, anticancer, or immune stimulator roles.
Prominin-1 is a membrane glycoprotein specifically associated with plasma membrane protrusions, first identified as a novel antigenic marker present at the apical surface of mouse neuroepithelial cells [68,69]. It is a very useful marker for various stem cells and can be found in a wide variety of differentiated epithelial and non-epithelial cell types, including photoreceptor cells of invertebrates; mutations in the PROM1 gene are associated with various forms of retinal degeneration [70,71]. Genome sequencing revealed that prominin relatives are present in different echinoderms and mollusks, although the amino acid sequence is poorly conserved among prominin-1 gene products [71,72]. In addition, human PROMININ-1 has been extensively investigated as a possible target for cancer treatment in various organs [71]. Therefore, the discovery of the prominin-1-a isoform in octopus ink, as well as of bioactive peptides from this protein (CCCRCCNRCGGRHMKY, IVLYFIGYSICVAIGILFIILIPLIGCCLCCCR, TYVTCLVILNTIILFAVVCTFITNELYK, SVAVPCSVLLLWILIAFSLVDHSFAQNSSQQHR), may open up new opportunities for cancer and stem cell studies.
Retinal dehydrogenase belongs to the superfamily of aldehyde dehydrogenases and catalyzes the conversion of retinal to retinoic acid [73]. Aldehydes are highly reactive molecules that may produce carcinogenic, cytotoxic, mutagenic, and genotoxic effects on biological systems; aldehyde dehydrogenases transform aldehydes into less reactive forms or eliminate them [74]. Bioactive peptides (CMGQCCF, IMTFTNAIQAGTVWVNTYCCVACQAPFGGFK) released from retinal dehydrogenase could be useful to minimize the toxic effect of cancer cells and for future cancer research.
The sphingolipid enzyme acid ceramidase plays an important role in the regulation of apoptosis and has been found to be over-expressed in different human cancer cells [75,76]. Acid ceramidase, which is involved in the initiation and propagation of a number of human cancers, could act as a potential therapeutic target in cancer therapy [76]. An acid ceramidase-like protein was identified in the ink proteome; however, to the best of our knowledge, its function in cephalopods is still unknown. In addition, the bioactive peptides released from this protein did not show any antimicrobial potential in the in silico analysis.
In the present work, antimicrobial peptides (AMPs) were identified using the CAMPR3 (Collection of Anti-Microbial Peptides) database and by applying the Discriminant Analysis Classifier (DAC) score, since CAMPR3 is a widely used database for the prediction of antimicrobial peptides [77]. A total of 39 peptides from trypsin digestion and 10 peptides from pepsin digestion showed antimicrobial potential. The majority of the tryptic antimicrobial peptides were released from hemocyanin, tetraspanin, peroxidase-like protein, myosin heavy chain, MUC5AC, filamin A, inter-alpha-trypsin inhibitor heavy chain, and S-(hydroxymethyl)glutathione dehydrogenase, among others. Heat shock protein is another important protein that released antimicrobial bioactive peptides after pepsin digestion.
The myofibrillar protein myosin heavy chain (MYH), one of the key elements of muscle, plays a role in contraction in both muscle and non-muscle cells and was previously found in the octopus arm using a proteolytic assay [78]. MYH released a bioactive peptide (NWQWWR) with high AMP probability, which could be used in future antimicrobial research.
MUC5AC, a major gel-forming mucin, exerts a protective role against inhaled pathogens, and other studies have described mucin proteins as a barrier to different microorganisms, with a dynamic role in host innate and adaptive immune responses to infection [79,80]. Mucin-5ac-like proteins have previously been identified in haemocytes of the ivory shell Babylonia areolata and are significantly involved in the immunological homeostasis of invertebrates [81]. Intestinal mucin isolated from Trichoplusia ni facilitates the digestive process and protects invertebrate digestive tracts from microbial infections [82]. Thus, we could predict that the identified mucin-5ac-like protein and its bioactive peptide (SSFDGGSFGGGIAAGIAIAILLLALIYLFYR) from the octopus ink might be useful for future antimicrobial research and applications.
Tetraspanins, which were identified in the octopus ink proteome and released bioactive peptides, are a group of four-transmembrane domain proteins involved in cell-cell adhesion at cellular junctions or in bacterial cell adhesion [83]. Some tetraspanins are capable of limiting cancer progression or migration, while others foster tumor growth, invasion, and metastasis [84]. Extensive future research is needed to determine the role of tetraspanin in O. vulgaris and the potential bioactivity of its peptide (IAAAGLALAFIQVIGIVFACCLAQAIR).
Heat shock proteins act as molecular chaperones in the immunity of organisms, especially under different environmental stresses [85,86]. Inking behavior in cephalopods increased HSP90 expression, suggesting that the stimulated increase in HSP90 expression level was one of the organisms' protective responses against further toxicity [87].
Previous proteomic and bioinformatic studies revealed that HSP70 releases bioactive peptides with different biological activities, including ACE inhibition, antioxidant activity, and dipeptidyl peptidase-IV inhibition [88]. Therefore, the pepsin-digested bioactive peptide (GGMPGGMPGGMPGGMPNF) from heat shock proteins could be a new area of interest in natural product research.
Additionally, peptide properties were evaluated through the AllerTop and ToxinPred web servers. AllerTop predicts allergenic peptides based on the physicochemical properties of peptide sequences [89]. Similarly, ToxinPred compiled different toxic and non-toxic peptides from the SwissProt and TrEMBL databases and developed in silico models for the toxicity prediction of peptides and proteins [90]. Most of the antimicrobial peptides released from ink proteins by pepsin and trypsin digestion were predicted to be non-allergenic and non-toxic through in silico analysis.
Previous studies reported that cephalopod ink extracts possess antimicrobial properties against diverse pathogenic bacteria [1,91]. More recently, O. vulgaris ink extracts were found to exhibit anti-inflammatory, antiproliferative, antimutagenic, antioxidant, and cytoprotective properties [11,12]. There is currently no information concerning cephalopod ink bioactive peptides, but Nadarajah et al. mentioned that fractions of melanin-free ink with low molecular weights (<3 kDa) showed the highest antioxidative activities [92]. Thus, the low molecular weight peptides of octopus ink identified for the first time in the current study could be a potential resource in future antimicrobial, antioxidant, and anticancer research. The bioactivity of these potential candidates must be confirmed by further functional analysis using synthetic peptides. In this sense, bioinformatics methods offer a quicker and less expensive alternative to conventional methods for reducing the number of potential targets that need to be explored.

Common Octopus Sampling
A total of three octopuses (O. vulgaris) with an average weight of 1 kg (980 g to 1.2 kg) were collected using fishing cages by certified professional fishermen at the Ría de Vigo, Spain. The individuals were transported in proper containers to the Experimental Culture Facilities of IIM-CSIC, which is registered as "User and breeding center on animal experimentation" ES360570202001. Transport, housing, and handling were carried out following the principles of animal welfare, within 2 h after fishing. Special attention was paid to the 3Rs strategy (Replacement, Reduction, Refinement), reducing the number of animals used in the experimental assay to the minimum needed to maintain statistical robustness. For ink extraction, a less invasive method was tested first, using a live octopus anesthetized with a mix of MgCl2 (1.5%; w/v) and 70% ethanol (1%; v/v) dissolved in sea water [93]. Ink was then extracted in vivo using a syringe. Since this method produced very small amounts of ink, octopuses were euthanized using an overdose of the anesthetic (MgCl2 (3%; w/v) and 70% ethanol (1%; v/v) dissolved in sea water) and were carefully dissected using sterilized scissors. Ink was collected from the ink sacs of two octopuses by the "milking" method (the content of the ink sac was milked by running forceps along its length) and transferred into a collection tube. Finally, a total of three representative ink samples (OVI1, OVI2, OVI3; OVI: O. vulgaris ink) were selected for further analysis and were stored at −80 °C.
Procedures for transportation, maintenance, euthanasia, and dissection were carried out in accordance with the principles published in the European Directive (2010/63/EU) for the protection of experimental animals used for scientific purposes and were approved by the Spanish National Competent Authority ethics committee (Research Project ES360570202001/17/EDUCFORM 07/CGM02).

Ink Protein Samples
All the samples were homogenized on ice for 6 cycles of 5 s pulses in an ultra-turrax (Polytron Aggregate®, Kinematica-AG, Switzerland) using 4 mL of freshly prepared lysis buffer (10 mM Tris-HCl, pH 7.2, with 5 mM phenylmethylsulfonyl fluoride (PMSF)) [56]. Homogenized ink samples were centrifuged at 40,000× g for 20 min at 4 °C in an Avanti JXN-26 centrifuge (Beckman Coulter, Palo Alto, CA, USA). Extracted protein samples were purified to remove the blackish ink color using the methanol-chloroform precipitation method [94] to avoid interference with further protocols. The OVI2 sample was analyzed without purification due to the recovery of an almost transparent protein extract (light blackish color). Protein concentration in each protein extract was measured with a BCA (bicinchoninic acid) protein assay kit (Pierce™-23225) in a spectrophotometric device (Multiskan™ GO, Thermo Fisher Scientific). Extracted proteins were stored at −80 °C until use.

SDS-Polyacrylamide Gel Electrophoresis
Octopus ink proteins of each individual sample were separated on 10% (v/v) SDS-polyacrylamide gels (30% acrylamide/N, N'-ethylene-bis-acrylamide, 37.5:1) with a stacking gel of the same polyacrylamide concentration. A total of 15 µg of proteins with an equal amount of Laemmli buffer (Tris-HCl 0.5 M pH 6.8; glycerol; SDS 10%; bromophenol blue 1%; DTT) was boiled on a thermocycler at 100 °C for 5 min and centrifuged at 10,000× g for 2 min at 4 °C. Prepared samples were then loaded into the wells and separated in a Mini-PROTEAN 3 cell (Bio-Rad, Hercules, CA, USA). The running buffer consisted of an aqueous solution containing 1.44% (w/v) glycine, 0.67% tris-base, and 0.1% SDS. Running conditions were 80 V for the first 20 min and then 150 V until electrophoresis was complete. The PageRuler™ unstained protein ladder was used as a molecular weight (MW) indicator (Thermo Fisher Scientific, San Jose, CA, USA). After electrophoresis, the gel was stained overnight with Coomassie dye PhastGel Blue R-350 (Solon, OH, USA). Then, the gel was destained using a solution composed of 25% ethanol and 8% acetic acid. Finally, the gel was washed with 50% methanol (v/v) and scanned at 200 dpi.

In-Solution Protein Digestion with Trypsin
Protein digestion with trypsin was performed as described by Carrera et al. [25]. Briefly, a total of 100 µg of proteins per sample was dried using a speed vac (Gyrozen). Dried proteins were then denatured in 8 M urea with 25 mM ammonium bicarbonate (pH 8.0) at a protein concentration of 4 µg/µL. Samples were reduced by adding freshly prepared dithiothreitol (DTT) to a final concentration of 10 mM from a 100 mM (10×) stock solution and incubated in a UPV Hybridizer Oven at 56 °C for 45 min with gentle agitation. For alkylation, iodoacetamide (IAA) was added to the reaction tube to a final concentration of 50 mM from a 500 mM (10×) stock solution, followed by incubation with gentle shaking at room temperature in the dark for 60 min. Samples were then diluted 4-fold using 25 mM ammonium bicarbonate (pH 8.25). Finally, proteins were digested with trypsin (Promega, Madison, WI, USA) overnight at 37 °C at a protease-to-protein ratio of 1:100. To stop the tryptic digestion, peptides were acidified with 5% formic acid to pH 2. Samples were preserved at −80 °C until use.
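The quantities implied by this protocol can be worked out with a short calculation. The following Python sketch is only illustrative: it assumes additive volumes, assumes that each 10× stock is added so that it reaches 1× in the new total volume, and ignores the slight dilution of DTT caused by the later IAA addition. The function and key names are hypothetical.

```python
def stock_volume_to_add(current_volume_ul, stock_fold):
    """Volume of an N-fold stock that reaches 1x once mixed with the current volume."""
    return current_volume_ul / (stock_fold - 1)


def digestion_plan(protein_ug=100.0, protein_ug_per_ul=4.0, urea_molar=8.0,
                   dilution_fold=4.0, protease_ratio=100.0):
    sample_ul = protein_ug / protein_ug_per_ul          # denatured sample volume
    dtt_ul = stock_volume_to_add(sample_ul, 10)         # 100 mM stock -> 10 mM
    after_dtt_ul = sample_ul + dtt_ul
    iaa_ul = stock_volume_to_add(after_dtt_ul, 10)      # 500 mM stock -> 50 mM
    after_iaa_ul = after_dtt_ul + iaa_ul
    final_ul = after_iaa_ul * dilution_fold             # 4-fold dilution with buffer
    return {
        "sample_ul": sample_ul,
        "dtt_stock_ul": dtt_ul,
        "iaa_stock_ul": iaa_ul,
        "final_volume_ul": final_ul,
        # Urea was only present in the initial sample volume (assumption).
        "final_urea_molar": urea_molar * sample_ul / final_ul,
        "trypsin_ug": protein_ug / protease_ratio,      # 1:100 protease to protein
    }


for name, value in digestion_plan().items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, 100 µg of protein corresponds to 25 µL of denatured sample and 1 µg of trypsin, and the dilution step brings the urea concentration to roughly 1.6 M before digestion.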

Shotgun LC-MS/MS Analysis
Digested samples were purified for LC-MS/MS analysis following the desalting method [56] using a C18 MicroSpin™ column (The Nest Group, Southborough, MA, USA). Speed vac-dried peptide samples were resuspended in 0.1% formic acid and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) using a Proxeon EASY-nLC II liquid chromatography system (Thermo Fisher Scientific, San Jose, CA, USA) coupled with an LTQ-Orbitrap Elite mass spectrometer (Thermo Fisher Scientific).
Peptide separation (1 µg) was performed on a reverse phase (RP) column (EASY-Spray column, 50 cm × 75 µm ID, PepMap C18, 2 µm particles, 100 Å pore size, Thermo Fisher Scientific) with a 10 mm pre-column (Accucore XL C18, Thermo Fisher Scientific) using 0.1% formic acid (mobile phase A) and 98% acetonitrile (ACN) with 0.1% formic acid (mobile phase B). A 120 min linear gradient from 5 to 35% B at a flow rate of 300 nL/min was used. A spray voltage of 1.95 kV and a capillary temperature of 230 °C were used for ionization. The peptides were analyzed in positive mode (1 µscan; 400-1600 amu), followed by 10 data-dependent higher energy collision dissociation (HCD) MS/MS scans (1 µscan) using a normalized collision energy of 35% and an isolation width of 3 amu. Dynamic exclusion for 30 s after the second fragmentation event was applied, and ions with unassigned charge states were excluded from the analysis. A total of three biological samples obtained by the different extraction methods described above (OVI1: syringe and purification; OVI2: milking; OVI3: milking and purification) were independently analyzed.

Processing of the Mass Spectrometry Data
All the MS/MS spectra obtained in the LTQ-Orbitrap Elite instrument were analyzed using the search engine SEQUEST-HT (Proteome Discoverer 2.4 package, Thermo Fisher Scientific) against the Cephalopoda UniProtKB database (125,800 protein sequence entries) and the UniGene transcriptome database of O. vulgaris paralarvae [3], containing 77,838 protein sequence entries. The following restrictions were used: tryptic cleavage with up to 2 missed cleavage sites and tolerances of 10 ppm for parent ions and 0.06 Da for MS/MS fragment ions. The carbamidomethylation of Cys (C*) was considered a fixed modification. The permissible variable modifications were methionine oxidation (Mox) and acetylation of the N-terminus of the protein (N-Acyl). The results were subjected to statistical analysis to determine the peptide false discovery rate (FDR) using a decoy database and the Target Decoy PSM Validator algorithm [95]. The FDR was kept below 1% for further analysis. Only proteins that met the following criteria were considered: (a) proteins classified as master proteins, (b) proteins with at least 2 unique peptides, and (c) characterized proteins. To determine the relative protein abundance in each sample, a label-free quantification (LFQ) method was used by applying the Minora Feature Detector node and the ANOVA (individual proteins) method included in the Proteome Discoverer 2.4 software (Thermo Fisher Scientific). Peak areas of ion features from the same peptide in different charge states were summed into one value. All the proteins obtained by the 3 different methodologies were used to create a reference dataset of the ink proteome.
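Two of the search criteria above, the parent ion tolerance in ppm and the 1% FDR threshold, can be illustrated with a generic sketch. This is not the Target Decoy PSM Validator implementation of Proteome Discoverer; it is a simple target-decoy estimate (decoys divided by targets above a score cutoff), and the function names and input layout are assumptions made for illustration.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm=10.0):
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target PSMs above the lowest score cutoff whose estimated FDR <= max_fdr.

    psms: list of (score, is_decoy) tuples, where a higher score is a better match.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs accepted
    for position, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR at this cutoff: decoy hits / target hits.
        if targets and decoys / targets <= max_fdr:
            cutoff = position
    return [psm for psm in ranked[:cutoff] if not psm[1]]
```

For example, a precursor observed at m/z 500.004 against a theoretical m/z of 500.000 deviates by about 8 ppm and would pass the 10 ppm tolerance.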

Functional Gene Ontologies and Pathways Analysis
A final list of all non-redundant protein IDs identified in the group of the 3 ink samples was selected for further bioinformatics analysis. All the homologous genes were identified through UniProt and the NCBI database and were submitted to the PANTHER program version 17.0 (http://www.pantherdb.org/, accessed on 3 June 2022) to classify the proteins based on the 3 main types of annotation: molecular functions, biological processes, and protein classes. Additionally, the pathways involved were also studied using the PANTHER classification. The statistical significance of representation was also assessed according to Thomas et al. [96] and Mi et al. [97]. KEGG pathway analysis was performed with DAVID version 6.8 (https://david.ncifcrf.gov/, accessed on 3 June 2022) by comparing the input data with the background of the O. bimaculoides genome. Protein functional domains were also identified by InterPro Motifs using the same software and the same O. bimaculoides background.
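The idea behind testing whether an annotation term is over-represented in a protein list relative to a background can be sketched with a hypergeometric tail probability. This is a generic illustration and not necessarily the statistic applied by PANTHER or DAVID in this study; the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, background_with_term, list_size, list_with_term):
    """P(X >= list_with_term) when list_size proteins are drawn at random
    from a background in which background_with_term proteins carry the term."""
    upper = min(background_with_term, list_size)
    total = comb(background_size, list_size)
    favorable = sum(
        comb(background_with_term, k)
        * comb(background_size - background_with_term, list_size - k)
        for k in range(list_with_term, upper + 1)
    )
    return favorable / total


# Invented counts for illustration only (not results from this study).
print(enrichment_p_value(background_size=1000, background_with_term=50,
                         list_size=100, list_with_term=15))
```

A small probability indicates that the term occurs in the list more often than expected by chance; in practice, such values are corrected for multiple testing across all terms examined.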

Network Analysis
All the protein networks for octopus ink proteomes were analyzed using STRING (Search Tool for the Retrieval of Interacting Genes) software (v.11.5) (http://stringdb.org/, accessed on 3 June 2022) [98]. This is a large database of known and predicted protein-protein interactions (PPI). Proteins were represented as nodes. Direct (physical) interactions between proteins were represented with continuous lines and indirect (functional) interactions with dotted lines. To minimize false positives and false negatives, all interactions tagged with a confidence ≥ 0.7 in STRING software were used for the analysis. Cluster networks were created using the Markov Cluster Algorithm (MCL), which operates on a distance matrix and is included in the STRING website. MCL inflation was set to 1.8 to reduce the number of clusters for all the analyses.
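The two steps described above, confidence filtering and MCL clustering, can be sketched in plain Python. This is a simplified textbook version of MCL (expansion by matrix squaring, inflation by element-wise powers, column normalization) and not the implementation used by STRING; the protein labels and scores in the usage example are invented.

```python
def filter_edges(edges, min_score=0.7):
    """Keep interactions whose combined confidence is at least min_score."""
    return [(a, b, s) for a, b, s in edges if s >= min_score]


def normalize_columns(m):
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(nodes, edges, inflation=1.8, iterations=50):
    """Minimal Markov clustering on an undirected weighted graph."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    # Self-loops (weight 1.0) keep every column sum positive.
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b, score in edges:
        m[index[a]][index[b]] = m[index[b]][index[a]] = score
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: random-walk flow spreads along paths of length two.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strong flows are boosted and weak flows are damped.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    # Each non-empty row lists the nodes attracted to that row's node.
    clusters = {tuple(nodes[j] for j in range(n) if m[i][j] > 1e-6)
                for i in range(n)}
    clusters.discard(())
    return sorted(clusters)


# Invented labels and scores for illustration only.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B", 0.9), ("B", "C", 0.9), ("A", "C", 0.9),
         ("D", "E", 0.8), ("C", "D", 0.4)]
print(mcl(nodes, filter_edges(edges)))
```

In this toy network, the low-confidence C-D interaction is discarded by the 0.7 threshold, so the two groups are clustered separately. Higher inflation values generally produce more, smaller clusters.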

Conclusions
A wide range of proteins was identified from common octopus (O. vulgaris) ink samples for the first time using a shotgun proteomic strategy, indicating that the ink proteome could act as a great reservoir of diverse proteins. A total of 1432 different peptides and 361 non-redundant proteins were identified. Different in silico analyses, including GO term enrichment, pathways, and network investigations, were used to explore the final proteome compilation. Peroxidase, hemocyanin, and CD109 proteins, which are part of the innate immune system, were detected with high percentages of protein coverage and peptide spectrum matches. The most prominent protein classes of the octopus ink proteome were oxidoreductase, transferase, and hydrolase, which seems to indicate that binding and catalytic activities were the main molecular functions, including some crucial immunological activities. The majority of ink proteins were clustered under the metabolic pathways, ribosome and proteasome pathways, xenobiotics and drug metabolic pathways, immune functioning protein networks in the endoplasmic reticulum, and cytoskeletal protein networks. Ink protein networks contained a variety of immunological proteins associated with the innate immune system, including ferritin, catalase, proteasome, Cu/Zn superoxide dismutase, calreticulin, disulfide isomerase, heat shock protein, etc. Octopus ink proteins release a wide range of potential bioactive peptides after in silico digestion with trypsin and pepsin, which could be used in the future for their antimicrobial, antiviral, and anticancer properties or as potential immune stimulators. The combination of global proteomic findings and the bioinformatics analysis of the octopus ink proteome provides comprehensive knowledge of this fishery discard and potential bioactive peptides from this marine by-product for future study.