Structural Diversity and Biological Activity of Cyanopeptolins Produced by Nostoc edaphicum CCNP1411

Cyanopeptolins (CPs) are one of the most commonly occurring classes of cyanobacterial nonribosomal peptides. For the majority of these compounds, protease inhibition has been reported. In the current work, the structural diversity of cyanopeptolins produced by Nostoc edaphicum CCNP1411 was explored. As a result, 93 CPs, including 79 new variants, were detected and structurally characterized based on their mass fragmentation spectra. CPs isolated in higher amounts were additionally characterized by NMR. To the best of our knowledge, this is the highest number of cyanopeptides found in one strain. The biological assays performed with the 34 isolated CPs confirmed the significance of the amino acid located between Thr and the unique 3-amino-6-hydroxy-2-piperidone (Ahp) for the activity of the compounds against serine proteases and HeLa cancer cells.

In our previous studies, the production of 13 cyanopeptolins by Nostoc edaphicum CCNP1411 was reported [8]. The goal of the current work was to expand the existing knowledge about the structural diversity of CPs produced by CCNP1411 and to explore its effect on the biological activity of the peptides.

Identification of CP Structures
Cyanobacteria possess the ability to synthesize a wide array of natural products. The analyses of 185 cyanobacterial genomes led to the identification of 1817 natural product biosynthetic gene clusters (BGCs) [97]. In the same study, a positive correlation between the number of BGCs and the size of the genome was documented. Cyanobacteria of the order Nostocales are characterized by the largest genomes and are among those that possess the highest average number (11-25) of natural product BGCs [97]. Many of the synthesized compounds are biologically active, and their biotechnological and pharmaceutical potential is commonly explored. In CCNP1411, three classes of non-ribosomal peptides were identified: anabaenopeptins with four structural variants [98], nostocyclopeptides with six linear and five cyclic variants [99], and thirteen cyanopeptolins [8]. In the current study, the number of CP variants detected in CCNP1411 increased to 93. However, when the cell extract of CCNP1411 was analyzed with LC-HRMS, only 67 CPs were detected. For these peptides, the exact masses were determined (Table S2). The remaining peptides were detected in concentrated fractions collected during the separation process.
The structures of all detected peptides were identified based on their mass fragmentation spectra (Figures 2-7 and Figures S1-S87). In the spectra of all CPs containing Phe in position 4 … [Leu⁴ + X⁵ + H − H₂O] belong to the most important diagnostic ions. Other ions that supported the process of structure elucidation are listed in Table S3. For the dereplication process, the CPs identified in CCNP1411 were compared with the resources of the CyanoMetDB [1]. This is the most comprehensive and openly accessible database containing cyanobacterial metabolites. Updated versions of the database are available on Zenodo and the NORMAN Suspect List Exchange (No. S075). Of the 93 CPs detected in CCNP1411 in this work, only 14 were included in the database. Generally, the presence and frequency of specific residues in the structure of CPs produced by CCNP1411 (Figure 8) were in line with the residues present in the previously identified CPs presented in Figure 1. Position 2 of the CPs is the most diverse and is occupied by Arg, Tyr, Phe, Leu, Met, Trp, as well as methylated Leu, Phe, and Tyr. Similarly to the spectra of the Tyr²-, H₄Tyr²-, and Leu²-containing aeruginopeptins 917S-A, -B, and -C [55], the Tyr²-, Leu²-, or Phe²-containing CPs identified in CCNP1411 gave a high-intensity dehydrated [M + H − H₂O]⁺ precursor ion peak.
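The relationship between the protonated and the dehydrated precursor ion mentioned above can be sketched as simple arithmetic. This is a minimal illustration, not part of the published workflow: the function name `precursor_mz` and the neutral mass of 1000.0 Da are hypothetical, while the proton and water masses are standard monoisotopic values.

```python
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def precursor_mz(neutral_mass: float, dehydrated: bool = False) -> float:
    """Return m/z of the singly charged [M + H]+ or [M + H - H2O]+ ion."""
    mz = neutral_mass + PROTON
    if dehydrated:
        mz -= WATER  # loss of one water molecule from the protonated ion
    return mz


# Hypothetical neutral monoisotopic mass, for illustration only
example_mass = 1000.0
protonated = precursor_mz(example_mass)
dehydrated = precursor_mz(example_mass, dehydrated=True)
```

For a singly charged ion, the two precursor peaks are therefore separated by about 18.011 m/z units.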
Based on the mass fragmentation spectra, it is not possible to distinguish the isobaric residues (e.g., Ile/Leu). Therefore, for CPs isolated in the highest quantities, i.e., CP 941 and CP 999 (with Tyr²), CP 990 (with Arg²), CP 983 (with Phe²), CP 949 and CP 919 (with Leu²), NMR analyses were performed (Figures 2-7, Figures S88-S123, Tables S4-S9). The obtained results were consistent with structure elucidation based on MS/MS and allowed the identification of Leu² in CP 949 and CP 919. The NMR analyses also allowed us to verify the previously published structure of CP 999 [8]. It was revealed that position 5 in CP 999 is occupied by N,O-di-MeTyr, and not by MeHty, as suggested based on the MS/MS spectrum. Both residues give the same fragment ions, including the immonium ion at m/z 164. N,O-di-MeTyr⁵ was previously detected in cyanopeptolins produced by Nostoc insulare [49] and Oscillatoria agardhii [27,28,46,84]. This structure misinterpretation illustrates well the need for the application of at least two spectroscopic methods, e.g., NMR and MS/MS, to provide the correct information on chemical structure, especially when isomers are analyzed [100]. Unfortunately, in the case of natural products that are biosynthesized in minute amounts, the isolation of sufficient amounts of pure compound (>1 mg) for NMR is impossible or difficult to achieve. In such cases, the structural analyses can be based on HRMS/MS, which allows the assignment of a molecular formula and provides important information on the structural components of the analyte [101]. Other, more recently developed MS techniques (e.g., ion mobility MS) can additionally support the structure elucidation process [102].
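The reason MS/MS alone could not separate N,O-di-MeTyr from MeHty can be checked with a short calculation: both residues share the elemental composition C11H13NO2, so their immonium ions have the same m/z. The sketch below is illustrative only. The helper names are hypothetical, the residue formulas are derived from the standard structures of tyrosine and homotyrosine, and the immonium ion is taken as the residue minus CO plus a proton.

```python
# Standard monoisotopic atomic masses (Da)
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def mono_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def immonium_mz(residue_formula):
    """m/z of the immonium ion: residue mass minus CO plus one proton."""
    carbon_monoxide = MONO["C"] + MONO["O"]
    return mono_mass(residue_formula) - carbon_monoxide + PROTON


# Residue = amino acid minus H2O.
# Tyr residue C9H9NO2 + N-methyl (CH2) + O-methyl (CH2) -> C11H13NO2
N_O_DI_ME_TYR = {"C": 11, "H": 13, "N": 1, "O": 2}
# Homotyrosine residue C10H11NO2 + N-methyl (CH2) -> C11H13NO2
ME_HTY = {"C": 11, "H": 13, "N": 1, "O": 2}

mz_di_me_tyr = immonium_mz(N_O_DI_ME_TYR)
mz_me_hty = immonium_mz(ME_HTY)
```

Both values come out at approximately m/z 164.107, matching the nominal m/z 164 reported above. Because the two ions are identical in mass, the isomers are indistinguishable by exact mass, and a complementary method such as NMR is needed.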
Of the 25 CP-like peptides identified in cyanobacteria of the genus Nostoc and included in Table S1, more than half (13) were reported from CCNP1411 [8]. When all structural variants from this study are included in the database, Nostoc can be considered as rich a source of cyanopeptolins as Microcystis.

Molecular Networking of Cyanopeptolins
To describe the structural diversity of CPs, molecular networking was performed using data from the HRMS/MS analysis of 10-mg dry biomass of CCNP1411 cell extract. A search of databases linked with the GNPS spectra library did not detect any CPs produced by N. edaphicum. Instead, it proposed 209 compounds structurally similar to CCNP1411 cyanopeptolins, including anabaenopeptilide 202A, cyanopeptolin 963A, lyngbyastatin 8, and micropeptin 103. The search also resulted in the detection of 27 compounds within the 195-532 m/z range.
The molecular network for N. edaphicum CCNP1411 comprised 116 nodes connected by 320 edges into 9 clusters (Figure S124), including 3 clusters with CP features (Figure S124A), 4 with nostocyclopeptide features (Figure S124B), and 2 clusters that did not match either of these groups of compounds (Figure S124C).
The 3 CP clusters comprised 62 nodes connected by 202 edges. We were able to assign 32 nodes to specific CP variants identified in CCNP1411 (Figure S124). The m/z values of the remaining nodes did not match the compounds described in this work, or their weak spectra did not allow the features to be confidently assigned to specific CP variants. A visualization of the 32 annotated CPs is shown in Figure 9. These 32 CPs were grouped into two main clusters based on the similarity of their fragmentation patterns, which reflects their specific structural traits (Figure 9). The Arg²-bearing CPs were distinctly separated from variants with Tyr², Leu², or Phe², which showed higher similarity to each other. This grouping might result from the fact that, unlike CPs with Arg², the latter three types of CPs gave dehydrated ions as parent ions in their spectra. In both clusters, the CPs with different amino acids in position 5 grouped separately. Visualization of the structural relationships between CPs using a molecular network yielded results consistent with the manually performed structural analysis of MS/MS data.
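The node, edge, and cluster counts reported above can be read as graph statistics, with each cluster being a connected component of the network. The following sketch shows one way such components could be counted from an edge list. It is an illustration under stated assumptions: the function name and the toy node and edge lists are hypothetical and do not come from the CCNP1411 data.

```python
def connected_components(nodes, edges):
    """Group nodes into clusters (connected components) with union-find."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the root, shortening the path on the way
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), []).append(node)
    # Largest cluster first
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical toy network: five features, three edges
toy_nodes = ["f1", "f2", "f3", "f4", "f5"]
toy_edges = [("f1", "f2"), ("f2", "f3"), ("f4", "f5")]
toy_clusters = connected_components(toy_nodes, toy_edges)
```

In this toy case the five nodes fall into two clusters of three and two nodes. Applied to a real network export, the same procedure would reproduce the cluster count shown by the visualization software.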

Enzymatic Assay
Serine proteases play a significant role in major metabolic pathways. Therefore, inhibitors of these enzymes potentially constitute lead compounds in pharmaceutical research. In our study, 34 CPs were isolated as pure compounds (purity > 95%) (Table 1) and their in vitro activities against four serine proteases (trypsin, chymotrypsin, elastase, and thrombin) were determined. In line with our previous results [8], none of the peptides was active against thrombin, even at the highest concentration applied in the assay (45 µg × mL⁻¹). Our current work also confirmed the significance of the residue in position 2 for the inhibition of trypsin, chymotrypsin, and elastase. Peptides with Arg² inhibited trypsin with IC₅₀ values from 0.28 µM (CP 1018) to 7.25 µM (CP 1048) and showed weaker or no activity against chymotrypsin (from IC₅₀ = 6.75 µM to inactive) (Table 1). Similar effects of CPs with Arg² on trypsin, with no or weak effect against chymotrypsin, were previously reported by other authors [22,28,37,38,47,74]. Opposite results were reported only for a CP-like peptide called symplocamide A [31]. The peptide inhibited trypsin at IC₅₀ = 80.2 ± 0.7 µM and showed more potent activity against chymotrypsin (IC₅₀ = 0.38 ± 0.08 µM). The authors suggested that the activity of symplocamide A can be modified by the N,O-dimethylbromotyrosine at position 5.
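The highest assay concentration is given in µg × mL⁻¹, whereas the IC₅₀ values are given in µM, so comparing the two requires the molar mass of each peptide. A minimal conversion sketch follows. The function name is hypothetical, and the molar mass of 1000 g × mol⁻¹ is a round illustrative value rather than the mass of any specific CP.

```python
def ug_per_ml_to_micromolar(conc_ug_per_ml: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration (µg/mL) to a molar concentration (µM).

    µg/mL equals mg/L; dividing mg/L by g/mol gives mmol/L,
    and multiplying by 1000 gives µmol/L (µM).
    """
    return conc_ug_per_ml * 1000.0 / molar_mass_g_per_mol


# Illustrative only: 45 µg/mL of a peptide with an assumed molar mass of 1000 g/mol
highest_assay_conc_um = ug_per_ml_to_micromolar(45.0, 1000.0)
```

Under this assumption, 45 µg × mL⁻¹ corresponds to 45 µM; heavier peptides give proportionally lower molar concentrations at the same mass concentration.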

Although the amino acid in position 2 is believed to be critical for the interaction of CPs with serine proteases, variants with no activity have been reported [42,47,64]. This fact indicates that other components of the molecules are important for enzyme inhibition as well. Indeed, in the work by Salvadore et al. [103], symplostatins with N-MeTyr⁵ were found to be slightly stronger inhibitors of elastase than those with N-MePhe⁵. The effect of the side-chain on the activity of CPs was also postulated. Interestingly, the two Arg²-containing CPs from CCNP1411 that lack the side-chain (CP 809 and CP 778) were not active (Table 1). Thus far, a CP-like peptide composed of only the cyclic part has been tested only once [47]. Micropeptin MZ771, with Arg² and without the side-chain, did not affect the activity of enzymes. In addition, CPs with the same cyclic part but differing in the side-chain structure (e.g., CP 1048 and CP 1020b) were shown to have different effects on the tested enzyme (IC₅₀ = 7.25 and 0.39 µM, respectively).

MTT Assay
The cytotoxic activity of two CPs produced by CCNP1411, CP 962 with Arg² and CP 985 with Tyr², was previously tested against a breast cancer cell line and no effects were observed, even at 500 µg × mL⁻¹ [8]. In the current study, the activity of 17 isolated CPs against a human cervical cancer (HeLa) cell line was assayed. Only for one of the free Arg²-containing CPs, CP 978, was the concentration-dependent reduction in cell viability significant (Figure 10). At the highest concentration (200 µg × mL⁻¹), the cell viability was 62.5% (SD = 5.35) lower than in the control. Significant effects were also observed for Leu²-containing CPs, especially CP 949 and CP 919, which at 200 µg × mL⁻¹ reduced cell viability by 71.5% (SD = 4.92) and 97.6% (SD = 0.12), respectively. Other CPs had no effect on cancer cell proliferation. The cytotoxic effects of CP-like peptides have rarely been reported. The few examples include symplocamide A, which affected H-460 lung cancer cells and neuro-2a neuroblastoma cells [31]; tasipeptins A and B, which were cytotoxic to KB human epithelial carcinoma cells [30]; molassamide, which inhibited the elastase-mediated migration of breast cancer cells [52]; and kyanamide, which was moderately cytotoxic to HeLa S3 cells [4]. The majority of the cytotoxic CP-like peptides are Leu²- or Abu²-bearing analogues and elastase inhibitors, suggesting that these amino acids are critical for activity against both targets.

LC-MS/MS Analysis
An LC-MS/MS system composed of an Agilent 1200 HPLC (Agilent Technologies, Waldbronn, Germany) and a QTRAP5500 tandem mass spectrometer (Sciex, Toronto, Canada) was used. Compounds were separated in a Jupiter Proteo C12 column (150 × 4.6 mm, 4 µm, 90 Å) (Phenomenex, Aschaffenburg, Germany), using a water:acetonitrile mixture (both solvents with 0.1% formic acid). The turbo ion spray operated in positive ionisation mode, at 550 °C; voltage, 5.5 kV; nebuliser gas pressure, 60 psi; curtain gas pressure, 20 psi. To determine the content of the samples, an IDA (information-dependent acquisition) mode was used, and ions within the m/z range 500-1250 and with intensity greater than 5 × 10⁵ cps were fragmented. The collision energy was 60 eV, and the dwell time was 100 ms.

LC-HRMS Analysis
The analysis of CPs present in the cell extract was performed using an Elute HPG1300 HPLC system (Bruker Daltonics, Bremen, Germany) coupled with an Impact II high-resolution time-of-flight tandem mass spectrometer (QToF-HRMS) (Bruker Daltonics, Bremen, Germany). Chromatographic separation was performed in an Atlantis T3 C18 column (100 Å, 3 µm, 2.1 mm × 100 mm, Waters) with a VanGuard cartridge precolumn (Waters). The mobile phases were water (A) and acetonitrile (B), both acidified with 0.1% formic acid. A gradient elution program from 25 to 100% B was used with a constant flow of 0.2 mL × min⁻¹. The ESI conditions were: positive ionization mode, capillary voltage 3100 V, nebulizer gas 1.0 bar, dry gas 6.0 L × min⁻¹, dry gas temperature 220 °C, hexapole 100 Vpp, and pre-pulse storage 5 µs. Stepping mode was activated as follows: collision RF from 200 Vpp to 700 Vpp (50-50% of the timing), transfer time from 20 µs to 80 µs (50-50% of the timing), and collision energy from 8.4 eV to 10.5 eV (25-75% of the timing). Full scan accurate mass spectra were obtained in the range 50-1300 m/z in Auto MS (Data Dependent Analysis, DDA) with dynamic exclusion. Calibration was carried out in every sample run using the sodium formate cluster ions (10 mM). Bruker's HyStar and Data Analysis software was used for data acquisition, calibration, and raw data conversion to the .mzXML format before further processing.

Molecular Networking
A molecular network was created with the Feature-Based Molecular Networking (FBMN) workflow [92] on GNPS (https://gnps.ucsd.edu, accessed on 10 August 2023) [93]. The mass spectrometry data were first processed with MZmine3 [104] and the results were exported to GNPS for FBMN analysis. Data were filtered by removing all MS/MS fragment ions within ±17 Da of the precursor m/z. MS/MS spectra were window filtered by choosing only the top 6 fragment ions in the ±50 Da window throughout the spectrum. The precursor ion mass tolerance was set to 0.05 Da and the MS/MS fragment ion tolerance to 0.05 Da. A molecular network was then created where the edges were filtered to have a cosine score greater than 0.7 and more than 6 matched peaks. Further, edges between two nodes were kept in the network if, and only if, each of the nodes appeared in each other's respective top 10 most similar nodes. Finally, the maximum size of a molecular family was set to 100, and the lowest scoring edges were removed from the molecular families until the molecular family size was below this threshold. The spectra in the network were then searched against the GNPS spectral libraries [93]. The library spectra were filtered in the same manner as the input data. All matches kept between network spectra and library spectra were required to have a score greater than 0.7 and at least 6 matched peaks. The DEREPLICATOR was used to annotate MS/MS spectra [105]. The molecular networks were visualized using Cytoscape software [106].
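The filtering and edge-selection rules described above can be sketched in a few functions. This is a simplified illustration and not the GNPS implementation: it uses a plain greedy cosine score rather than the modified cosine used by the platform, it applies one possible reading of the window filter, and all function names are hypothetical. Only the numeric thresholds (±17 Da, top 6 in ±50 Da, 0.05 Da tolerance, score > 0.7, more than 6 matched peaks, top 10) are taken from the text.

```python
from math import sqrt


def remove_near_precursor(peaks, precursor_mz, window=17.0):
    """Drop fragment ions within ±window Da of the precursor m/z."""
    return [(mz, i) for mz, i in peaks if abs(mz - precursor_mz) > window]


def window_filter(peaks, half_width=50.0, top_n=6):
    """Keep a peak only if it ranks among the top_n intensities within ±half_width Da."""
    kept = []
    for mz, intensity in peaks:
        stronger = sum(1 for m, i in peaks
                       if abs(m - mz) <= half_width and i > intensity)
        if stronger < top_n:
            kept.append((mz, intensity))
    return kept


def cosine_score(spec_a, spec_b, tol=0.05):
    """Greedy cosine similarity; returns (score, number of matched peaks)."""
    norm_a = sqrt(sum(i * i for _, i in spec_a))
    norm_b = sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # All peak pairs whose m/z values agree within the fragment tolerance
    pairs = sorted(((ia * ib, x, y)
                    for x, (ma, ia) in enumerate(spec_a)
                    for y, (mb, ib) in enumerate(spec_b)
                    if abs(ma - mb) <= tol), reverse=True)
    used_a, used_b, dot, matched = set(), set(), 0.0, 0
    for product, x, y in pairs:
        if x in used_a or y in used_b:
            continue  # each peak may be matched only once
        used_a.add(x)
        used_b.add(y)
        dot += product
        matched += 1
    return dot / (norm_a * norm_b), matched


def keep_edge(score, matched, min_score=0.7, min_matched=6):
    """Edge criterion: cosine score > 0.7 and more than 6 matched peaks."""
    return score > min_score and matched > min_matched


def mutual_top_k(edges, k=10):
    """Keep an edge only if each node is among the other's k best-scoring neighbours."""
    neighbours = {}
    for (u, v), score in edges.items():
        neighbours.setdefault(u, []).append((score, v))
        neighbours.setdefault(v, []).append((score, u))
    top = {node: {name for _, name in sorted(items, reverse=True)[:k]}
           for node, items in neighbours.items()}
    return {(u, v): s for (u, v), s in edges.items()
            if v in top[u] and u in top[v]}
```

A typical use would filter each spectrum with `remove_near_precursor` and `window_filter`, score every pair with `cosine_score`, keep the pairs that pass `keep_edge`, and finally prune the result with `mutual_top_k`. The molecular family size cap of 100 is not covered by this sketch.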

Conclusions
Analysis of concentrated samples obtained from higher biomass of Nostoc edaphicum CCNP1411 resulted in the identification of 93 cyanopeptolins, including 79 new variants. To the best of our knowledge, this is the highest number of cyanopeptides ever recorded in one strain. The tests performed with 34 isolated CPs of diverse structure confirmed the role of the residue located between Thr¹ and Ahp³ in the activity of the compounds. Arg²-containing CPs were most active against trypsin, CPs with a hydrophobic amino acid in position 2 inhibited chymotrypsin, while only CPs with Leu² inhibited elastase and showed the most potent cytotoxic effect on human cervical cancer (HeLa) cells. The enzymatic assays also indicated the significance of the CP side-chain for the interactions with serine proteases. With their cytotoxic activity against cancer cells and their activity against enzymes implicated in a number of human diseases, CPs can be classified as lead compounds for further studies on their pharmaceutical potential.

Figure 8. General structure of cyanopeptolins produced by Nostoc edaphicum. The number of variants with specific amino acids is given in brackets. SC indicates side-chain.

Figure 10. Effect of cyanopeptolins (tested at concentrations of 25, 50, 100, and 200 µg × mL⁻¹) on the proliferation of human cervical cancer (HeLa) cells. CPs that at the highest concentration reduced cell viability by more than 60% were marked with asterisks as significant.

Leu⁴ + X⁵ + H − H₂O] belong to the most important diagnostic ions. Other ions that supported the process of structure elucidation are listed in Table