Structural Characterization of New Peptide Variants Produced by Cyanobacteria from the Brazilian Atlantic Coastal Forest Using Liquid Chromatography Coupled to Quadrupole Time-of-Flight Tandem Mass Spectrometry

Cyanobacteria from underexplored and extreme habitats are attracting increasing attention in the search for new bioactive substances. However, cyanobacterial communities from tropical and subtropical regions are still largely unknown, especially with respect to metabolite production. Among the structurally diverse secondary metabolites produced by these organisms, peptides are by far the most frequently described structures. In this work, liquid chromatography/electrospray ionization coupled to high resolution quadrupole time-of-flight tandem mass spectrometry with positive ion detection was applied to study the peptide profile of a group of cyanobacteria isolated from the Southeastern Brazilian coastal forest. A total of 38 peptides belonging to three different families (anabaenopeptins, aeruginosins, and cyanopeptolins) were detected in the extracts. Of the 38 peptides, 37 were detected here for the first time. New structural features were proposed based on mass accuracy data and isotopic patterns derived from full scan and MS/MS spectra. Interestingly, of the 40 surveyed strains only nine were confirmed to be peptide producers; all of these strains belonged to the order Nostocales (three Nostoc sp., two Desmonostoc sp. and four Brasilonema sp.).

The Atlantic Forest is a continental biome that extends primarily along the Atlantic coast of Brazil and also includes small portions of Argentina and Paraguay [10,11]. Considered one of the richest regions in the world in terms of biodiversity and endemism, this forest is also one of the most threatened regions [12]. With respect to the microorganisms that inhabit this region, little is known [13]. Particularly, referring to cyanobacteria, the available information is typically focused on diversity and descriptions of species [14][15][16][17] and scarce information about secondary metabolite production is available [18]. Cyanobacteria inhabiting the phyllosphere (leaf surface) are exposed to hostile conditions such as low availability of nutrients, large temperature range, osmotic stress, and high incidence of ultraviolet rays [19], representing a little explored community of extremophilic organisms with potential for production of novel bioactive compounds.
Owing to advances in instrumentation and technology, mass spectrometry (MS) has rapidly become a fundamental tool for characterizing large biomolecules such as proteins and peptides [20,21]. The development of electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI), in conjunction with the introduction of multi-stage and hybrid analyzers, was decisive for the success of MS analysis of biomolecules [20]. Time-of-flight (TOF) is one type of mass analyzer that finds wide application in this field and can provide high-resolution spectra for proteins and peptides. When combined with other types of mass analyzer, such as the quadrupole in the hybrid quadrupole time-of-flight (QTOF) instrument, it can also provide valuable peptide or amino acid sequence information. Liquid chromatography (LC) is a typical inlet method for MS analysis and provides an extra dimension of separation. LC can be coupled with ESI (online/offline) or MALDI (offline). MALDI-TOF-MS has been employed in the identification of many known and new cyanopeptides directly from cyanobacteria [22][23][24][25][26]. Applications of LC coupled to QTOF-MS have also been reported [27][28][29].
In this study, the peptide profiles of 40 strains of cyanobacteria isolated from the phyllosphere of four native species of plants in the coastal forest of Southeastern Brazil [30] (Table 1) were investigated using LC-ESI-Q-TOF-MS with positive ion detection. The analysis revealed a high production potential in the heterocytous strains. Accurate masses and isotope patterns for both precursor and product ions were used to propose the planar structures of the observed peptides.
Table 1. Cyanobacteria included in the present study.

Results and Discussion
The peptide profiles of the 40 strains isolated from the Atlantic Forest (CENA350-389) were investigated using LC/DAD/ESI/QTOF/MS/MS. As an example, Figure 1 shows the chromatographic profiles of two strains: one Brasilonema sp. and one Desmonostoc sp. The retention times (RT), protonated molecules ([M + H]+), molecular formulae provided for the experimental m/z, and error and millisigma (mSigma) values for the major peaks are summarized in Table 2. Mass accuracy was below 5 ppm for all the detected compounds. Peak identification was based on the data presented in Tables 2-14 and on previously published data [23,24,26,31-40]. This approach allowed the elucidation of the planar structures of 38 peptides, including 10 new aeruginosins, 16 new anabaenopeptins, and 11 new cyanopeptolins. Among the surveyed strains, the heterocytous strains Nostoc sp. CENA352, CENA358, and CENA369, Brasilonema sp. CENA360, CENA361, CENA381, and CENA382, and Desmonostoc sp. CENA371 and CENA386 were identified as producers of cyanopeptides. Cyanopeptides were not detected in the extracts of the remaining 31 cyanobacterial strains under our experimental conditions.
The aeruginosins found in these extracts were characterized by closely related structures, most of which were common to both Nostoc and Brasilonema producer species. The most prominent peak detected in the MS chromatogram of these extracts (7) was assigned to aeruginosin 865 (m/z 865.4565 [M + H]+) [37]. This compound, which was recently isolated from a terrestrial cyanobacterium belonging to Nostoc sp., was structurally characterized as containing both a fatty acid and a carbohydrate attached to the Choi moiety [37]. Figure 2 shows the product ion spectrum of this aeruginosin. A collision energy of 70 eV was necessary to obtain a spectrum with abundant and intense product ions. Consistent with the existence of an agmatine (Agma) residue in the molecule, the C-terminal ions (m/z 588.3249, 412.2912, and 297.1989) and the corresponding satellite ions, which were produced via ammonia or water loss (m/z 571.2985, 395.2657, 279.1820, and 261.1717), dominated the spectrum. Additionally, the ions generated by the cleavage of the glycosidic bond and/or the ester bond established the sugar and the fatty acid as glucuronic acid and hexanoic acid, respectively. The presence of ions 16 u higher (m/z 156.1035 and 138.0866) than the diagnostic ions that are typically generated from the Choi residue (m/z 140 and 122) indicated the dihydroxylation of the indole ring in this amino acid; these ions were key fragments for the detection of other aeruginosin congeners.
[...] (11) were identified as new congeners. The product ion spectra of these aeruginosins were similar to that of compound 7, allowing us to elucidate their structures based on a comparison of spectra (Tables 3-5). In this sense, a structure similar to that of aeruginosin 865 was proposed for compounds 5, 9, and 11, except for the fatty acid esterifying position 5 of the Choi moiety. In this position, butanoic, heptanoic, and octanoic acid were proposed for each compound, respectively. These changes were clearly evidenced by the sequence of ions containing the aforementioned fatty acids (m/z 384.2617, 543.2599, 560.2818, and 661.3909 for compound 5; m/z 409.2803 and 703.4423 for compound 9; and m/z 423.2969, 599.3216, 616.3618, and 717.4543 for compound 11). On the other hand, structural differences between the pairs of compounds 4, 5 and 6, 7 were attributed to Choi-glycosylation. Similar product ion spectra, which differed only in the ions generated by the cleavage of the glycosidic bond (m/z 546.3176, 574.3454, and 735.3952/735.3945), suggested that the glucuronic acid in compounds 5 and 7 was replaced by a hexose in compounds 4 and 6.
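The fatty acid assignments above rest on fragment ions that differ by whole methylene units. The following is a minimal sketch of that check in Python. It assumes that the ions paired below are corresponding fragments of compounds 5, 9, and 11 (the text lists them together but does not state this explicitly); the function name and the 0.01 u tolerance are illustrative choices, not part of the reported workflow.

```python
# Monoisotopic mass of one methylene unit (CH2), from standard atomic masses.
CH2 = 12.000000 + 2 * 1.00782503


def methylene_units(mz_low, mz_high, tol=0.01):
    """Return the number of CH2 units separating two fragment ions.

    Returns None when the difference is not a whole multiple of CH2
    within `tol` (in u). The tolerance is an illustrative assumption.
    """
    delta = mz_high - mz_low
    n = round(delta / CH2)
    if n < 1 or abs(delta - n * CH2) > tol:
        return None
    return n


# Fatty-acid-containing fragment ions quoted in the text, keyed by compound.
# Pairing them as homologous fragments is an assumption of this sketch.
largest_fragment = {5: 661.3909, 9: 703.4423, 11: 717.4543}
smaller_fragment = {9: 409.2803, 11: 423.2969}

# Butanoic (C4) to heptanoic (C7) acid: three CH2 units expected.
print(methylene_units(largest_fragment[5], largest_fragment[9]))
# Heptanoic (C7) to octanoic (C8) acid: one CH2 unit expected.
print(methylene_units(largest_fragment[9], largest_fragment[11]))
print(methylene_units(smaller_fragment[9], smaller_fragment[11]))
```

Under these assumptions the spacings agree with the proposed butanoic, heptanoic, and octanoic acid substituents.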
Compounds 1 and 3, which also showed structures similar to compounds 4 and 6 and 5, 7, 9, and 11, respectively, were distinguished by the lack of fatty acids in their structures (Table 4). These peptides could be biosynthetic intermediates of the respective fatty acid-containing aeruginosins. Along with these compounds, an oxygen-deficient variant of compound 1 (2) was also detected; this structural difference was likely due to the absence of the Choi 5-hydroxylation.
Finally, two other structural variants of compound 7 were also detected (8 and 10). For compound 8, methylation of the amino acid in the second position was suggested based on the mostly conserved product ion spectrum and the presence of a fragment ion at m/z 100.1148 (NMeLeu immonium). Similarly, for compound 10, a phenyl lactic acid was proposed for the N-terminus instead of a hydroxyl-phenyl lactic acid (Table 5).
From a biomedical point of view, the pharmacological potential of these new aeruginosin variants must be evaluated. As mentioned above, aeruginosins typically exhibit antithrombotic activity, making these compounds interesting candidates for the development of anticoagulant drugs [46]. Additionally, all of these compounds are structurally similar to aeruginosin-865, which has exhibited remarkable anti-inflammatory activity [37]. Evaluations of the bioactivity of these compounds could provide insights into the structure-activity relationship of this class of aeruginosins.
Anabaenopeptins are hexapeptides that contain a ring of five amino acids. Position 2 is always occupied by D-Lys, which both closes the ring with the amino acid at position 6 and establishes a ureido link with the amino acid in position 1, giving rise to a side chain. Positions 4 and 5 are typically occupied by aromatic and methylated amino acids, respectively. Position 3 has been reported to be occupied primarily by valine or isoleucine/leucine and less frequently by methionine [3]. Various biological activities have been described for these structures, including inhibition of protein phosphatase [49], carboxypeptidases A [50,51] and U [52], and other protease inhibitory activity [53]. To date, 30 anabaenopeptins have been isolated from many different cyanobacteria genera [8,54] (Anabaena [55,56], Aphanizomenon [51], Lyngbya [57], Microcystis [58], Oscillatoria [53,59], Planktothrix [60], and Schizothrix [61]) and also from marine sponges [62,63].
The detected anabaenopeptins could be grouped according to their structural features. Among these compounds, 10 anabaenopeptins that shared similar structural characteristics were detected in Brasilonema sp. CENA360 and Desmonostoc sp. CENA386 (12-17, 22, 25-27). The representative fragmentation pathways and spectra of these compounds are described in Figures 3 and 4, and the assignments of the principal ions are shown in Tables 6-9. Because the exocyclic amino acid varies among them, two different fragmentation patterns were observed for these compounds depending on the nature of the amino acid side chain. Extensive fragmentation was observed when polar and basic residues were absent from this chain. These anabaenopeptins commonly incorporate the amino acid N-methyl asparagine (N-MeAsn) in position 5, and on several occasions, either methylation or ethylation was postulated for the homovariant amino acid in position 4.
[...] Consequently, just two low-intensity series of fragment ions that were assigned to two preferential primary cleavages were observed in the spectrum (Figure 4). However, enough information was generated to allow a tentative interpretation of the spectrum. The preferential fragmentation observed was attributed to the cleavage of the ureido linkage and the opening of the ring and indicated the presence of arginine in the side chain (m/z 678.3934). [...] that were attributed to the N-methylhomophenylalanine (N-MeHph) and N-ethylhomophenylalanine (N-EtHph) immonium ions, respectively, also supported these assignments.
Using similar reasoning, the remaining anabaenopeptins were subsequently characterized. Losses of 157 u that were observed for compounds 22 and 26, losses of 191 u that were observed for compounds 25 and 27, and losses of 200 u that were observed for compounds 13, 15, and 17 were used to identify the amino acid side chain as leucine, phenylalanine, or arginine, respectively. A fragmentation pattern similar to that described in the preceding paragraphs characterized the cyclic structures. As mentioned above, compounds with [M + H] + at m/z 878.5122, 892.5023, and 906.5196 exhibited low efficiency fragmentations, which are characteristic of oligopeptides containing strongly basic residues. When the collision energy used for fragmentation was increased to improve efficiency, the abundance of the fragment ions was compromised (Figure 4).
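The nominal losses of 157, 191, and 200 u can be related to the side-chain amino acid with a short calculation. The sketch below assumes that cleavage of the ureido linkage removes the exocyclic amino acid together with the ureido carbonyl, i.e., a neutral of composition (amino acid + CO − 2 H). That composition is an assumption made here for illustration, not a mechanism stated in the text; the function names are likewise hypothetical.

```python
# Standard monoisotopic atomic masses.
MONO = {"C": 12.000000, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental compositions of the free amino acids considered in the text.
AMINO_ACIDS = {
    "Leu": {"C": 6, "H": 13, "N": 1, "O": 2},
    "Phe": {"C": 9, "H": 11, "N": 1, "O": 2},
    "Arg": {"C": 6, "H": 14, "N": 4, "O": 2},
}


def mono_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ureido_loss(name):
    """Mass of the assumed neutral loss: amino acid + CO - 2 H."""
    formula = dict(AMINO_ACIDS[name])
    formula["C"] += 1
    formula["O"] += 1
    formula["H"] -= 2
    return mono_mass(formula)


def assign_side_chain(nominal_loss):
    """Return the amino acid whose assumed loss rounds to `nominal_loss`, else None."""
    for name in AMINO_ACIDS:
        if round(ureido_loss(name)) == nominal_loss:
            return name
    return None


for loss in (157, 191, 200):
    print(loss, assign_side_chain(loss))
```

With this assumed composition, the three nominal losses map onto leucine, phenylalanine, and arginine, in line with the assignments in the text. Leucine and isoleucine are isobaric and cannot be distinguished this way.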
Four additional anabaenopeptins (18, 19, 23, 24) were found in Nostoc sp. CENA352. Accurate mass measurements, isotopic profiles, and MS/MS spectra were sufficient to distinguish these compounds from previously described anabaenopeptins and to characterize them as new variants [36,53]. Structurally, all of these anabaenopeptins incorporated phenylalanine in the side chain position, as evidenced by the loss of a 191-u fragment from the protonated molecules (Table 10). Positions 5 and 6 of these four peptides were also conserved and were occupied by N-methyl-alanine and homotyrosine, respectively. For the homovariant amino acids in position 4, either homophenylalanine (18, 19) [...] (Table 10).
In addition, a pair of unusual tryptophan-containing anabaenopeptins (20, 21) was also detected at m/z 803.4417 [M + H]+ and 803.4425 [M + H]+, exclusively in Brasilonema sp. (CENA360 and CENA382). As this pair of compounds exhibited similar protonated molecules and product ion spectra but differed in retention times, these compounds were classified as diastereoisomers. The structures of these compounds were postulated in accordance with the fragment ions listed in Table 11. According to our literature search, prior to our work, the occurrence of tryptophan-containing anabaenopeptins in cyanobacteria was limited to the genus Tychonema [64]. The structurally related compounds isolated from this genus, the brunsvicamides A-C, inhibit tyrosine phosphatase B of Mycobacterium tuberculosis (MptpB) [64] and are highly selective inhibitors of human leukocyte elastase (HLE) [65]. However, those peptides and the compounds reported here differ in their amino acid sequences. Unlike the brunsvicamides, the postulated anabaenopeptins contain tryptophan in position 4 and N-methyl-alanine in the methylated amino acid position. The structures of the compounds described in this study are more closely related to a synthetic brunsvicamide analog described by Walther et al. [66], which was found to be an inhibitor of carboxypeptidase A. Thus, further biological tests of these compounds are warranted.
[...] (38), and 1032.5170 (34) were present in the extracts from Desmonostoc strains CENA371 and CENA386. Although no single diagnostic fragment ion can be used to identify this family of peptides, series of fragments related to the conserved position 3 (3-amino-6-hydroxy-2-piperidone amino acid, Ahp) could be used to identify these compounds [23]. Under our experimental conditions, doubly charged ions ([M − H2O + 2H]2+) with an abundance comparable to that of the protonated molecules were also observed in the mass spectra of all detected cyanopeptolins. MS and MS/MS analyses suggested closely related structures for all of these cyanopeptides (Tables 12-14). As the most common structural feature, these cyanopeptolins contained N-acetyl-proline-glutamine as the side chain, with the fifth position occupied by either a dimethylated tyrosine or a chlorinated, methylated tyrosine. Valine and leucine alternated in positions 4 and 6.
This family of cyclic peptides with high structural variability features a ring formed by six amino acids and a side chain of variable length and composition [3]. An ester bond between the hydroxyl group of the threonine in position 1 and the carboxyl group of the terminal amino acid closes the ring. The threonine in position 1 is occasionally replaced by 3-hydroxy-4-methylproline [67,68]. Ahp always occupies position 3, while a methylated aromatic amino acid and other neutral amino acids are found in positions 5 and 6, respectively. The highly variable side chain may contain an aliphatic fatty acid or a glyceric acid, which is attached either directly to the threonine in position 1 or through one or two amino acids [3]. This family of peptides is often described as protease inhibitors [69][70][71][72][73].
Cyanopeptolins have been isolated mostly from Microcystis [71,74,75] but also from other genera such as Lyngbya [76,77], Nostoc [67,78], Oscillatoria [79,80], Planktothrix [60,81], Scytonema [82], or Symploca [72]. Figure 5 shows the proposed depsipeptide cyclic structure for compound 37 (Ac-Pro-Gln[Thr-Leu-Ahp-Leu-NMe-OMe-Tyr-Leu]) and its predicted fragmentation pattern. The most abundant ions observed in the product ion spectrum of this compound were attributed to the loss of amino acid residues from the C-terminus (m/z 881.4576, 690.3777, 464.2468, 351.1633, and 268.1271) of the dehydrated protonated molecule at m/z 994.5553. This precursor ion was suggested to be generated by the cleavage of the ester linkage accompanied by the dehydration of the Thr [34]. Further loss of the acetyl-proline-glutamine side chain from this linear ion was also noted (m/z 727.4318). Comparison of these data with those obtained for compound 36 indicated a highly similar structure. Differences of 14 u in the protonated molecules and a fragment ion at m/z 713.3971, in combination with the existence of product ions at m/z 464.2549 and 400.2250, indicated that the leucine attributed to position 6 in compound 37 was replaced by valine. In addition, two other isobaric compounds were observed at earlier retention times (31 at 30.0 min, and 35 at 31.1 min). Structural differences from compound 36 were proposed based on mass data (Table 13). Similarly, four additional compounds (28, 30, 32, and 34) exhibited fragmentation patterns highly similar to that of compound 37. While most of the mass spectrum remained conserved, fragment ions containing the original dimethylated tyrosine amino acid exhibited m/z shifts when compared to those of the model compound. Additionally, the isotopic pattern of these fragment ions in conjunction with that of the protonated molecule revealed the presence of a chlorine atom in their structures.
For compound 34, these fragment ions shifted 20 u (m/z 901.4207, 806.3871, 747.3811, 420.1648, and 184.0523), leading us to propose the chlorination of a methylated tyrosine at position 5. For compound 32, shifts of 20 and 14 u were observed (m/z 733.3705, 406.2447, and 184.0486), suggesting that in addition to the modification mentioned above, a substitution of leucine with valine at position 4 also occurred. Based on the same logic, a structure similar to compound 34, just with the leucine residue at position 6 replaced by valine, was proposed for compound 30. Finally, for compound 28, differences of 30 u in product ions containing the side chain were observed in comparison to compound 34 (m/z 321.1521 and m/z 434.2416). Thus, the same cyclic peptide was proposed for compound 28 with the side chain tentatively attributed to methylated-dehydroproline-glutamine (Mdhp-Gln) or other analogs. The chlorinated and methylated tyrosine amino acid proposed for position 5 was quite unusual and has only been observed in a small number of cyanopeptolins [71,80].
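The nominal shifts of 14 and 20 u used in this reasoning can be reproduced from elemental compositions. The sketch below assumes that the leucine-to-valine substitution corresponds to the loss of one CH2, and that going from the dimethylated tyrosine to the chlorinated, methylated tyrosine replaces one methyl group by hydrogen and one hydrogen by chlorine. It also estimates the M+2/M ratio expected from chlorine alone, using commonly tabulated natural abundances. All names are illustrative, and contributions of other elements to M+2 are ignored.

```python
# Standard monoisotopic atomic masses.
H = 1.00782503
C = 12.000000
CL35 = 34.96885268

CH2 = C + 2 * H

# Mass shifts for the assumed substitutions (values in u).
SHIFTS = {
    # Leucine replaced by valine: one methylene unit fewer.
    "Leu -> Val": -CH2,
    # One methyl replaced by H (-CH2) and one H replaced by Cl (+Cl - H).
    "dimethyl-Tyr -> chloro-methyl-Tyr": (CL35 - H) - CH2,
}

# Commonly tabulated natural abundances of the chlorine isotopes.
CL_ABUNDANCE = {35: 0.7576, 37: 0.2424}


def nominal_shift(name):
    """Absolute shift rounded to the nearest integer mass unit."""
    return round(abs(SHIFTS[name]))


def m_plus_2_ratio(n_chlorine=1):
    """Approximate M+2/M intensity ratio from chlorine alone.

    For n chlorine atoms the probability of exactly one 37Cl relative to
    none is n * a37 / a35; other elements are neglected in this sketch.
    """
    return n_chlorine * CL_ABUNDANCE[37] / CL_ABUNDANCE[35]


for name in SHIFTS:
    print(name, round(SHIFTS[name], 4), nominal_shift(name))
print("M+2/M for one Cl:", round(m_plus_2_ratio(1), 3))
```

A single chlorine atom therefore gives an M+2 peak of roughly one third the intensity of the monoisotopic peak, which is the kind of isotopic signature the text relies on when proposing chlorination.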
The structural similarity of these compounds to other cyanopeptolins with observed protease inhibitory activity warrants further bioactivity assays. Trypsin inhibitory selectivity was suggested to be related to the existence of basic residues adjacent to Ahp, while chymotrypsin selectivity was proposed to be related to hydrophobic residues. Additionally, residues in other positions as side chains or in the fifth position appear to influence this activity [5,72,83]. These assays will be able to establish the influence of the particular properties of these compounds on the selectivity and potency of these and other activities.

Strains of Cyanobacteria and Cultivation Conditions
The 40 surveyed cyanobacterial strains belong to the culture collection of the Center for Nuclear Energy in Agriculture Collection/University of São Paulo (CENA/USP), Brazil. All strains were isolated from the leaves of four plant species (Euterpe edulis, Guapira opposita, Garcinia gardneriana, and Merostachys neesii) that were collected in two regions of the Parque Estadual Serra do Mar (Southeastern Brazil). The isolates (three Chroococcales, 13 Pseudanabaenales, and 24 Nostocales) were previously identified using both morphological analysis and phylogenies based on the 16S rRNA gene [30] (Table 1). The cultures of cyanobacteria were maintained in liquid BG11 medium under white fluorescent light (30 μmol photons·m−2·s−1) with a 14:10 h light/dark cycle at 25 ± 1 °C under constant agitation (150 rpm) for 21 days. The cells were then concentrated by centrifugation (7000× g, 5 min), washed three times in saline solution (NaCl 0.8%), re-inoculated into 500-mL flasks containing 200 mL of medium, and cultured for a further 21 days.

LC/MS Analyses
Analyses were carried out on a Shimadzu Prominence Liquid Chromatography system coupled to a quadrupole time-of-flight mass spectrometer (Micro TOF-QII; Bruker Daltonics, MA, USA) with an ESI interface. Separations were achieved using a Luna C18 (2) column (250 mm × 3.00 mm inner diameter, 5 μm) (Phenomenex, Torrance, CA, USA) protected with a guard column of the same material. Samples (5 μL) were eluted using mobile phase A (water, 0.1% formic acid, and 5 mM ammonium formate) and mobile phase B (acetonitrile). The gradient increased linearly from 5% to 90% B over 50 min at a flow rate of 0.2 mL/min. The ionization source conditions were as follows: positive ionization, capillary potential of 3500 V, drying gas (nitrogen) temperature and flow of 300 °C and 5 mL/min, respectively, and nebulizer pressure of 35 psi. Mass spectra were acquired using electrospray ionization in the positive mode over the range of m/z 50 to 3000. The Q/TOF instrument was operated in scan and AutoMS/MS mode, performing MS/MS experiments on the three most intense ions from each MS survey scan. Three collision-induced dissociation (CID) experiments were performed by varying the collision energy from 30 to 70 eV to produce many fragment ions of high abundance. The mass spectrometer was calibrated externally with a 10 mM sodium formate cluster solution consisting of 10 mM sodium hydroxide and 0.1% formic acid in water-isopropanol 1:1 (v:v). The accurate mass data were processed using Data Analysis 4.0 software (Bruker Daltonics, Bremen, Germany), which provided a ranking of possible elemental formulae (EF) by using the SmartFormulaEditor™. For each EF, the error (deviation between the measured and theoretical mass of a given sum formula) and the sigma value (comparison of the theoretical and the measured isotope pattern of a given formula) are calculated [65]. The confirmation of the elemental formula was based on the widely accepted thresholds of 5 ppm and 20 mSigma. Every experiment was run in triplicate.
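The two acceptance criteria can be expressed compactly. The following Python sketch computes the relative mass error in ppm and applies the 5 ppm and 20 mSigma thresholds quoted above. It does not reproduce the vendor software: the mSigma value is taken as an input, because its calculation is internal to that software, and the function names and the example numbers are hypothetical.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative deviation of the measured from the theoretical m/z, in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def accept_formula(measured_mz, theoretical_mz, msigma,
                   max_ppm=5.0, max_msigma=20.0):
    """Accept a candidate elemental formula only if both criteria are met.

    `msigma` is the isotope-pattern fit value reported by the processing
    software; it is passed in as-is rather than computed here.
    """
    within_mass = abs(ppm_error(measured_mz, theoretical_mz)) <= max_ppm
    within_pattern = msigma <= max_msigma
    return within_mass and within_pattern


# Hypothetical candidate: illustrative numbers, not data from this study.
measured, theoretical, msigma = 1000.004, 1000.000, 12.5
print(round(ppm_error(measured, theoretical), 2))
print(accept_formula(measured, theoretical, msigma))
```

Requiring both criteria at once is what allows isobaric candidate formulae with similar mass errors to be separated by their isotope patterns.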

Chemicals
HPLC grade methanol and acetonitrile were obtained from J.T. Baker (USA). Ammonium formate and formic acid, both for mass analyses, were obtained from Fluka (Germany).

Conclusions
In the present study, liquid chromatography coupled to a quadrupole time-of-flight mass spectrometer equipped with an ESI interface was successfully applied to study in depth the cyanopeptide composition of 40 cyanobacterial strains from the Brazilian Atlantic Forest. This approach allowed us to tentatively identify 38 peptides, of which 37 had not been previously described in the literature, including aeruginosins, anabaenopeptins, and cyanopeptolins. Based on the mass accuracy data in scan and product ion spectra in combination with the isotopic pattern of the protonated molecules and product ions, a planar structure was postulated for each of the detected peptides. In addition to the recently reported aeruginosin 865, 10 novel structural variants were described here. Either hexose or glucuronic acid and butanoic, hexanoic, heptanoic, or octanoic acid O-linked to a Choi motif were observed. With respect to anabaenopeptins, this study led to the characterization of 16 new variants. Among those, two tryptophan-containing anabaenopeptins and 10 additional anabaenopeptins that incorporated the amino acid N-methyl asparagine were identified. Furthermore, on several occasions, ethylation was postulated for the homovariant amino acid in position 4. With respect to cyanopeptolins, 11 new variants were characterized. Most of these compounds featured an N-acetyl-proline-glutamine side chain. Additionally, four of these compounds contained the unusual chlorinated N-methylated tyrosine. These results highlight the potential of LC-ESI-QTOF-MS for peptide characterization in complex mixtures from small quantities of material. However, the combination of MS with other techniques such as X-ray crystallography or nuclear magnetic resonance (NMR) applied to the pure compounds is necessary for a complete spectroscopic characterization of the proposed peptide structures, including determination of the stereochemistry.
Among the surveyed strains of cyanobacteria, only nine strains were observed to produce cyanopeptides (three Nostoc sp., two Desmonostoc sp., and four Brasilonema sp.). However, a highly diverse array of new peptide variants was revealed in the producer strains, which emphasizes the potential of underexplored environments as a source of bioactive compounds.