Lunaemycins, New Cyclic Hexapeptide Antibiotics from the Cave Moonmilk-Dweller Streptomyces lunaelactis MM109T

Streptomyces lunaelactis strains have been isolated from moonmilk deposits, which are calcium carbonate speleothems used for centuries in traditional medicine for their antimicrobial properties. Genome mining revealed that these strains are a remarkable example of a Streptomyces species with huge heterogeneity in their content of biosynthetic gene clusters (BGCs) for specialized metabolite production. BGC 28a is one of the cryptic BGCs carried by only a subgroup of S. lunaelactis strains, for which in silico analysis predicted the production of nonribosomal peptide antibiotics containing the non-proteinogenic amino acid piperazic acid (Piz). Comparative metabolomics of culture extracts of S. lunaelactis strains with or without BGC 28a, combined with MS/MS-guided peptidogenomics and 1H/13C NMR, allowed us to identify the cyclic hexapeptide with the amino acid sequence (D-Phe)-(L-HO-Ile)-(D-Piz)-(L-Piz)-(D-Piz)-(L-Piz), called lunaemycin A, as the main compound synthesized by BGC 28a. Molecular networking further identified 18 additional lunaemycins, 14 of which had their structure elucidated by HRMS/MS. Antimicrobial assays demonstrated significant bactericidal activity of lunaemycins against Gram-positive bacteria, including multidrug-resistant clinical isolates. Our work demonstrates how an accurate in silico analysis of a cryptic BGC can greatly facilitate the identification, structural elucidation, and bioactivity assessment of its associated specialized metabolites.


Introduction
Streptomyces and other actinomycetes are Gram-positive filamentous bacteria that have provided most of the molecules of microbial origin now used in human and animal therapy [1,2]. The urgent need for new structural leads in both the agro-food and pharmaceutical fields has revitalized interest in the bioprospection of microorganisms dwelling in the most diverse and extreme environmental niches [3]. This quest for metabolite-producing bacteria in environments far from the ecological niches where they were mainly or originally isolated (such as rich organic soils for actinomycetes) is motivated by the idea that they would possess genetic features adapted to these specific and unusual environments, allowing them to produce uncommon 'specialized' metabolites [4].
Caves, being inorganic and extremely oligotrophic environments, were initially considered inhospitable places for streptomycetes and other microorganisms programmed to primarily consume nutrients derived from decomposing plant organic matter.

Closer sequence analysis of the core and additional biosynthetic genes of BGC 28a allowed us to propose a biosynthetic pathway and predict the amino-acid building blocks and backbone of its main product (Figure 1). First, as observed in the BGCs of kutznerides and himastatins, BGC 28a contains the two genes required for the biosynthesis of the nonproteinogenic amino acid piperazic acid (Piz) from L-ornithine (Orn). The first reaction is performed by the product of lun16, encoding the flavin- and oxygen-dependent L-ornithine N-hydroxylase (SLUN_RS38485, homologous to HmtM and KtzI; Figure 1, Table 1), to give L-N5-OH-ornithine [35], which is then converted to L-Piz by the product of lun17 (SLUN_RS38490, homologous to HmtC and KtzT; Figure 1, Table 1), encoding the heme-dependent piperazate synthase [36]. Interestingly, genes encoding piperazate synthase are present in most BGCs for Piz-containing molecules [37,38], suggesting that at least one NRPS module of BGC 28a is expected to recruit Piz as a building block. Next, as observed in the stenothricin gene cluster [29], BGC 28a contains a transcription unit with four ORFs encoding the enzymes involved in the five steps of Orn biosynthesis from glutamic acid and glutamine precursors (Figure 1, Table 1). These genes/proteins are lun14 for the bifunctional N2-acetyl-L-ornithine:L-glutamate N-acetyltransferase (SLUN_RS38475, homologous to StenL; Figure 1, Table 1), lun13 for the N-acetylglutamate kinase (SLUN_RS38470, homologous to StenK and HmtB; Figure 1, Table 1), lun11 for the N-acetyl-gamma-glutamyl-phosphate reductase (SLUN_RS38460, homologous to StenM; Figure 1, Table 1), and lun12 for the N2-acetyl-L-ornithine:2-oxoglutarate aminotransferase (SLUN_RS38465, homologous to StenJ; Figure 1, Table 1). BGC 28a thus contains all genes required for the enzymatic conversion of glutamic acid and glutamine to L-Piz, further supporting the recruitment of this amino acid by the NRPS machinery.
Regarding the core biosynthetic genes, three ORFs encode NRPSs with complete biosynthetic modules. The sequential order of these modules in peptide synthesis (initiation, elongation, and termination modules) is predicted based on the composition of their catalytic domains (Figure 1). The biosynthesis would start with Lun23 (SLUN_RS38520, providing the initiation module 1 and the 'inactive' module 2), followed by Lun20 (SLUN_RS38505, elongation module 3), and finally Lun22 (SLUN_RS38515, elongation and termination modules 4, 5, 6, and 7) (Figure 1). Altogether, these NRPSs form a megasynthetase organized into seven modules, for which the detailed analysis of the specificity-conferring sequence of each adenylation (A) domain with SANDPUMA [39] allowed the prediction of the building blocks recruited by the NRPS machinery (Figure 1).
Module 1 in Lun23 contains three domains (A-T-E) and is predicted to incorporate L-Phe (DAWTVAAVCK as 10-aa code), which would be converted to D-Phe by the epimerization (E) domain. Module 2 in Lun23 also contains three domains (C-A-T), but prediction software tools did not propose specificity for any amino acid and suggest that this module would be inactive (DVASLAAYAK as 10-aa code). Module 3 in Lun20 contains three domains (C-A-T) and is predicted to recruit L-Ile (DAFFLGVTYK as 10-aa code). The four following modules (4 to 7), all included in Lun22, are predicted to incorporate four Piz residues: two D-Piz via modules 4 (C-A-T-E) and 6 (C-A-T-E), which both contain an epimerization domain, and two L-Piz via module 5 (C-A-T) and module 7 (C-A-T-TE). Indeed, the 10-aa code conferring amino acid substrate specificity for Lun22 (DVFSVAAYAK for modules 4, 5, 6, and 7) is identical to that of HmtL-A1 (DVFSVAAYAK) and highly similar to those of KtzH-A1 (DVFSVGPYAK, 8/10), ArtF-A1 (DVFSVASYAK, 9/10), and ArtG-A2 (DVFTVAAYAK, 9/10), which were reported to recognize and activate Piz in previous studies [25,28,30,37,38]. Finally, module 7 would cyclize and release the final hexapeptide via its thioesterase (TE) domain, as shown for the biosynthesis of himastatins and kutznerides, where the last TE domain of HmtL and KtzH, the homologues of Lun22, performs the macrocyclization of their respective final peptide [25,28].
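The identity scores quoted for the Piz-activating A domains (8/10 and 9/10) can be reproduced with a simple position-by-position comparison of the 10-aa codes. The sketch below is only an illustration of that counting step; it is not how SANDPUMA derives its predictions, and the function and variable names are hypothetical.

```python
# 10-aa specificity codes exactly as quoted in the text
LUN22_CODE = "DVFSVAAYAK"  # modules 4-7 of Lun22
REFERENCE_CODES = {
    "HmtL-A1": "DVFSVAAYAK",
    "KtzH-A1": "DVFSVGPYAK",
    "ArtF-A1": "DVFSVASYAK",
    "ArtG-A2": "DVFTVAAYAK",
}


def identity(code_a: str, code_b: str) -> int:
    """Count positions at which two equal-length codes carry the same residue."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a == b for a, b in zip(code_a, code_b))


def compare_to_references(query: str, references: dict) -> dict:
    """Return the number of identical positions between the query and each reference."""
    return {name: identity(query, code) for name, code in references.items()}


if __name__ == "__main__":
    for name, score in compare_to_references(LUN22_CODE, REFERENCE_CODES).items():
        print(f"{name}: {score}/{len(LUN22_CODE)}")
```

A plain identity count treats all ten positions equally, which is a simplification; it is used here only because the text reports the similarity in that form.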
In conclusion, the NRPS machinery of BGC 28a is predicted to generate a hexapeptide with the sequence D-Phe, L-Ile, D-Piz, L-Piz, D-Piz, L-Piz as the backbone of its main product that, with or without cyclization, would have monoisotopic masses of 708.4071 Da (C35H52N10O6) and 726.4177 Da (C35H54N10O7), respectively.
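The two predicted masses follow directly from the molecular formulas. As a minimal sketch, assuming standard monoisotopic element masses and a simple formula parser (both supplied here for illustration, not taken from the study), they can be recomputed as follows:

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard reference values
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
}


def parse_formula(formula: str) -> dict:
    """Turn a flat formula such as 'C35H52N10O6' into {'C': 35, 'H': 52, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula: str) -> float:
    """Sum the monoisotopic element masses weighted by their counts."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    for label, formula in [("cyclic", "C35H52N10O6"), ("linear", "C35H54N10O7")]:
        print(f"{label:6s} {formula}: {monoisotopic_mass(formula):.4f} Da")
```

The linear form differs from the cyclic one by one water molecule (H2O), which is the mass lost on macrocyclization.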

MS/MS-Based Networking and Peptidogenomics Guided Genome Mining
A peptidogenomic approach, in which genome mining guides the tandem mass spectrometry (MS/MS)-based molecular identification of high-resolution mass spectrometry (HRMS) data, was used to identify compounds produced by BGC 28a. The predicted building blocks can be used to screen the MS/MS fragments for neutral or charged losses of amino acids. Although BGC 28a is predicted to produce a cyclized hexapeptide with a monoisotopic mass of 708.4071 Da, we cannot exclude at this stage modifications of one or more building blocks before their utilization by the NRPS machinery and/or post-assembly modifications of the generated hexapeptide. Therefore, the MS/MS spectra of molecular ions ranging from m/z 700 to 800 were manually analyzed to identify tag fragments corresponding to the three different amino acids predicted to be incorporated by the six NRPS modules. Finally, prior to performing the comparative metabolomic analysis, genome mining of the eighteen isolated S. lunaelactis strains revealed a genetic characteristic that is crucial for the identification of the products of BGC 28a. Indeed, BGC 28a is present in only three S. lunaelactis strains, i.e., those that possess pSLUN1a, whereas all the other strains carry the linear plasmid pSLUN1b, which instead comprises BGC 28b [22] (Figure 2a). S. lunaelactis strains that possess pSLUN1b are therefore natural variants/mutants that do not produce the searched compound(s) of BGC 28a, and their culture extracts can be considered negative controls.

The crude extracts from eleven S. lunaelactis strains grown on the ISP7 solid medium were collected and subjected to UPLC-HRMS/MS analysis. As explained above, metabolite profiling was performed to search for molecules with m/z between 700 and 800 that were only produced by the two selected S. lunaelactis strains that possess pSLUN1a (MM109T and MM37), and not by the nine other selected strains that instead have pSLUN1b (MM22, MM25, MM28, MM31, MM40, MM78, MM83, MM113, and MM115) (Figure 2b). As shown in the heatmap of Figure 2b, a series of m/z signals displayed contrasting production patterns between the two groups of S. lunaelactis strains. An analysis of the extracted ion chromatograms (EIC) of the full extract of the 11 S. lunaelactis strains revealed a series of peaks only present in strains harboring pSLUN1a (Figure 2c). The most intense signal has an m/z of 725.4092, which is consistent with the molecular formula C35H53N10O7+ (0.17 ppm error) (Figure 2d). In the same HRMS spectrum, a second ion with lower signal intensity corresponds to the [M + Na]+ ion at m/z 747.3903, which corresponds to the molecular formula C35H52N10O7Na+. The masses and formulas of both proton and sodium adducts reveal that the main compound of BGC 28a has a molecular formula of C35H52N10O7 (monoisotopic mass of 724.4020 Da); it is referred to from now on as compound 1 (hereafter named lunaemycin A).
Figure 2. (a) Genome mining of S. lunaelactis strains where the 42 predicted BGCs are grouped in 'grapes' of nodes, each node representing a BGC in one strain (from [22]). Note the two grapes corresponding to BGC 28a and BGC 28b, found in strains containing the linear plasmid pSLUN1a or pSLUN1b, respectively. (b) Heatmap of the metabolomic analysis of 11 S. lunaelactis strains grown on ISP7 media. The zoom on the heatmap highlights the differential production patterns for metabolites with m/z between 700 and 800 that are only present in the extract of S. lunaelactis strains that possess BGC 28a (MM109T and MM37). (c) Extracted ion chromatograms (EIC, from m/z 700 to 800) of the full extract of the 11 strains, with a focus on the retention times of signals associated with compounds only detected in S. lunaelactis strains with BGC 28a (MM109T and MM37). (d) HRMS-predicted molecular formula of compound 1 (for both the proton and the sodium adducts) associated with the main signal detected in EICs of S. lunaelactis strains MM37 and MM109T.
Interestingly, the experimentally determined mass of compound 1 differs by only 16 Da from the mass of the in silico predicted cyclized structure of the main product of BGC 28a (Figure 1). From the molecular formula C35H53N10O7 inferred from m/z 725.41, and according to the amino acid building blocks (D-allo-Ile, L-Phe, D-Piz, L-Piz, D-Piz, L-Piz) predicted to be loaded by adenylation domains, compound 1 could correspond to either (i) an open peptide with an extra unsaturation from a dehydrogenation step or (ii) a cyclic peptide carrying a hydroxyl group added in an oxidation step.
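The adduct assignments and the 16 Da offset can be checked with a few lines of arithmetic. The sketch below assumes standard monoisotopic element masses and standard proton and sodium-ion masses; the helper names are hypothetical. It recomputes the theoretical [M + H]+ and [M + Na]+ values for C35H52N10O7, the ppm deviation of the observed ions, and the mass difference to the predicted cyclic product C35H52N10O6.

```python
# Monoisotopic element masses (u); standard reference values
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.9949146196}
PROTON_MASS = 1.00727646688    # H+ (hydrogen atom minus one electron)
SODIUM_ION_MASS = 22.98922070  # Na+ (sodium atom minus one electron)

COMPOUND_1 = {"C": 35, "H": 52, "N": 10, "O": 7}        # formula reported for compound 1
PREDICTED_CYCLIC = {"C": 35, "H": 52, "N": 10, "O": 6}  # in silico predicted cyclic product


def neutral_mass(composition: dict) -> float:
    """Monoisotopic mass of a neutral molecule from its elemental composition."""
    return sum(ELEMENT_MASS[el] * n for el, n in composition.items())


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    m = neutral_mass(COMPOUND_1)
    mh, mna = m + PROTON_MASS, m + SODIUM_ION_MASS
    print(f"M        = {m:.4f} Da")
    print(f"[M+H]+   = {mh:.4f}  (observed 725.4092, {ppm_error(725.4092, mh):+.2f} ppm)")
    print(f"[M+Na]+  = {mna:.4f}  (observed 747.3903, {ppm_error(747.3903, mna):+.2f} ppm)")
    print(f"offset to predicted cyclic product = {m - neutral_mass(PREDICTED_CYCLIC):.4f} Da")
```

The offset equals the mass of one oxygen atom, which is what option (ii), a hydroxylated cyclic peptide, would require; option (i) gives the same elemental composition by removing H2 from the open peptide, so the two cannot be distinguished by accurate mass alone.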
An analysis of the HRMS/MS spectrum of the m/z 725.41 molecular ion revealed sequence tags that allowed us to propose that compound 1 is the cyclized sequence of the hexapeptide Phe-OH-Ile-Piz-Piz-Piz-Piz (Figure 3). Indeed, the daughter ion fragments of m/z 596.33, 449.26, 337.20, 225.13, and 113.07 correspond to consecutive losses of (i) a hydroxy-isoleucine (H2O neutral loss-derived tag fragment of m/z 129.08), (ii) a phenylalanine (H2O neutral loss-derived tag fragment of m/z 147.07), and (iii) three Piz residues (three H2O neutral loss-derived tag fragments of m/z 112.06), leaving the final ion fragment tag of m/z 113.07 corresponding to Piz+H+ (Figure 3b). The fragmentation spectrum thus confirmed the presence of tags associated with all six amino acids predicted to be incorporated by the A domains of the active modules of BGC 28a. The only difference with the initial prediction is the hydroxylation of the Ile residue, which could be attributed to the product of lun21 (SLUN_RS38510, Table 1) encoding a FAD-dependent oxidoreductase, homologous to the oxidoreductase PlyE of Streptomyces sp. MK498-98 F14 involved in the production of polyoxypeptins [40]. Indeed, PlyE is predicted to be responsible for the N-hydroxylation of valine and alanine building blocks on the nitrogen atom between Piz and Val or Ala residues of polyoxypeptin A. A similar role is also proposed for the FAD-dependent monooxygenase CchB, which catalyzes the N-hydroxylation of the δ-amino group of ornithine in coelichelin biosynthesis before its recruitment by the NRPS machinery [41]. Similarly, Lun21 would perform the N-hydroxylation of the Ile residue, which would lead to the structure proposed for compound 1 (Figure 3b). Whether this N-hydroxylation of the Ile building block is performed (i) before its recruitment by the A domain of Lun20 (as suggested for coelichelin biosynthesis), (ii) once it is activated and tethered on the thiolation (T) domain of Lun20 (as proposed for polyoxypeptin synthesis), or (iii) after the formation and cyclization of the hexapeptide remains speculative at this stage.
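The sequence-tag reading described above amounts to subtracting residue masses one after another from the precursor ion. As an illustrative sketch (residue compositions are the amino acid formulas minus H2O; element masses are standard values; names are hypothetical), the ladder can be recomputed and compared with the reported daughter ions:

```python
# Monoisotopic element masses (u); standard reference values
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.9949146196}

# Residue compositions (amino acid minus H2O)
RESIDUES = {
    "OH-Ile": {"C": 6, "H": 11, "N": 1, "O": 2},
    "Phe": {"C": 9, "H": 9, "N": 1, "O": 1},
    "Piz": {"C": 5, "H": 8, "N": 2, "O": 1},
}


def residue_mass(name: str) -> float:
    """Monoisotopic mass of one residue."""
    return sum(ELEMENT_MASS[el] * n for el, n in RESIDUES[name].items())


def loss_ladder(precursor_mz: float, losses: list) -> list:
    """m/z values left after removing each residue in turn from the precursor."""
    ladder, current = [], precursor_mz
    for name in losses:
        current -= residue_mass(name)
        ladder.append(current)
    return ladder


OBSERVED = [596.33, 449.26, 337.20, 225.13, 113.07]  # daughter ions quoted in the text
LOSS_ORDER = ["OH-Ile", "Phe", "Piz", "Piz", "Piz"]

if __name__ == "__main__":
    for name, calc, obs in zip(LOSS_ORDER, loss_ladder(725.4092, LOSS_ORDER), OBSERVED):
        print(f"-{name:6s} -> calc {calc:.4f}  observed {obs:.2f}")
```

Because leucine and isoleucine residues are isobaric, this ladder alone cannot tell them apart; the residue is labelled OH-Ile here only to follow the A-domain prediction.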
To strengthen the reliability of the proposed structure of compound 1, the m/z of the MS/MS data were correlated to the structure of ions produced during the fragmentation mechanism (Figure 4a). The m/z of nine fragment ions can only correlate with the structure of ions containing one to four Piz residues (fragments 1-9 in blue in Figure 4b). Four of these fragment ions (m/z 113.0711, 225.1347, 337.1984, and 449.2608) can be linked to structures containing respectively 1, 2, 3, and 4 Piz residues, expected to result from sequential bx-yz fragmentation steps. Additionally, m/z 309.2034 and 197.1394 result from the neutral loss of CO from the aforementioned ions m/z 337.1984 and 225.1347, respectively. At least seven fragment ions contain phenylalanine's aromatic side chain, thereby confirming the presence of this amino acid in the structure of compound 1 (fragments 10-16 in green in Figure 4c). Fragment ions with m/z 260.1396, 372.2032, 484.2668, and 596.3304 arise from sequential bx-yz fragmentation steps and represent phenylalanine connected to increasing numbers of piperazate residues, whereas the remaining ions (m/z 232.1445, 456.2719, and 568.3356) are associated with a subsequent CO loss step. It is interesting to note that no hydroxylation can be inferred from the seventeen fragments discussed above, confirming that the hydroxylation in compound 1 must reside in the remaining amino acid, i.e., isoleucine. This is confirmed by the presence of four fragments that contain a hydroxy-isoleucine residue (m/z 214.1551, 242.1500, 347.1841, and 354.21366, fragments 17-20 in red in Figure 4d), all of which resulted from fragmentation steps similar to those discussed for the other ions (sequential bx-yz fragmentation and CO loss). Finally, the hydroxylation of compound 1 and the hypothesis of a hydroxylated isoleucine in its structure are further confirmed by the presence of four fragment ions seemingly resulting from the neutral loss of H2O (fragments 21-24 in orange in Figure 4e). The fragment ion with m/z 707.3989 is the result of water loss from the [M+H]+ adduct and can itself generate the ion with m/z 679.4040 through CO loss. The two other fragment ions (m/z 196.1447 and 224.1394) would come from the loss of water from the previously mentioned isoleucine-bearing fragments m/z 214.1551 and 242.1500, respectively.
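The CO and H2O neutral losses invoked above can be verified by comparing each parent/fragment m/z difference with the exact mass of the neutral species. The following sketch uses standard monoisotopic masses and an assumed tolerance of 0.002 Da (chosen for illustration, not taken from the study); the function name is hypothetical.

```python
# Exact masses (u) of the two neutral losses discussed in the text
NEUTRAL_LOSSES = {
    "H2O": 2 * 1.00782503207 + 15.9949146196,  # 18.0106
    "CO": 12.0 + 15.9949146196,                # 27.9949
}


def classify_loss(parent_mz: float, fragment_mz: float, tol: float = 0.002):
    """Return the name of the neutral loss matching parent - fragment, or None."""
    delta = parent_mz - fragment_mz
    for name, mass in NEUTRAL_LOSSES.items():
        if abs(delta - mass) <= tol:
            return name
    return None


# (parent m/z, fragment m/z) pairs quoted in the text
PAIRS = [
    (725.4092, 707.3989),
    (707.3989, 679.4040),
    (214.1551, 196.1447),
    (242.1500, 224.1394),
    (337.1984, 309.2034),
    (225.1347, 197.1394),
]

if __name__ == "__main__":
    for parent, fragment in PAIRS:
        print(f"{parent:.4f} -> {fragment:.4f}: {classify_loss(parent, fragment)}")
```

A pair whose difference matches neither mass within the tolerance returns None, which makes it easy to spot fragments that need a different explanation.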

Structure Elucidation by NMR Spectroscopy
Finally, nuclear magnetic resonance (NMR) was performed in order to confirm the predicted and MS/MS-deduced structure of compound 1. Indeed, even though we can confidently propose the presence of a hydroxy-isoleucine residue in the structure of compound 1, the MS/MS information alone does not allow us to specify the position of the hydroxyl group in this residue. The fragment structures proposed in Figure 4d assume a hydroxylation on the amide nitrogen of isoleucine, which can only be confirmed by NMR. Multiple media were tested (OSMAC approach [42] and known elicitors of secondary metabolite production in Streptomyces spp. [43][44][45]) to determine the optimal culture conditions for compound 1 production in order to obtain sufficient material for NMR analysis. An analysis of the metabolite profiles showed that the solid ISP1 medium supplemented with 25 mM N-acetyl-D-glucosamine led to the best production yield of compound 1 (not shown). The dried ethyl-acetate full extract of S. lunaelactis MM109T was processed using the ÄKTA liquid chromatography purification system, followed by semi-preparative HPLC (SP-HPLC) purification, which resulted in 2.15 mg of a pale-yellow-white powder from ~0.5 L of solid culture. UPLC-HRMS/MS revealed that this fraction contained two compounds with similar molecular weights, the most intense molecular ion peak being compound 1 (m/z of 725.4092), and a second molecular ion peak with an m/z of 723.3935 (compound 2). This m/z value suggests that compound 2 is a dehydrogenated derivative of compound 1 that possibly contains a dehydro-Piz moiety, as was described previously for antrimycins [46]. This hypothesis was confirmed by 1H NMR results (see below) and by HRMS/MS analysis (see the following section on the structural diversity of lunaemycins).
The semi-purified powder containing compound 1 and a minor amount of compound 2 was submitted to 1H NMR (700 MHz), 13C NMR (176 MHz), and 2D NMR (COSY, HMBC, and HSQC) analyses. 1D and 2D NMR data were analyzed with MestReNova V.14 and allowed us to confirm the chemical structure predicted by the in silico analysis of BGC 28a and deduced from the MS/MS fragmentation pattern. Table 2 shows the structure and NMR assignment proposed for compounds 1 and 2 (Figure 5a,b and Supplementary Figures S2-S6); however, the overlapping signals did not allow a complete structural assignment, as multiplicities and integrations were not unambiguously characterized.
Considering the chemical shifts observed in the 1H and 13C NMR spectra, signature signals typical of cyclopeptides were observed, revealing peptide bonds between amino acid residues with no amino or carboxy interruption of the amino acid sequence. 1H NMR data revealed the presence of the six α hydrogens of the hexapeptide core between δ 4.8 and δ 5.8 that can be attributed to compound 1 (δ 5.82-5.76 for the Ile; δ 5.36-5.28 for the Phe; δ 4.92-4.84, 5.39, 5.45, and 5.75 for the Piz residues), and compound 2 (δ 5.96 for the Ile; δ 5.68 for the Phe; δ 4.92-4.84, 5.01, 5.36-5.28, and 5.67 for the Piz residues). Further confirmation comes from the 13C NMR signals between δ 170.3 and δ 174.1, attributed to the peptide carbonyl groups (C-6, C-13, C-19, C-25, C-31, C-41), and between δ 46.7 and δ 56.4, attributed to the α carbons (C-2, C-12, C-18, C-24, C-30, C-33).
Based on previously published data [47], a series of multiplets between δ 1.2 and δ 2.3 in the 1H NMR spectra can be attributed to the eight β- and γ-methylene groups from the piperazate residues present in the structure of compound 1 (C-10, C-11, C-16, C-17, C-22, C-23, C-28, and C-29). Additionally, a series of 13C NMR signals between δ 18.4 and δ 26.2 can be attributed to those same methylene groups. Similarly, the four δ-methylene groups (C-9, C-15, C-21, and C-27) can be correlated to the 1H NMR signals between δ 2.9 and δ 3.1 and to the 13C NMR signals between δ 46.9 and δ 47.3. Interestingly, a signal at δ 6.91 (d, J = 3.7 Hz, 1H for compound 2) in the 1H NMR spectra, which is correlated on the HSQC to the 13C NMR signal at δ 142.9, can be attributed to a methylene hydrogen of an imine group. This observation supports the hypothesis of a replacement of a piperazate residue by a dehydro-piperazate in compound 2.
The presence of phenylalanine and isoleucine residues in the structure of compounds 1 and 2 is clearly detectable from the results of NMR experiments (Table 2). The six aromatic carbons (C-35 to C-40) of the phenylalanine's benzene ring were observed at δ 137.7 (quaternary carbon C-35); δ 128.4 (two meta carbons, C-37 and C-39); δ 129.7 (two ortho carbons, C-36 and C-40); and δ 126.6 (the para carbon C-38). Also, the five aromatic hydrogens corresponding to this benzene ring were observed in 1H NMR spectra between δ 7.1 and δ 7.2. The two hydrogens at the meta position (linked to C-37 and C-39) were attributed to the signal at δ 7.24 and could be discerned from the three hydrogens at the ortho (linked to C-36 and C-40) and para (linked to C-38) positions, seen at δ 7.2-7.14. As for the non-aromatic carbons in those residues, we were able to attribute 13C signals to the two methylenes (δ 25.2 for C-4 from the Ile and δ 37.7 for C-34 from the Phe), to the tertiary carbon C-3 from the Ile (δ 32.8), and to the two methyls, C-5 and C-7, from the Ile (δ 11.1 and 15.5, respectively). The upfield region of the 1H NMR spectrum shows signals corresponding to the non-aromatic hydrogens of the minor (2) and major (1) compounds of the sample analyzed by NMR. The methylene hydrogen of isoleucine, linked to C-4, can be attributed to δ 1.02-0.94, while the methylene hydrogens of phenylalanine, linked to C-34, were attributed to δ 2.93-2.86 and 2.77-2.67. For the isoleucine, the hydrogen on C-3 was attributed to peaks between δ 2.12 and δ 2.03, and the methyl hydrogens linked to C-5 and C-7 were attributed to δ 0.82-0.76 and δ 0.84, respectively.
The observation of the non-substituted s-butyl group of isoleucine indicates that the hydroxylation of 1 is not localized on its side chain, suggesting that the hydroxylation should take place at the amide nitrogen. Additionally, 2D NMR COSY and 1H-13C HMBC correlations, shown in Figure 5c and highlighting proton-proton and proton-carbon correlations, respectively, further supported this conclusion. Another key observation is the presence of a single HMBC correlation between an alpha NH 1H NMR signal at δ 8.27 (attributed to phenylalanine's NH) and an alpha carbonyl 13C NMR signal at δ 172.7 (attributed to the neighboring piperazate carbonyl). Those observations allowed us to further confirm the N-hydroxylated structure previously proposed for lunaemycin A.
In conclusion, combined in silico, HRMS/MS, and NMR analyses demonstrated that the main product of BGC 28a, compound 1, is a cyclic hexapeptide with a mass of 724.4019 Da and the molecular formula C35H52N10O7, corresponding to the cyclized amino acid sequence D-Phe-OH-L-Ile-D-Piz-L-Piz-D-Piz-L-Piz. We interrogated the StreptomeDB 3.0 [48] and The Natural Products Atlas 2.0 [49] databases and found no natural compound with an identical molecular formula, monoisotopic mass, or molecular weight. Interrogation of databases covering a broader range of chemical compounds (NIST, PubChem, and ChemSpider) identified SCHEMBL12568990 (Substance SID: 237623651; Compound CID: 53234730 in the PubChem database), retrieved from patent US-2011136752-A1 [50], as the closest match of compound 1, sharing the same mass, same molecular formula, and same proposed structure. We decided to call compound 1 lunaemycin A, with the Latin prefix lunae- (L. gen. n. lunae, of the moon) referring to the moonmilk speleothems where S. lunaelactis MM109T was originally isolated and the suffix -mycin, which refers to antibiotic compounds derived from bacteria with fungus-like structure such as Streptomyces. The antibiotic activity of lunaemycin A and lunaemycin-related compounds will be described in Section 2.4.

Structural Diversity of Lunaemycins
The comparative metabolomic study of metabolites with masses ranging between 700 and 800 Da (Figure 2b) suggested a broad diversity of lunaemycin-derived molecules produced by S. lunaelactis strains containing pSLUN1a. Using the MS/MS data, we explored this diversity via global natural products social molecular networking (GNPS) as performed previously [22]. One constellation comprises 42 nodes associated with compounds only produced by the two selected S. lunaelactis strains that possess BGC 28a (MM37 and MM109T) and includes the ion at m/z 725 corresponding to lunaemycin A (Figure 6). [...] (Table 3), whereas five other lunaemycin derivatives were produced in amounts too low for proper analysis.

Antibacterial Activity of Lunaemycins
Molecules produced by the known BGCs closest to the lun cluster (himastatins, stenothricins, kutznerides, and aurantimycins) possess antimicrobial activities, suggesting that lunaemycins could display similar biological activities. Moreover, BLAST analyses of genes in the lun BGC also suggest that lunaemycins are likely to possess antibacterial and/or anti-proliferative activities. For instance, lun9 and lun10 encode an efflux pump and its associated TetR-family transcriptional regulator, respectively, whose homologues in other BGCs play a role in self-resistance of the producing organism. In addition, lun25 and lun26, like their closest homologues in the BGCs of aurantimycins (ArtJ and ArtK) and polyoxypeptins (PlyJ and PlyK), encode an ABC transporter system belonging to the DrrA-DrrB family, which also confers self-resistance to the producers [51].
The antibacterial activity of lunaemycins A, B1, and D (0.1 mg/mL, ~13.5-14 mM) was first assessed via agar-diffusion assays, which showed significant direct-acting activity on all tested Gram-positive bacteria except Enterococcus hirae (though a small zone of growth inhibition was observed at the periphery of the disc, suggesting greater susceptibility to lunaemycins than to gentamicin at 0.1 mg/mL, used as positive control) (Figure 8). In contrast, no growth inhibition was observed on Gram-negative bacteria (Figure 8). Overall, similar antibacterial activities were observed for lunaemycins A and B1, whereas lunaemycin D showed weaker growth-inhibiting activity.
Minimal inhibitory concentration (MIC) and minimal bactericidal concentration (MBC) values were determined with pure lunaemycin A. The potent direct-acting activity observed in agar-diffusion assays was confirmed, with MIC values ranging from 0.12 to 0.25 µg/mL on type strains and, more interestingly, also on clinical isolates showing various resistance phenotypes (Table 4). The MBC values ranged between 0.25 and 8 µg/mL. Interestingly, the MBC values obtained with some antibiotic-resistant clinical isolates, especially those resistant to methicillin, vancomycin, or linezolid, were overall lower (from 0.25 to 2 µg/mL) than those of other strains, indicating a potent bactericidal activity of lunaemycin A against these strains. As widely reported in the literature (e.g., references [52,53]), some of these resistance mechanisms are associated with a significant biological or fitness cost, potentially explaining the lower MBC values measured with these strains. As expected, no activity was observed on either susceptible or resistant Gram-negative isolates, consistent with the agar-diffusion assays.
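As an illustration of how MIC and MBC values can be compared, the sketch below computes the MBC/MIC ratio. It assumes the commonly used convention that a ratio of 4 or less indicates bactericidal activity; this cutoff is not stated in the text and is an assumption, as are the function names. The example pairs are simply the endpoints of the ranges reported above, not values for any specific strain in Table 4.

```python
def mbc_mic_ratio(mic_ug_ml: float, mbc_ug_ml: float) -> float:
    """Return the MBC/MIC ratio for one strain (both values in µg/mL)."""
    if mic_ug_ml <= 0 or mbc_ug_ml <= 0:
        raise ValueError("MIC and MBC must be positive concentrations")
    return mbc_ug_ml / mic_ug_ml


def is_bactericidal(mic_ug_ml: float, mbc_ug_ml: float, cutoff: float = 4.0) -> bool:
    """Apply the assumed convention: bactericidal if MBC/MIC <= cutoff."""
    return mbc_mic_ratio(mic_ug_ml, mbc_ug_ml) <= cutoff


# Illustrative pairs built from the reported range endpoints (hypothetical pairings).
for mic, mbc in [(0.25, 0.25), (0.12, 8.0)]:
    ratio = mbc_mic_ratio(mic, mbc)
    print(f"MIC={mic} MBC={mbc} ratio={ratio:.1f} bactericidal={is_bactericidal(mic, mbc)}")
```

Under this assumed cutoff, a strain whose MBC equals its MIC is classified as bactericidal, whereas a strain at the upper end of the MBC range with the lowest MIC would not be.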

Bacterial Strains and Culture Conditions
All bacterial strains used in this study are listed in Table 5. The archetype strain Streptomyces lunaelactis MM109T [20,21] and all other S. lunaelactis strains used in this study were isolated from moonmilk deposits of the cave 'la grotte des collemboles' (Comblain-au-Pont, Belgium) [9,10]. Streptomyces spores and mycelium stocks were prepared as described in [54]. Cultivation media were prepared as described in [54], and inoculated plates were incubated for a week at 28 °C.

Compound Identification by Ultra-Performance Liquid Chromatography-Tandem Mass Spectrometry (UPLC-MS/MS)
Solid media inoculated with S. lunaelactis were cut into small pieces and mixed overnight in an agitator with an equal volume of ethyl acetate to extract metabolites. After centrifugation at 4000 rpm for 20 min, the supernatant was evaporated to dryness with a rotavapor® (IKA RV10 digital, VWR, Radnor, PA, USA) at 25 °C and 210 rpm, and suspended in 2 mL of acetonitrile. Extracts were analyzed by Ultra-Performance Liquid Chromatography-High Resolution Mass Spectrometry (UPLC-HRMS) and by Ultra-Performance Liquid Chromatography-Tandem Mass Spectrometry (UPLC-MS/MS). Lunaemycins were identified according to their exact mass, their isotopic pattern, the MS/MS spectrum obtained by HCD fragmentation of the molecular ion, and their UV-VIS absorbance spectra. The protocols for lunaemycin extraction and purification are detailed in Point 3.4 and Appendix A.
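Identification by exact mass typically relies on comparing an observed m/z with a theoretical value and expressing the difference in parts per million. The following is a minimal sketch of that comparison; the 5 ppm tolerance, the function names, and the example m/z values are all assumptions for illustration and are not taken from this study.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm (assumed default: 5 ppm)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical m/z values, chosen only to show the arithmetic.
observed, theoretical = 500.0010, 500.0000
print(f"error = {ppm_error(observed, theoretical):.2f} ppm, "
      f"match = {within_tolerance(observed, theoretical)}")
```

In practice the tolerance would be set according to the mass accuracy of the instrument, and a mass match would be combined with the isotopic pattern and MS/MS evidence described above.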

Nuclear Magnetic Resonance (NMR)
The production and purification of lunaemycins were performed as follows. The ethyl acetate extract of S. lunaelactis MM109T grown on 25 agar plates of ISP1 + GlcNAc (~500 mL volume) was dried with a rotary evaporator (Rotavapor® R-100), yielding 126 mg of a pale yellow powder, which was re-dissolved in 60 mL of an acetonitrile/water (MilliQ filtered) (10:3, v/v) solution and fractionated on a C18 LC column (Phenomenex) using an ÄKTA™ purification system. Metabolites were eluted with a gradient of increasing acetonitrile concentration (LC method, Appendix A), and each of the 90 fractions was analyzed by HPLC (HPLC method, Appendix B) to detect peaks corresponding to lunaemycins. Fractions containing the compounds of interest (fractions 35, 36, and 37 to 41) were subjected to a second round of purification using semi-preparative HPLC (SP-HPLC method, Appendix C). Fractions 37 to 41 were pooled, and the volume of each fraction was reduced to 2 mL by evaporation (under gas flow), which served for twenty semi-preparative HPLC injections of 100 µL of each fraction. From fraction 35, we generated 21 sub-fractions, 4 of which contained lunaemycin D (Table 3; Figure 4a). The fractions containing a single chromatographic peak were pooled, evaporated under continuous gas flow, resuspended in 1 mL of water/acetonitrile (1/9, v/v) solution, and diluted to a concentration of 0.1 mg/mL for subsequent antibacterial assays (see point 4.5 below).
The pure dried extract mainly containing lunaemycin A (compound 1) was dissolved in 250 µL of hexadeuterodimethylsulfoxide (DMSO-d6) containing 1% (v/v) of tetramethylsilane (TMS) as a reference (Sigma-Aldrich, St. Louis, MO, USA) and transferred to a 3 mm NMR tube (103.5 mm long, Bruker). NMR spectra were acquired using a Bruker Avance NEO 700 MHz spectrometer equipped with a cryoprobe and Bruker's TopSpin 3.5 software package, in order to perform 1H, 13C (APT), COSY, 1H-13C HMBC, and 1H-13C HSQC experiments (standard Bruker parameters), and data were interpreted using the MestReNova V.14 software. Sample temperatures were controlled with the variable-temperature unit of the instrument.

Antibacterial Assays
Pure lunaemycins A, B1, and D were obtained as described above in point 4.4. Agar-diffusion tests were implemented as described previously [24]. Briefly, a few microliters of an overnight culture (10 µL of the bacterial stock spread on solid LB or BH media) were used to make bacterial suspensions with standardized turbidity (OD600 between 0.08 and 0.1, corresponding to a 0.5 McFarland turbidity standard). A total of 100 µL of the standardized bacterial suspension was then inoculated on solid LB or BH media, and 3 mm thick Whatman® paper discs (Sigma-Aldrich, St. Louis, MO, USA) containing 10 µL (2 × 5 µL) of antibiotics were placed on the freshly inoculated solid medium. After overnight incubation at 37 °C, the antibacterial activity was evaluated by measuring the diameter of growth inhibition. Control conditions were performed using 100 µg/mL gentamicin as a positive control (corresponding to 1 µg of gentamicin placed on the positive control disc), and acetonitrile, DMSO, or water as a negative control. Additional antibacterial susceptibility assays were performed at the High-Throughput Screening and Protein Engineering platform (HiProtEn, Dipartimento di Biotecnologie Mediche, Università di Siena). Minimal inhibitory concentration (MIC) and minimal bactericidal concentration (MBC) values were determined in Mueller-Hinton broth, as recommended by CLSI (document M07-A10, 2015 [57]) and the European Committee on Antimicrobial Susceptibility Testing (EUCAST document "Terminology relating to methods for the determination of susceptibility of bacteria to antimicrobial agents", 2000 [58]) on reference (type) strains and clinical isolates as indicated (Table 5).
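The amount of compound deposited on each disc follows directly from the stock concentration and the volume applied. The short sketch below reproduces the stated gentamicin figure (100 µg/mL in 10 µL gives 1 µg per disc) and applies the same arithmetic to a 0.1 mg/mL lunaemycin solution; the function name is hypothetical and introduced only for illustration.

```python
def mass_on_disc_ug(conc_ug_per_ml: float, volume_ul: float) -> float:
    """Mass (µg) deposited on a disc from a stock at conc_ug_per_ml (µg/mL)
    when volume_ul (µL) is applied; 1 mL = 1000 µL."""
    if conc_ug_per_ml < 0 or volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_ug_per_ml * volume_ul / 1000.0


# Positive control: 100 µg/mL gentamicin, 10 µL (2 x 5 µL) per disc.
gentamicin_ug = mass_on_disc_ug(100.0, 10.0)

# Lunaemycin solutions at 0.1 mg/mL (= 100 µg/mL), same 10 µL per disc.
lunaemycin_ug = mass_on_disc_ug(0.1 * 1000.0, 10.0)

print(gentamicin_ug, lunaemycin_ug)
```

With equal stock concentrations and volumes, the test compounds and the positive control are loaded at the same mass per disc, which makes the inhibition zones comparable on a mass basis.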

Conclusions
Herein, we report the discovery and structural elucidation of lunaemycins: new, cyclic hexapeptide antibiotics only produced by S. lunaelactis strains that possess the linear plasmid pSLUN1a. The potent antibacterial activity of lunaemycins against antibiotic-resistant Gram-positive bacteria may in part explain how S. lunaelactis strains have found their own place in extremely oligotrophic and competitive environments such as moonmilk deposits. In addition to the proposed structures of 15 lunaemycin derivatives, we also propose the biosynthetic pathway for the production of lunaemycin A, the main compound of the lun BGC of S. lunaelactis MM109T (Figure 9).
Figure 9. Proposed biosynthetic pathway of lunaemycin A. This model suggests that the Ile is either N-hydroxylated before activation and recruitment by the A domain (as proposed for the coelichelin pathway) or hydroxylated once activated and tethered on the thiolation domain of Lun20 (as proposed for polyoxypeptin synthesis).
Our work provides a good example in which a deep in silico analysis of the genes that compose a BGC, combined with an exhaustive literature overview of homologous genes, allowed us to maximize the accuracy of the predicted structure of a cryptic compound. This in silico approach greatly facilitated the comparative mass spectrometry analysis by focusing the examination on a limited series of ions that possessed the appropriate fragmentation tags. Besides the core biosynthetic genes, the lun BGC also provides an interesting case in which the genes required for the synthesis of the building block piperazic acid from ornithine are present within the BGC itself, as are the genes that convert the glutamate and glutamine precursors to ornithine. Such a situation was previously observed, notably in the BGC for quinichelin biosynthesis [59], and in many NRPS BGCs that require Piz as a building block [37]. This inclusion of genes/enzymes of primary metabolic pathways within a BGC is most likely required to provide sufficient building blocks at the proper time of specialized metabolite production.
Another main take-home message of our work relates to the use of multiple strains of the same species to facilitate the correlation between genetic material and the natural compounds whose biosynthesis it encodes. The current tendency in strain prioritization is instead to exclude from a collection multiple strains that belong to the same species in order to avoid identifying the same bioactive molecules [22]. Here, we showed how strain redundancy instead facilitated our work, even avoiding the need to generate null mutants to link a BGC to its natural product. This is particularly interesting when a BGC is located on a linear plasmid whose presence in multiple copies complicates gene inactivation (about three copies for pSLUN1a in S. lunaelactis MM109T [21]). Our attempts to delete genes of BGC 28a have been unsuccessful so far. We are currently applying the same strategy to identify and assess the biological activity of the cryptic products of BGC 28b, found on the linear plasmid pSLUN1b, which is only present in the second subgroup of S. lunaelactis strains [22].