2.1. Isolation and Structure Elucidation of Auricin
To isolate auricin (1), the previously published growth conditions [7,9] were optimized for the highest titers of 1, and an isolation procedure using ethyl acetate extraction of the culture medium, a silica flash column, and preparative RP-18 HPLC was optimized to overcome problems with the stability of 1 (see Materials and Methods). This procedure yielded 4.5 mg of pure compound 1 from a 500 mL culture; its purity was verified by analytical RP-18 HPLC as described previously [7]. As previously described [7,9], 1 is a yellow compound with a molecular formula of C28H31NO10, based on high-resolution ESI MS, which gave an ion peak at m/z 542.20187 [M+H]+ (calc 542.20262 for [M+H]+). ESI MS/MS fragmentation of this ion produced only a single dominant signal at m/z 142.12259 [M+H]+, corresponding to d-forosamine. The UV spectrum of 1, with maxima at 213, 253, and 436 nm, suggested a peri-hydroxy quinone chromophore, which was also confirmed by the appearance of a violet color after the addition of sodium hydroxide.
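The agreement between the measured and calculated [M+H]+ values can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, the isotope masses are standard reference values rather than data from this work, and [M+H]+ is computed as the neutral monoisotopic mass plus one hydrogen atom mass (without subtracting the electron mass), an assumption made because it reproduces the calculated value quoted above.

```python
import re

# Monoisotopic masses (u) of the lightest stable isotopes (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def parse_formula(formula):
    """Turn a formula string such as 'C28H31NO10' into {'C': 28, 'H': 31, 'N': 1, 'O': 10}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mz_protonated(formula):
    """m/z of [M+H]+ taken as M plus one hydrogen atom mass (electron mass not subtracted)."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC_MASS[el] * n for el, n in counts.items())
    return neutral + MONOISOTOPIC_MASS["H"]


def ppm_error(observed, calculated):
    """Relative deviation of the observed m/z from the calculated m/z, in ppm."""
    return (observed - calculated) / calculated * 1e6


calc = mz_protonated("C28H31NO10")
print(f"calculated [M+H]+ = {calc:.5f}")
print(f"mass error = {ppm_error(542.20187, calc):.1f} ppm")
```

Under these assumptions the sketch gives 542.20262 for [M+H]+ and a deviation of about -1.4 ppm for the observed ion.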
Extensive NMR experiments (
1H,
13C, COSY, TOCSY, H2BC, HMBC, and selective 1D TOCSY and 1D NOESY) were used to establish the structure of compound
1. The NMR data of
1 are summarized in
Supplementary Table S1 and its structure is shown in
Figure 1. 1D and 2D
1H and
13C NMR spectra also appear in the
Supplementary data.
The
1H NMR spectrum (
Supplementary Figure S1) exhibited three aromatic methine proton signals, an OH signal at δ
H 12.00 (OH-9), signals of protons from three methyl, two methylene, and four aliphatic methine groups, and overlapping signals in the up-field region δ
H 2.1-0.5 ppm. Two dimensional (2D)
1H-
1H homocorrelated COSY (
Supplementary Figure S2) and TOCSY (
Supplementary Figures S3 and S4) spectra revealed four coupled spin systems. The first, an ABX spin system, consisted of three aromatic protons at δ
H 7.692, 7.683, and 7.361, which were attributed, respectively, to H-6, H-7, and H-8 of ring A. The second spin system involved methylene protons (δ
H 2.884, 2.727; H-2a,b) coupled to an oxygenated methine proton (δ
H 5.351; H-3), which was further coupled to the proton on a neighboring oxygenated methine carbon (δ
H 5.360; H-4). Both methine signals (H-3 and H-4) overlapped. In the third spin system, a proton located on an oxygenated methine carbon (δ
H 4.562; H-15) was coupled to protons from methyl (δ
H 1.353; H-16) and methylene (δ
H 2.645, 2.604; H-14a,b) groups. The fourth spin system started with the anomeric proton H-1′ (δ
H 4.854) of
d-forosamine, which showed only one three-bond coupling, 3JH1,H2 = 3 Hz. According to the Karplus equation, this value suggests an equatorial orientation of one of the two methylene protons at C2´, while the dihedral angle between the second proton and H1´ should be between 60° and 120°. The COSY spectrum showed an interaction from H1´ to both methylene protons H2´a and H2´b (δH 2.05 and 1.66 ppm). The whole spin system was identified using selective 1D TOCSY after irradiation of H1´, which located H3´a and H3´b (δH 1.568 and 1.278 ppm), H4´ at δH 2.08, H5´ at δH 3.108, and H6´ at δH 0.79 ppm (
Supplementary Figure S4). Their carbon chemical shifts were assigned using a multiplicity-edited 2D
1H-
13C heterocorrelated HSQC spectrum (
Supplementary Figure S6). The
d-forosamine NMR data of
1 are in very good agreement with those published for 3´-O-α-d-forosaminyl-(+)-griseusin A (2) (
Supplementary Table S2) [
15].
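The Karplus argument used above for the anomeric proton can be illustrated numerically. The following Python sketch evaluates the generic form 3J(φ) = A·cos²φ + B·cos φ + C. The coefficients A, B, and C are illustrative assumptions and are not values fitted to this compound or taken from this work, so only the qualitative trend should be read from the output.

```python
import math

# Illustrative coefficients (Hz) for 3J = A*cos^2(phi) + B*cos(phi) + C.
# Assumed for this sketch only; not values from the paper.
A, B, C = 7.76, -1.10, 1.40


def karplus_j(phi_deg):
    """Vicinal coupling constant (Hz) for an H-C-C-H dihedral angle given in degrees."""
    phi = math.radians(phi_deg)
    return A * math.cos(phi) ** 2 + B * math.cos(phi) + C


for phi in (60, 90, 120, 180):
    print(f"phi = {phi:3d} deg -> 3J = {karplus_j(phi):.1f} Hz")
```

With any reasonable parametrization, dihedral angles in the 60° to 120° range give small couplings of a few hertz, whereas an anti (about 180°) arrangement gives a coupling of roughly 10 Hz. An observed coupling of about 3 Hz is therefore compatible with the absence of an anti relationship between H1´ and either H2´ proton.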
In agreement with the C28H31NO10 molecular formula of compound 1 found by ESI-MS, twenty-seven signals were observed in the 13C NMR spectrum (Supplementary Figure S5); the two 7′ carbons of d-forosamine are magnetically equivalent and afford only one signal. The 13C spectrum also shows that a small amount of impurities (<20%) is present, arising from the instability and decomposition of 1; their signals are observed in the region δ 55–10 ppm. An edited HSQC (
Supplementary Figure S6) showed that ten signals belonged to quaternary carbons. Three of them have the characteristic chemical shifts of keto-carbonyls (δ
C 181.45, 186.55, 202.26), and one was due to a carboxylic acid group (δ
C 173.75). Two oxygenated quaternary carbons had aromatic (δ
C 162.27) and aliphatic (δ
C 98.91) character. The signal at δ
C 162.27 was attributed to C-9 (of the A ring) due to its specific chemical shift. Of the proton-bearing carbons, three were attributed to aromatic, non-oxygenated methine carbons (δ
C 125.70, 137.29 and 119.59, A ring ABX spin system). The remaining signals arose from seven aliphatic methine, four methylene, and four methyl carbons. The A ring signals were assigned based on COSY, H2BC and HMBC correlations. Two-bond correlations to protonated carbons in the H2BC spectrum (
Supplementary Figure S7) revealed the connectivity of the aromatic protons H-6, H-7 and H-8: H-7 (δ
H/C 7.683/137.29) is correlated to C-6 (δ
C 119.59) and C-8 (δ
C 125.70), while the correlations of both H-6 (δ
H/C 7.692/119.59) and H-8 (δ
H/C 7.361/125.70) led to C-7 (δ
C 137.29) only. Further, long-range interactions observed in the HMBC spectrum (
Supplementary Figures S8 and S9) suggested the location of hydroxyl group OH-9 at C-9 and confirmed the structure of the neighboring B ring. In particular, strong correlations from OH-9 to C-8 (δ
C 125.70), C-9 (δ
C 162.27), and C-9a (δ
C 114.92), and weak ones to C-7 (δ
C 137.29) and C-10 (carbonyl, δ
C 186.55) were observed. The H-6 and H-7 correlations to the second carbonyl (C-5, δ
C 181.45) confirmed the quinone-like structure of ring B. The C-4a and C-10a assignments to signals at δ
C 138.05 and 140.68 ppm, respectively, were confirmed by HMBC correlations (described later). The carbon chemical shifts of the A and B rings (
Supplementary Table S1) were found to agree very well with the published data for angucycline compounds containing the same structural fragment [
16].
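The carbon inventory described above can be cross-checked against the molecular formula. The following Python sketch is a simple bookkeeping aid, not part of the original analysis: the dictionary keys and the function name are hypothetical, and the counts are those stated in the text. It also computes the number of rings plus double bonds implied by C28H31NO10 using the standard formula for C/H/N/O compounds.

```python
# Carbon types as reported from the 13C and edited HSQC spectra in the text.
carbon_counts = {
    "quaternary C": 10,
    "aromatic CH": 3,
    "aliphatic CH": 7,
    "CH2": 4,
    "CH3": 4,
}

total_carbons = sum(carbon_counts.values())

# The two equivalent 7' methyl carbons of d-forosamine give one shared signal.
expected_signals = total_carbons - 1


def double_bond_equivalents(c, h, n):
    """Rings plus double bonds for a CcHhNnOx formula (oxygen does not contribute)."""
    return c - h / 2 + n / 2 + 1


print("carbons:", total_carbons)
print("expected 13C signals:", expected_signals)
print("rings + double bonds:", double_bond_equivalents(28, 31, 1))
```

The tally gives 28 carbons and 27 expected signals, matching the formula and the observed spectrum, and the formula implies 14 rings plus double bonds that any proposed structure has to account for.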
Inspection of the NMR spectra revealed the presence of aliphatic methine structural-reporter H/C signals, which show characteristic chemical shifts due to the nearby oxygen. They belong to the spin systems of three rings: an isolated H-12/C-12 (δ
H/C 5.378/80.85, ring E); H-3/C-3 signals (δ
H/C 5.348/71.31, ring D) and H-4/C-4 (δ
H/C 5.360/68.45, ring D); H-15/C-15 at δ
H/C 4.562/69.28 (ring E) and, finally, H-1′/C-1′ chemical shifts δ
H/C 4.856/95.94, characteristic of carbohydrates, identified as the anomeric signal of the
d-forosamine. Very strong cross-peaks were observed in the HMBC spectrum from the H-2a and H-2b methylene protons (δ
H 2.884, 2.727) to C-1, C-3, C-4, and quaternary C-11 and weak ones to C-5, C-4a (
Supplementary Figure S10). Both downfield shifted H-3 and H-4 signals correlated with carboxyl group C-1 (δ
C 173.75); however, the H-3/C-4a and H-4/C-10a correlations were the most intense from these protons (
Supplementary Figure S11). All these correlations were important for elucidating the structure of the C ring. The data obtained (
Supplementary Table S1) confirmed that the -COO-CH2-CH(O)-CH(O)- fragment (C-1 to C-4) is located in close proximity to the quaternary C-4a and the C-5 carbonyl (ring B). Furthermore, the fact that C-4 is oxygenated indicated that this structural fragment forms ring D by joining C-4 and the carboxyl C-1 via a carboxylic oxygen. In addition, the strong H-2a/C-11 correlation confirmed closure of the 6-membered ring C via the oxygen at C-3 to C-11 (δ
C 98.91).
NMR data also showed that the oxygenated H-15/C-15 signals (δ
H/C 4.562/69.28) belong to another structural fragment, which also includes H-14 and H-16. These protons show multiple mutual HMBC correlations, but all three correlated with the C-13 carbonyl (δ
C 202.26), resulting in a second fragment with a -CO-CH2-CH(O)-CH3 structure (C-13 to C-16). The H-14/C-12 correlation, together with those of H-12 to C-11 and C-10a (
Supplementary Figure S10), and the fact that CH-15 is oxygenated, indicates that the 6-membered ring E must be closed at quaternary C-11. The correlations of H-15 and H-16 to the quaternary C-11 supported such structure. The unusual chemical shifts of the isolated oxygenated H-12/C-12 signals indicated that the aglycone is linked to
d-forosamine at C-12. This conclusion was supported by a through-space interaction between H-12 and H-1´ in the NOESY and ROESY spectra (
Supplementary Figures S12 and S13) and a C1´/C12 correlation in the HMBC (
Supplementary Figure S10). The results of the HMBC, COSY, and NOESY experiments on
1 are summarized in
Supplementary Figure S14 and Table S1.
Supplementary Figure S15 shows a model of the 3D structure of compound
1 prepared with Chem3D Pro using a simple MM2 force-field energy optimization. This model is consistent with the NOE interactions observed in the NMR spectra. The absolute configurations of the chiral centers are C11(S), C12(S), C1´(R) and C4´(S). The characteristic vibrations found in the infrared spectrum also support the proposed structure of
1 (
Supplementary Figure S16 and Table S3).
In addition to the presence of
d-forosamine, which is unique among the known angucyclines [
9], the resulting structure of
1 also has an intriguing, highly oxygenated aglycone. The
peri-hydroxy quinone rings A and B are characteristic of these types of compounds (
Figure 1); however, to our knowledge, the three lactone rings, including the [6,6] spiroketal moiety of rings C and E, are unique among angucyclines. Interestingly, the structure of the auricin aglycone strikingly resembles those of the griseusins, e.g., 3′-O-α-
d-forosaminyl-(+)-griseusin A (
2) [
15] and 4´-dehydro-deacetylgriseusin A (
3) [
17] (
Figure 1). However, griseusins are a subgroup of the pyranonaphthoquinone family of aromatic polyketides, which contain a [6,6] spiroketal ring system fused to a juglone moiety (
Figure 1). As in
1, a γ-lactone moiety is present in both
2 and
3, as well as in some other members of the pyranonaphthoquinone family, e.g., kalafungin (
4) [
18]. In accord with this resemblance, a simple comparison of the
1H and
13C NMR data for
1,
2, and
3 showed that the shifts of the aglycone part of these molecules are nearly identical, including those of the critical spiroketal two-oxygen-bearing quaternary carbon (
Supplementary Table S2). The structural γ-lactone motif of ring D is identical in all three compounds. In
1 and
2 the aglycones differ at position 13 (carbonyl) and 4´ (acetyl), respectively, which influences the chemical shifts of H-12 in
1 and H-3´ in
2. In
1 and
3 the aglycone structures are identical, but
d-forosamine is missing at position 3´ in the E ring of
3. These small differences are also reflected in the slight variations of the chemical shifts of most signals. An important change was observed in the γ-lactone (ring D), particularly at H-3 in
1 and H-10 in
3 (
Supplementary Figure S14), probably caused by the interaction of H-3 (ring D) with H-3´ of
d-forosamine, which is absent in
3. The differences in the chemical shift of quaternary C-11 might be caused by the different absolute configurations of this chiral center, S in
1 and R in
3. As mentioned above, the spectral data of
d-forosamine in both
1 and
2 were consistently nearly identical, and both the UV and IR spectra of
1 and
2 are also very similar [
15]. The data obtained surprisingly suggest that compound
1, previously thought to be an angucycline-like antibiotic based on genetic data [
3,
4,
5,
6,
7,
8,
9], belongs instead to the pyranonaphthoquinone subfamily of griseusins. Based on the above results, the structure of
1 corresponds to the systematic name (2’S,3aR,3’S,6’S,11bR)-3’-(((2R,6R)-5-(dimethylamino)-6-methyltetrahydro-2H-pyran-2-yl)oxy)-7-hydroxy-6’-methyl-3,3a,5’,6’-tetrahydrospiro[benzo[g]furo[3,2-c]isochromene-5,2’-pyran]-2,4’,6,11(3’H,11bH)-tetraone. It is a new O-α-
d-forosaminyl-dehydro-deacetylgriseusin.
2.2. Methanolysis of Auricin
We encountered difficulties purifying
1 on silica columns when methanol was used as a solvent. Compound
1 was consistently degraded under these conditions. This problem was overcome by using ethanol as a solvent, as described in the purification procedure. As described previously [
6], purified compound
1 is stable for at least 6 days in acetonitrile and ethanol. Therefore, we similarly tested the stability of purified
1 in methanol with and without silica. HPLC analysis showed that
1 (Rt = 8.6 min) is gradually converted to a new peak at Rt = 7.2 min. This conversion was significantly accelerated by the presence of silica, and after 24 h almost all
1 was converted to the new peak (
Supplementary Figure S17a). The compound corresponding to this new peak was isolated, and high-resolution ESI MS analysis gave a molecular ion [M+H]+ at m/z 574.2246, which corresponds to a molecular formula of C29H35NO11 (calc 574.2288 for [M+H]+) (Supplementary Figure S17b). Thus, there is an increase in mass of 32 amu, indicating the addition of methanol (methanolysis) to
1. ESI MS/MS fragmentation of this ion was performed to determine the position of the methanol addition. As described previously [
9], the molecular ion [M+H]+ of 1 (m/z 542.2043) produces only a single dominant signal at m/z 142.1 after ESI MS/MS fragmentation, corresponding to d-forosamine. The molecular ion [M+H]+ of the new compound (m/z 574.2246) produced an identical d-forosamine ion, which indicates that the reaction occurs in the aglycone moiety of 1.
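The assignment of the 32 amu increase to the addition of one molecule of methanol can be verified from the two molecular formulas and the two measured ions. The following Python sketch is illustrative: the variable and function names are hypothetical, and the isotope masses are standard reference values rather than data from this work.

```python
# Monoisotopic masses (u) of the lightest stable isotopes (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def monoisotopic_mass(counts):
    """Monoisotopic mass (u) of a composition given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in counts.items())


auricin = {"C": 28, "H": 31, "N": 1, "O": 10}   # C28H31NO10
product = {"C": 29, "H": 35, "N": 1, "O": 11}   # C29H35NO11

# Elemental difference between the two formulas (zero entries dropped).
added = {el: product[el] - auricin[el] for el in product if product[el] != auricin[el]}

shift_calculated = monoisotopic_mass(product) - monoisotopic_mass(auricin)
shift_observed = 574.2246 - 542.2043  # [M+H]+ values quoted in this section

print("added atoms:", added)
print(f"calculated shift = {shift_calculated:.4f} u")
print(f"observed shift   = {shift_observed:.4f} u")
```

The elemental difference is CH4O, that is, one methanol, with a calculated monoisotopic shift of about 32.026 u. The observed shift between the two ions agrees with this value to within a few millimass units.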
Semisynthetic studies with several griseusins, including
3, provided new derivatives by opening the γ-lactone ring D using acid-catalyzed methanolysis. Interestingly, this reaction was also observed as an unwanted side reaction during the chromatographic isolation of
3. In this way 4’-dehydro-9-hydroxy-deacetylgriseusin B methyl ester was obtained from
3 [
17]. The difference in mass between the two compounds is also 32 amu, identical to our case. Interestingly, the authors also used a silica column and methanol as a solvent for the isolation of griseusins, conditions that are not suitable for the isolation of compound
1, as stated above. Since
3 is almost identical to the aglycone of
1 (
Figure 1,
Supplementary Table S2), we conclude that an identical methanolysis likely occurs, in which the γ-lactone ring D in
1 is opened, resulting in methoxyauricin (
Supplementary Figure S17c). The catalytic effect of silica on this methanolysis most likely arises from the weakly acidic properties of silica.
2.3. Bioinformatic Analysis of the Auricin Cluster
Since the structure of
1 suggests, surprisingly, that
1, previously expected to be an angucycline-like antibiotic, actually belongs to the pyranonaphthoquinone subgroup of griseusins, we thoroughly analyzed the
aur1 cluster and its flanking regions in the large linear plasmid pSA3239 located in
S. lavendulae subsp.
lavendulae CCM 3239. The central part of the
aur1 cluster for
1 contains the initial biosynthetic genes
aur1C,
aur1D,
aur1E,
aur1F,
aur1G,
aur1H, which are homologous to the angucycline genes encoding a cyclase (CYC), ketosynthase α (KSα), ketosynthase β (KSβ), acyl carrier protein (ACP), ketoreductase (KR), and aromatase (ARO), respectively. In addition, the organization of these genes, including the angucycline-specific CYC gene
aur1C, is strictly conserved in all angucycline clusters (
Supplementary Figure S18) [
2,
5]. Deleting the initial biosynthetic gene of
1,
aur1D, which encodes KSα [
3], or the
aur1 cluster from
sa22 to
aur1V [
4], results in the absence of
1, thus clearly confirming the role of these
aur1 core genes in the biosynthesis of
1. Only one partial biosynthetic gene cluster (
gris) for the biosynthesis of griseusin A and B in
S. griseus K-63 [
14] has been described in the literature or databases. Interestingly, it is highly similar both in its gene organization and in the proteins it encodes to the core
aur1 gene cluster (
Figure 2a). Moreover, our analysis of adjacent
gris regions revealed two new incomplete ORFs, Gris-ORFX and Gris-ORFY, which were highly similar to Aur1C CYC and Aur1I oxygenase, respectively (
Figure 2a,
Supplementary Figures S21 and S22).
Interestingly, although griseusins clearly form a subgroup of pyranonaphthoquinones on the basis of their structures [
19], a phylogenetic analysis of their aromatic polyketide synthases KSα and KSβ localized both of the initial biosynthetic proteins for griseusin (Gris-ORF1 and Gris-ORF2) to the angucycline clade and not the pyranonaphthoquinone clade [
20]. We performed a similar phylogenetic analysis with the KSα (Aur1D) and KSβ (Aur1E) of
1, the KSα and KSβ (Gris-ORF1, Gris-ORF2) of griseusin, and several representative KSα and KSβ proteins from the major groups of aromatic polyketides. Both griseusin KSα and KSβ were clearly placed in the angucycline clade, where they most closely resembled the KSα and KSβ of 1 (
Figure 2b).
These genetic data, together with the structural analysis, therefore confirm that the aur1 cluster directs the biosynthesis of a griseusin-like compound (1) and that both
1 and griseusins are likely to be synthesized in an angucycline-like biosynthetic pathway, at least in the early stages. Thus, the
aur1 cluster (
Supplementary Figure S18) from
S. lavendulae subsp.
lavendulae CCM 3239 represents the first complete biosynthetic gene cluster for a griseusin. To avoid confusion, we will continue to use the original name, auricin, for
1, since 14 papers have already been published on its genetic and regulatory properties. The structure of
1 also clarified a number of the surprising properties of the
aur1 cluster. Although, as described above, its core is similar to the angucycline biosynthetic gene clusters, unlike these clusters, it contains many putative tailoring biosynthetic genes encoding oxygenase and dehydrogenase homologs scattered in areas quite distant from the
aur1 core. In addition, we recently found that several
d-forosamine biosynthetic genes also appear in a region rather distant from the core of the
aur1 cluster (
Supplementary Figure S18) [
9]. Some of these scattered biosynthetic genes are similar to pyranonaphthoquinone biosynthetic genes. They are absent from other angucycline gene clusters and were found to be under the control of auricin-specific positive regulators [
5,
9]. They may therefore participate in the biosynthesis of
1 and account for its intriguing structure. Deletion analysis of these genes and further studies on the biosynthesis of
1 are in progress.