Genome-Guided Discovery of Pretilactam from Actinosynnema pretiosum ATCC 31565

Actinosynnema is a small genus of actinomycetes that is well known for producing ansamitocin, the payload component of antibody-drug conjugates against cancers. However, the secondary metabolite production profile of Actinosynnema pretiosum ATCC 31565, the best-known ansamitocin producer, has never been fully explored. Our antiSMASH analysis of the genomic DNA of Actinosynnema pretiosum ATCC 31565 revealed an NRPS–PKS gene cluster for a polyene macrolactam. The gene cluster is very similar to the gene clusters for mirilactam and salinilactam, two 26-membered polyene macrolactams from Actinosynnema mirum and Salinispora tropica, respectively. Guided by this bioinformatics prediction, we characterized a novel 26-membered polyene macrolactam from Actinosynnema pretiosum ATCC 31565 and designated it pretilactam. The structure of pretilactam was elucidated by comprehensive analysis of HRMS and 1D and 2D NMR data, with the absolute configuration of its chiral carbons predicted bioinformatically. Pretilactam features a dihydroxy tetrahydropyran moiety, and its polyene system consists of a hexaene unit and a diene unit. A preliminary antimicrobial assay indicated that pretilactam is inactive against Bacillus subtilis and Candida albicans.


Introduction
Genome-guided discovery of secondary metabolites is a powerful strategy for the rational identification of novel compounds from microorganisms [1,2]. Recent advances in DNA sequencing technologies allow rapid and low-cost sequencing of microbial genomic DNA, and bioinformatics tools (antiSMASH, ClusterMine360, NORINE, etc.) have been developed to detect gene clusters for secondary metabolites in microbial genomes and, in some cases, to predict the structures of their products. These tools greatly facilitate the dereplication of known metabolites and the targeted mining of new ones in chemical studies of microbial secondary metabolites [3-5]. The gene clusters predicted by antiSMASH can provide valuable information about the secondary metabolite production profile of the strain analyzed.
To gain deeper insight into the secondary metabolite biosynthetic potential of ATCC 31565, we sequenced its genomic DNA and used the sequence to direct our chemical investigation toward new compounds. Herein, we describe the genome-guided identification of a novel 26-membered polyene macrolactam from ATCC 31565.

ATCC 31565 Contains a Gene Cluster for Polyene Macrolactam
Genome sequencing and assembly resulted in a linear chromosome of 8,131,271 bp for ATCC 31565. AntiSMASH analysis of the chromosome revealed over 20 gene clusters for various secondary metabolites, including the expected gene cluster for ansamitocin biosynthesis. Among them, an 80 kb NRPS-PKS gene cluster shared a very high percentage of similar genes with the mirilactam and salinilactam gene clusters. Specifically, the gene cluster contained five genes for type I modular polyketide synthases (PKSs), one gene for a cytochrome P450 oxidase, and a set of seven genes for the biosynthesis of 3-aminobutyrate (the starter unit of some polyene macrolactams). The enzymes/proteins encoded by these genes showed ≥96% amino acid sequence identity to those for mirilactam biosynthesis. In particular, the 11 modules of the five PKSs showed the same or very similar domain organization as the PKSs for mirilactam or salinilactam biosynthesis (Figure 1A,B, Table 1). Thus, the gene cluster should be responsible for the biosynthesis of polyene macrolactam(s) very similar or identical to mirilactam or salinilactam. The gene cluster was designated plm and deposited in GenBank under accession number MK341065.

The Putative Polyene Macrolactam(s) was Detected by LC-MS
The fermentation culture of ATCC 31565 was extracted with methanol or ethyl acetate, and the extracts were analyzed by LC-MS. The LC traces revealed peaks with UV absorption profiles very similar to that of mirilactam A (Figure 2). The hyphenated MS spectra of these peaks displayed molecular ions at m/z 456 (the same as that of mirilactam A) and/or 438 [M + H]+. Thus, putative polyene macrolactams were found in ATCC 31565.

Structure of the Polyene Macrolactam (1, Pretilactam) was Elucidated by NMR
Compound 1 is a white amorphous solid. Its molecular formula was established as C27H35O4N by HRMS (Figure S1).
The COSY and TOCSY spectra of 1 revealed two spin systems. The first spin system (C-2 to C-15) contained four oxymethines, two methylenes, and two diene fragments (C-2 to C-5, C-12 to C-15). The COSY correlations between H-8 (δH 3.13) and an active hydrogen (δH 4.58), and between H-9 (δH 3.90) and another active hydrogen (δH 4.53, 10-OH), established two hydroxymethines: C-8 (δC 71.4) and C-9 (δC 67.0). The chemical shifts of δC 71.0 (δH 3.62) and δC 66.8 (δH 4.58) disclosed the other two oxymethines at C-7 and C-11, respectively. The HMBC correlation from H-7 to C-11 indicated that the two oxymethines must share an oxygen (as a bridge joining C-7 and C-11). This was also supported by the molecular formula of 1: with two oxygens in hydroxy groups and one oxygen in the carbonyl, only one oxygen is left for the two oxymethines. Therefore, a dihydroxy tetrahydropyran moiety was determined in 1. The HMBC correlations from H-2 and H-3 to the carbonyl at δC 166.2 and the NOESY correlation between H-2 and NH proved that the diene fragment of C-2 to C-5 is conjugated with an amide group. The HMBC correlations from H-14 (δH 6.70) to the nonprotonated olefinic carbon at δC 135.7 (C-16) and from H3-26 (δH 1.70) to the olefinic carbon C-17 (δC 130.9), and the NOESY correlation between H-14 and the C-26 methyl group (δC 13.1, δH 1.70 s), further extended the diene fragment of C-12 to C-15 to a triene fragment of C-12 to C-17. The second spin system (C-27, C-25 to C-22) contained a methyl, a methine, a methylene, and two sp2 methines. The COSY correlation between the methine (δH 3.82, H-25) and an active hydrogen (δH 7.35, -NH-), together with the chemical shifts of the methine (δH 3.82, δC 44.7), demonstrated that C-25 is connected to the nitrogen atom. The HMBC correlation from the NH group (δH 7.35) to C-1 (δC 166.2), together with the presence of only one nitrogen atom in the molecular formula of 1, established the connection of the carbonyl C-1 with C-25 via NH.
Four sp2-hybridized methine carbons (δC 128.7, 131.0, 132.1, and 139.0) could not be assigned because of the imperfect 1D and 2D NMR signals; their corresponding proton signals overlapped, except for δH 6.19, which belongs to δC 128.7. However, based on the degrees of unsaturation of 1 and its UV absorption profile (absorption maxima at 301, 343, and 364 nm suggested a hexaene unit; Figure 2), the four sp2-hybridized methines must form a diene fragment that connects the above two spin systems at C-17 and C-22 to form a hexaene unit, thereby completing the 26-membered polyene macrolactam of 1 (Figure 3). The E/Z geometries of three C-C double bonds in 1, including 18(19) and 20(21), could not be determined. Thus, the planar structure of 1 was determined by NMR interpretation. Compound 1 was designated pretilactam. Its NMR data are assigned in Table 2.
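The formula-based reasoning above (degrees of unsaturation and the protonated molecular ion) can be checked with a short calculation. The following Python sketch is illustrative only and was not part of the original analysis; the function names and the isotope-mass table are assumptions introduced here, with standard monoisotopic masses rounded to eight decimals.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard values, rounded).
MONO_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON_MASS = 1.00727647  # mass of a proton (u), used for [M + H]+


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula given as {element: count}."""
    return sum(MONO_MASS[element] * count for element, count in formula.items())


def degrees_of_unsaturation(formula):
    """Rings plus pi bonds for a C/H/N/O formula: (2C + 2 + N - H) / 2."""
    c = formula.get("C", 0)
    h = formula.get("H", 0)
    n = formula.get("N", 0)
    return (2 * c + 2 + n - h) // 2


# Molecular formula of compound 1 as established by HRMS.
pretilactam = {"C": 27, "H": 35, "N": 1, "O": 4}

mz_protonated = monoisotopic_mass(pretilactam) + PROTON_MASS
unsaturation = degrees_of_unsaturation(pretilactam)

print(f"[M + H]+ (calculated): {mz_protonated:.4f}")
print(f"Degrees of unsaturation: {unsaturation}")
```

Under these assumptions, the calculated [M + H]+ agrees with the nominal m/z 438 observed by LC-MS, and the 11 degrees of unsaturation are accounted for by the eight C-C double bonds of the hexaene and diene units, the amide carbonyl, and two rings (the macrolactam and the tetrahydropyran).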
The absolute configuration of the chiral carbons in pretilactam may be predicted bioinformatically. The chiral C-9 and C-11 should have the same S configuration because the antiSMASH predictions of KR stereochemistry in module 8 and module 7 (Figure 1B) were both A1 [3,21]. The chiral C-25 should have the same S configuration as in mirilactam A because the seven genes for 3-aminobutyrate biosynthesis from ATCC 31565 can be regarded as equivalent to those from Actinosynnema mirum (based on ≥96% amino acid sequence identity for each enzyme or protein encoded by the seven genes) [6]. The S configuration of C-25 is also supported by the stereochemistry of mirilactams C-E [12]. The NOESY correlations of H-7/OH-8 and OH-8/H-9 strongly suggested that H-7, OH-8, and H-9 are located on the same face of the tetrahydropyran ring. Thus, the chiral C-7 and C-8 should have the same R configuration (Figure 1C).
The structure of pretilactam (1) elucidated by NMR matches the structure predicted from its gene cluster rather well, except for three minor points. First, the formation of the C-C double bond 12(13) requires an active DH domain in module 6. Second, the formation of the C-6(hydroxy),7 single bond demands an inactive DH domain in module 9. Third, the substrate predictions for the acyltransferase domains AT2, AT3, AT4, and AT8 are not accurate. Similar disagreements have occurred many times in structure predictions for other type I modular PKSs [6,22-25].

The Dihydroxy Tetrahydropyran Ring Might Come from Dehydration of a Tetrahydroxy Pentane Unit in Pretilactam Biosynthesis
Pretilactam features a dihydroxy tetrahydropyran ring, which might be formed by spontaneous dehydration of a tetrahydroxy pentane unit in the final stage of pretilactam biosynthesis. Based on the present understanding of microbial polyene macrolactam biosynthesis, we propose a plausible biosynthetic pathway for pretilactam (Figure 4).
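The mass balance implied by such a dehydration can be illustrated with a short calculation. This is only a sketch: the hydrated species below is hypothetical, obtained here simply by adding H2O to the pretilactam formula C27H35O4N, and the helper names and rounded monoisotopic masses are assumptions introduced for illustration.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard values, rounded).
MONO_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON_MASS = 1.00727647  # mass of a proton (u), used for [M + H]+


def add_formulas(a, b):
    """Element-wise sum of two formulas given as {element: count}."""
    merged = dict(a)
    for element, count in b.items():
        merged[element] = merged.get(element, 0) + count
    return merged


def protonated_mz(formula):
    """Calculated monoisotopic m/z of the singly protonated ion [M + H]+."""
    neutral = sum(MONO_MASS[element] * count for element, count in formula.items())
    return neutral + PROTON_MASS


pretilactam = {"C": 27, "H": 35, "N": 1, "O": 4}  # formula from HRMS
water = {"H": 2, "O": 1}

# Hypothetical open-chain (tetrahydroxy) species before loss of water.
hydrated = add_formulas(pretilactam, water)

mz_pretilactam = protonated_mz(pretilactam)
mz_hydrated = protonated_mz(hydrated)

print(f"pretilactam [M + H]+:      {mz_pretilactam:.4f}")
print(f"hydrated species [M + H]+: {mz_hydrated:.4f}")
print(f"difference (H2O):          {mz_hydrated - mz_pretilactam:.4f}")
```

The two calculated values correspond to the nominal m/z 438 and 456 ions observed by LC-MS. This agreement is compatible with the loss of one water molecule, but it does not by itself identify the m/z 456 peak as a biosynthetic precursor.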

Pretilactam was Inactive Against Bacillus subtilis and Candida albicans
A preliminary biological activity assay indicated that pretilactam showed no growth inhibition against Bacillus subtilis CMCC 63501 or Candida albicans ATCC 10231 at an amount of ca. 25 µg/paper disk. Because pretilactam is unstable and was available only in a limited amount, other biological activity assays of pretilactam were not conducted.

Discussion
Pretilactam is a novel 26-membered polyene macrolactam. Its discovery indicates that ATCC 31565, the well-known ansamitocin producer, is a talented producer of diverse secondary metabolites.
It is rather challenging to determine the absolute configuration of many polyene macrolactams because of their instability and the unassigned E/Z geometries of some C-C double bonds. We found that pretilactam was sensitive to light, air, and acidic pH, and changed slowly in organic solvents such as methanol, acetonitrile, and DMSO. In addition, the E/Z geometries of three C-C double bonds in pretilactam remain unassigned. It is, therefore, very difficult to determine the absolute configuration of the chiral carbons in pretilactam by chemical analysis.
The plm gene cluster is believed to be responsible for pretilactam biosynthesis because it is the only candidate gene cluster for polyene macrolactam biosynthesis identified by antiSMASH in the genomic DNA of ATCC 31565. Still, genetic manipulation of plm should be carried out to confirm its biological function experimentally. In addition, the biological activity of pretilactam needs to be explored further.

Genomic DNA Sequence Analysis
The genomic DNA of ATCC 31565 was extracted from mycelia using DNeasy blood and tissue kit from QIAGEN (Hilden, Germany). The DNA was sequenced on Illumina NGS and PacBio RSII with P4/C6 chemistry using the SMRTbell template prep kit v1.0 (Pacific Biosciences, CA, USA). Sequence data were processed and assembled using HGAP3, and sequence assembly was polished with the Quiver algorithm implemented in the PacBio SMRT analysis suite v2.3.0 (Pacific Biosciences, CA, USA).

LC-MS for Polyene Macrolactam
Fermentation culture of ATCC 31565 was extracted with an equal volume of MeOH or EtOAc. The MeOH or EtOAc extract (50 mL) was vacuum-dried, then re-dissolved in 1.0 mL MeOH. A 10 µL aliquot of the MeOH solution was used for LC-MS.
The above EtOAc extract of ATCC 31565 was mixed with ODS (12 nm, S-50 µm, AAG12S50, YMC) at a proportion of 1:1.5, then vacuum-dried at room temperature. The dried mixture was re-extracted with MeOH, and the MeOH solution was used for LC-MS.
A purified sample of pretilactam (1), dissolved in MeCN and stored at 4 °C for 24 h, was used for LC-MS.

Isolation of Polyene Macrolactam (Pretilactam, 1)
A large-scale cultivation of ATCC 31565 was performed to obtain enough material for isolation and characterization of the putative polyene macrolactam(s). The fermentation cultures (50 L; from ca. 1000 plates, 50 mL/plate) were pooled and extracted with EtOAc. The organic layer was collected and evaporated under reduced pressure, which gave 175 g of oily brown residue. The residue was loaded on an ODS column (12 nm, S-50 µm, AAG12S50, YMC; 460 × 49 mm) and fractionated by a step-wise gradient (20%, 30%, 40%, 50%, and 100%) of ethanol-H2O. Fraction 4, from the 50% ethanol-H2O elution, was evaporated under reduced pressure, which afforded 257.2 mg of a crude preparation of the putative polyene macrolactam(s). The crude preparation was subjected to preparative HPLC (YMC-Pack ODS-A, 250 × 20 mm, S-5 µm, 12 nm; 50-80% CH3CN-H2O, 25 min), which gave 11.2 mg of a refined preparation of 1 after evaporation. The refined preparation was subjected to semi-preparative HPLC (YMC-Pack ODS-A, 250 × 10 mm, S-5 µm, 12 nm; 78% CH3OH-H2O) for final polishing, which yielded 4.1 mg of pure 1 (98% purity by HPLC analysis). The isolation process was conducted in the dark, and dried samples from the purification process were stored in glass vials filled with N2 gas, as polyene macrolactams are sensitive to light and O2.
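The masses reported at each purification stage can be summarized as step-wise mass recoveries and an overall isolated yield per litre of culture. The Python sketch below is a bookkeeping illustration only; the stage labels and function name are assumptions introduced here, and the percentages describe total mass carried forward, not recovery of pretilactam itself, because purity changes between stages.

```python
# Masses (mg) reported after each purification stage, in order.
culture_volume_l = 50.0
stages = [
    ("EtOAc extract residue", 175_000.0),      # 175 g
    ("ODS fraction 4 (50% ethanol-H2O)", 257.2),
    ("preparative HPLC", 11.2),
    ("semi-preparative HPLC", 4.1),
]


def step_recoveries(stage_list):
    """Return (stage name, percent of the previous stage's mass) for each step."""
    return [
        (name, 100.0 * mass / previous_mass)
        for (_, previous_mass), (name, mass) in zip(stage_list, stage_list[1:])
    ]


# Isolated yield of pure compound 1 per litre of fermentation culture.
final_yield_mg_per_l = stages[-1][1] / culture_volume_l

for name, percent in step_recoveries(stages):
    print(f"{name}: {percent:.2f}% of the previous stage by mass")
print(f"Isolated yield: {final_yield_mg_per_l:.3f} mg per litre of culture")
```

A summary of this kind makes it easy to see where most of the mass is removed (the ODS fractionation and preparative HPLC steps) and how little pure material the 50 L culture provides for downstream assays.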

NMR Assay
1H and 13C NMR spectral data were obtained at 800 and 200 MHz, respectively, on a BRUKER AVANCE III 800 spectrometer, and measured in DMSO-d6 at room temperature.

Assay of Pretilactam (1) Against Bacillus subtilis CMCC 63501 and Candida albicans ATCC 10231
A stock suspension (5-10 µL) of Candida albicans ATCC 10231 (or Bacillus subtilis CMCC 63501) was inoculated into LB broth (100 mL in a 500 mL flask) and shaken at 37 °C overnight. The fresh culture of Candida albicans ATCC 10231 (or Bacillus subtilis CMCC 63501) was added to LB broth (with agar at 1.5%, 45 °C) at a proportion of 1.0% and mixed well, then poured into plates (12.0 mL for each plate, with a diameter of 8.5 cm; 10⁶–10⁷ CFU/mL). After solidification, the LB agar plates were ready for use in the bioassay.
Filter paper disks (diameter 6.0 mm) loaded with ca. 6.25, 12.5, and 25 µg pretilactam (1) per disk were overlaid on the LB agar plates, then incubated at 37 °C for 12-15 h for microbial growth and inhibition zone formation.