Genome Mining and Metabolic Profiling Reveal Cytotoxic Cyclodipeptides in Streptomyces hygrospinosus var. Beijingensis

Two new cyclodipeptide (CDP) derivatives (1–2) and seven known cyclodipeptides (3–9) were isolated from Streptomyces 26D9-414 by a genome mining approach combined with genetic dereplication and the “one strain many compounds” (OSMAC) strategy. The structures of the new CDPs were established by 1D- and 2D-NMR and comparative electronic circular dichroism (ECD) spectral analysis. The biosynthetic gene clusters (BGCs) for these CDPs were identified through antiSMASH analysis. The link between this cdp cluster and the nine identified CDPs was established by genetic disruption. The newly discovered natural compound 2 displayed cytotoxicity against MDA-MB-231 and SW480 cells comparable to that of cisplatin, a widely used chemotherapeutic agent for the treatment of various cancers.


Introduction
Actinomycetes provide a rich source of natural products (NPs) with potential therapeutic applications, and modern "omics"-based technologies have revealed their great capacity for encoding diverse natural products [1]. Genome-guided discovery of clostrubin A [2], closthioamide [3] and cytotoxic benzolactones [4] has reinvigorated NP research, making it a more targeted and systematic endeavor. To avoid the re-isolation of known NPs, the "genetic dereplication" strategy [5] and the "one strain many compounds" (OSMAC) approach [6] have been successfully used during large-scale culture for discovering NPs with novel skeletons (such as alterbrassinoids A-D [7] and waikikiamides [8]) and novel NPs derived from post-modifications (such as the branched cyclic peptide lyciumin [9] and highly modified polytheonamide-like peptides [10]).
Cyclodipeptides (CDPs), also called 2,5-diketopiperazines (DKPs), are the smallest cyclic peptides, formed via the condensation of two α-amino acids. CDPs are mainly produced by Streptomyces [11]. They exhibit important and diverse biological properties, such as antibacterial, antifungal, antiviral, antitumor, immunosuppressive and anti-inflammatory activities [12]. Owing to their great potential for binding specific sites in enzymes or proteins, CDPs have become important pharmacophores in pharmaceutical chemistry [13]. Natural CDPs can be biosynthesized by two different machineries: one is catalyzed by large multi-modular nonribosomal peptide synthetases (NRPSs), and the other is mediated by cyclodipeptide synthases (CDPSs) [14]. The former utilizes free amino acids, and the latter hijacks aminoacyl-tRNAs (AA-tRNAs) from primary metabolism [15]. Generally, CDPSs catalyze the production of representative 2,5-DKPs, which are then modified by cyclodipeptide-tailoring enzymes (such as methyltransferases, prenyltransferases, oxidoreductases and cytochrome P450 enzymes) to give their intriguing structural diversity [14,15].

Results and Discussion
The "genetic dereplication" strain Streptomyces 26D9-414 [18], in which the BGCs of tetramycin and anisomycin were deleted, was selected for OSMAC screening of new natural products. The original medium for anisomycin production and ten other liquid media, PYJ1-J10, were selected for the mining of new compounds. Comparative HPLC analysis of the secondary metabolites was conducted, and the metabolic profile of the PYJ1 medium showed many new peaks with characteristic absorption at 224 nm and 296 nm (Figure S1). Repeated rounds of fractionation alternating between silica gel chromatography and Sephadex LH-20 column chromatography, followed by semi-preparative reversed-phase HPLC, afforded compounds 1-9 (Figure 1). Their structures were elucidated by spectroscopic methods and HR-ESI-MS data.
Antibiotics 2022, 11, 1463
(Z). The absolute configuration was established by electronic circular dichroism (ECD), and the experimental ECD spectra matched well with the calculated ECD curves of 7R, 9S (Figure 3B). Considering the biosynthetic origins, the S configuration at C-9 was consistent with natural L-proline. Taken together, compound 1 was identified as (7R,9S)-3-((Z)-benzylidene)-7-hydroxy-hexahydropyrrolo[1,2-a]pyrazine-1,4-dione.
(Table 1). These data suggest that isoleucine, rather than valine, was condensed with arginine, leading to the formation of compound 2 as a new pyrazine derivative. Considering the arginine origin, compound 2 was named argilein. This is the third arginine-containing pyrazine derivative found in natural products. Similarly, the absolute configuration of 2 was 7S, which was consistent with L-isoleucine (Figure 4).
(Table S3). The cytotoxicity of compound 2 was comparable with that of cisplatin, which has been widely used as a chemotherapeutic agent for the treatment of various cancers [26]. As to the antibiotic activity, none of the compounds showed obvious inhibitory activity against the tested bacteria and fungi, except for compound 2, which exhibited weak activity against Xanthomonas albilineans, Candida albicans and Candida sake with MIC values of 0.25 mg/mL, 1.0 mg/mL and 1.0 mg/mL, respectively (Table S4). This finding was consistent with the reported weak antibacterial activities of CDPs (MICs of 0.5-10 mg/mL) [27,28].
To correlate BGCs with the nine isolated diketopiperazines, antiSMASH analysis of the genome sequence of S. hygrospinosus var. beijingensis was conducted. The arrangement and sequence of genes within the cdp cluster showed high similarity with those of the alb cluster, which was reported to be responsible for albonoursin (5) production (Figure S2) [21]. albC encodes a cyclodipeptide synthase (CDPS), which catalyzes formation of the cyclic dipeptide precursor. The heterologous expression of albC led to the synthesis of various cyclodipeptides, including cyclo(Phe-Pro) [29,30], the possible precursor of compounds 1 and 4. The deletion of cdpA-cdpC abolished the production of all nine cyclodipeptides (Figures 5 and S4). To the best of our knowledge, this is the first finding of a cyclodipeptide synthesized by a CDPS using arginine and also the first report of proline-derived cyclodipeptides (compounds 1 and 4) from the original producing strain. The cyclodipeptide oxidases (CDOs) AlbA and AlbB usually catalyze the dehydrogenation of cyclodipeptides to form dehydrogenated cyclodipeptide derivatives [31]. Whether the hydroxyl group in 1 and the pyrazinone in 2 are installed by the CDO candidates CdpA and CdpB remains to be determined. CDPSs and CDOs both possess broad substrate selectivity and can be used to synthesize various dehydrogenated cyclodipeptide derivatives, which serve as important precursors for the development of pharmaceutical intermediates [30,32]. Gene c-blast analysis revealed that cdp gene analogs were mainly distributed in Streptomyces and Nocardiopsis, and a few were also found in Nonomuraea, Goodfellowiella, Bailinhaonella, Saccharopolyspora and Actinomadura (Figure S5).

Conclusions
Modern "omics"-based technologies have revealed the great capacity of Actinobacteria for encoding structurally diverse, biologically active natural products. To reveal the diversity of NPs encoded by Streptomyces hygrospinosus var. beijingensis, the "genetic dereplication" strategy and the OSMAC approach were used in this study. Nine CDP derivatives of two types were identified from S. hygrospinosus 26D9-414 through the genome mining strategy. The link between the cdp cluster and all the isolated CDPs was confirmed by genetic manipulation. These findings expand the repertoire of natural DKPs and reveal a CDPS with a broad range of substrates that could be developed as a biocatalyst for the future development of therapeutic agents.

General Experimental Procedures
Optical rotations were recorded with a JASCO P-2000 digital polarimeter. UV spectra were recorded on a Thermo Fisher Evolution 300 UV-vis spectrophotometer. The 1D-NMR and 2D-NMR spectra were obtained on a Bruker AVANCE III 600 MHz spectrometer with TMS as an internal standard. HR-ESI-MS spectra were recorded on an Agilent 1290 HPLC system coupled to a 6230 TOF mass spectrometer. ECD spectra were recorded using a JASCO J-1500-150ST. HPLC analysis and semi-preparative HPLC were performed with an Agilent 1260 HPLC system using an Agilent ZORBAX SB-C18 column (5 µm, 4.6 × 250 mm) and an Agilent ZORBAX SB-C18 column (5 µm, 9.4 × 250 mm), respectively. All comparative studies of crude extracts obtained with the OSMAC strategy were based on HPLC analysis; the mobile phases were CH3OH-H2O with 1‰ formic acid or trifluoroacetic acid in the water. The gradient was CH3OH-H2O: 5% at 0-5 min, 5-50% at 5-30 min, 50-95% at 30-45 min, 95% at 45-50 min, 95-5% at 50-51 min and 5% at 51-60 min, at a flow rate of 0.5 mL/min. The HPLC methods used for the separation of compounds 1-9 are described in detail in Section 4.5. Silica gel (100-200, 200-300 mesh, Qingdao Haiyang Chemical Co., Ltd., Qingdao, China) and Sephadex LH-20 gel (Uppsala, Sweden) were used for column chromatography (CC). Precoated silica gel GF254 plates (Qingdao Marine Chemical Ltd., Qingdao, China) were used for TLC monitoring combined with UV light and 10% H2SO4 in EtOH. Taq DNA polymerase and KOD-plus high-fidelity polymerase were obtained from Takara. All restriction enzymes were purchased from Thermo Scientific or Vazyme Biotech Co., Ltd. The E.Z.N.A. Gel Extraction Kit and Plasmid Mini Kit were purchased from OMEGA. PCR primers were synthesized by GENEWIZ. All solvents used for CC were of analytical grade (Shanghai Chemical Reagents Co., Ltd., Shanghai, China), and solvents used for HPLC were of HPLC grade (Sigma-Aldrich, St. Louis, MO, USA).
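The analytical gradient described above can be written down as a small table of breakpoints. The following is a minimal sketch, assuming the stated percentages refer to the CH3OH fraction of the mobile phase (the usual convention for reversed-phase runs, but not stated explicitly in the text); the function names are illustrative only.

```python
from bisect import bisect_right

# (time in min, % CH3OH) breakpoints transcribed from the gradient in the text.
# Assumption: the percentages denote the methanol fraction.
GRADIENT = [(0, 5), (5, 5), (30, 50), (45, 95), (50, 95), (51, 5), (60, 5)]
FLOW_ML_PER_MIN = 0.5  # flow rate given in the text


def methanol_percent(t_min):
    """Linearly interpolate the nominal % CH3OH at time t_min."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-60 min run")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:  # exactly at the end of the run
        return float(GRADIENT[-1][1])
    (t0, p0), (t1, p1) = GRADIENT[i], GRADIENT[i + 1]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


def methanol_volume_ml():
    """Nominal CH3OH volume per run, by trapezoidal integration of the gradient."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(GRADIENT, GRADIENT[1:]):
        total += 0.5 * (p0 + p1) / 100.0 * (t1 - t0) * FLOW_ML_PER_MIN
    return total
```

For example, `methanol_percent(17.5)` gives the composition halfway through the 5-50% ramp, and `methanol_volume_ml()` estimates methanol consumption for one 60 min run at 0.5 mL/min.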

Bacterial Strains, Plasmids, Primers and Culture Conditions
The strains, plasmids and primers used in this study are listed in Tables S1 and S2. Streptomyces and its derivatives were grown at 30 °C on solid SFM medium (2% mannitol, 2% soya flour and 1.5% agar) for sporulation and conjugation, and in TSBY liquid medium (3% tryptone soy broth, 10.3% sucrose and 0.5% yeast extract) for the isolation of chromosomal DNA [33]. All E. coli strains, including DH10B and ET12567/pUZ8002, were grown in liquid Luria-Bertani (LB) medium or on LB agar at 37 °C. Apramycin (50 µg/mL) and trimethoprim (50 µg/mL) were used when necessary. All plasmid subcloning experiments were performed in E. coli DH10B following standard protocols. General procedures for E. coli or Streptomyces manipulation were carried out according to published procedures [34].

ECD Calculations
Conformational analyses for compounds 1-2 were performed with Spartan'14 software using the MMFF94 molecular mechanics force field. Conformers within a 10 kcal/mol energy window were generated and optimized using DFT calculations at the B3LYP/6-31G(d) level. Conformers with a Boltzmann population over 1% were chosen for the ECD calculations in MeOH at the B3LYP/6-311+G(2d,p) level. The IEF-PCM solvent model for MeOH was used. The calculated ECD spectra were obtained by DFT and time-dependent DFT (TD-DFT) using Gaussian 09 and analyzed using SpecDis v1.71.
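The 1% Boltzmann cutoff used to select conformers can be illustrated with a short calculation. This is a minimal sketch, not the authors' workflow: it assumes relative energies in kcal/mol and a temperature of 298.15 K (the text does not state the temperature), and the function names and any example energies are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872041  # gas constant in kcal/(mol*K)


def boltzmann_populations(energies_kcal, temperature_k=298.15):
    """Return the fractional Boltzmann population of each conformer.

    energies_kcal: conformer energies in kcal/mol (absolute or relative).
    temperature_k: assumed temperature; 298.15 K is an assumption here.
    """
    e_min = min(energies_kcal)
    rt = R_KCAL_PER_MOL_K * temperature_k
    weights = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


def select_conformers(energies_kcal, threshold=0.01, temperature_k=298.15):
    """Return (index, population) pairs for conformers above the threshold."""
    pops = boltzmann_populations(energies_kcal, temperature_k)
    return [(i, p) for i, p in enumerate(pops) if p > threshold]
```

With hypothetical relative energies of 0.0, 1.0 and 5.0 kcal/mol, `select_conformers` keeps the first two conformers and drops the third, whose population falls well below 1%.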

Cytotoxicity Assays
The cytotoxicity of compounds 1-5 against five human cancer cell lines (HL60, A549, SMMC-7721, SW480 and MDA-MB-231) was evaluated by MTS assay. Each cell line was exposed to the tested compounds at concentrations of 40, 8, 1.6, 0.32 and 0.064 µM in triplicate. Cell viability was determined using an MTS kit according to the manufacturer's instructions [37].
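The tested concentrations form a five-point, 5-fold serial dilution starting at 40 µM. The sketch below reproduces that series and counts the treatment wells implied by the design (five compounds, five cell lines, triplicates). The function names are illustrative, and control wells are excluded because the text does not specify them.

```python
def dilution_series(top_um, fold, n_points):
    """Concentrations (µM) of a serial dilution starting at top_um."""
    return [top_um / fold**i for i in range(n_points)]


def treatment_wells(n_compounds, n_cell_lines, n_concentrations, replicates):
    """Number of treatment wells, excluding any controls."""
    return n_compounds * n_cell_lines * n_concentrations * replicates


# Design taken from the text: 40 µM top dose, 5-fold steps, 5 points, triplicate.
concentrations = dilution_series(40.0, 5, 5)
wells = treatment_wells(n_compounds=5, n_cell_lines=5,
                        n_concentrations=len(concentrations), replicates=3)
```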