Lentzeacins A-E, New Bacterial-Derived 2,5- and 2,6-Disubstituted Pyrazines from a BGC-Rich Soil Bacterium Lentzea sp. GA3-008

Pyrazines (1,4-diazines) are an important group of natural products that have tremendous monetary value in the food and fragrance industries and can exhibit a wide range of biological effects including antineoplastic, antidiabetic and antibiotic activities. As part of a project investigating the secondary metabolites present in understudied and chemically rich Actinomycetes, we isolated a series of six pyrazines from a soil-derived Lentzea sp. GA3-008, four of which are new. Here we describe the structures of lentzeacins A-E (1, 3, 5 and 6) along with two known analogues (2 and 4) and the porphyrin zincphyrin. The structures were determined by NMR spectroscopy and HR-ESI-MS. The suite of compounds present in Lentzea sp. includes 2,5-disubstituted pyrazines (compounds 2, 4, and 6) together with the new 2,6-disubstituted isomers (compounds 1, 3 and 5), a chemical class that is uncommon. We used long-read Nanopore sequencing to assemble a draft genome sequence of Lentzea sp. GA3-008, which revealed the presence of 40 biosynthetic gene clusters. Analysis of classical di-modular and single module non-ribosomal peptide synthetase genes and cyclic dipeptide synthases narrows down the possibilities for the biosynthesis of the pyrazines present in this strain.


Introduction
Pyrazines (1,4-diazines) are ubiquitous aromatic nitrogen-containing heterocycles distributed widely in nature, and are found in plants, insects, and diverse microorganisms. Alkyl-substituted pyrazines are often volatile compounds and are responsible for some of the flavors and aromas of plants, essential oils, coffee and wine, to name only a few. Recent surveys have estimated a multibillion-dollar global market for pyrazines in the food and fragrance industries [1]. Pyrazines play important ecological roles by affecting the behavior of numerous species of ants and comprise an important class of antibiotics produced by fungi such as Aspergillus spp. [2]. Interestingly, bacteria from many genera, including Pseudomonas, Mycobacterium, and Rhodococcus, can use pyrazine derivatives (such as 2-hydroxypyrazine, 2,3-diethyl-5-methylpyrazine, 2,5-dimethylpyrazine, and tetramethylpyrazine) as carbon and nitrogen sources [3,4]. Importantly, natural and synthetic pyrazines have been associated with a wide range of biological effects, including antitubercular, anticancer, diuretic, antidiabetic, insecticidal and nematicidal activities [2,5,6]. Two of the best-known pyrazine-containing medicines are the chemotherapeutic bortezomib (Velcade™) [5] and the antitubercular drug pyrazinamide.
Due to their importance in agriculture and medicine and the prospect of engineering product diversity, independent groups are investigating the biosynthetic pathways that produce pyrazines and their amino acid-derived precursors including diketopiperazines and piperazines [7-9]. Recent studies have revealed several biosynthetic routes for the synthesis of 2,5-diketopiperazines (cyclic dipeptides). These include synthesis by cyclodipeptide synthetases (CDPS) that use aminoacyl-tRNAs (aa-tRNAs) for amino acid incorporation [7], and classical di-modular non-ribosomal peptide synthetases (NRPS) that produce compounds such as thaxtomin A [10] and gliotoxin in bacteria [11], and actinopolymorphol C [12], brasiliamide B [13] and hancockiamides in fungi [13,14]. Most recently, single module NRPS-like genes containing a reductase domain have been shown to produce pyrazinones [15-17]. Among natural products, 2,5-disubstituted pyrazines are relatively common and their origin from cyclic dipeptides is evident. In contrast, 2,6-disubstituted pyrazines are less common, having thus far been isolated from Penicillium spp. [18] and myxobacteria [19], along with a 2-keto pyrazine from enterohemorrhagic Escherichia coli [16]. Here we describe the isolation and structure determination of lentzeacins A-E (1-3 and 5,6) (Figure 1) from an understudied actinomycete strain, Lentzea sp. GA3-008; compounds 1, 3, 5 and 6 are new. Together with known compounds 2 and 4, the porphyrin zincphyrin [20] was also isolated from strain GA3-008. Whole genome sequencing and analysis of the biosynthetic gene clusters (BGCs) in GA3-008 revealed the presence of a CDPS and a single module NRPS. The CDPS predicted product does not correspond to the lentzeacin structures, and the broad specificity of the NRPS adenylation domain precludes structure assignments, leaving open the possibility that the lentzeacins are biosynthesized by a non-canonical NRPS or from non-clustered enzymes [6].

Isolation and Structure Determination of Lentzeacins and Zincphyrin
Lentzea sp. GA3-008 was isolated from a desert soil sample on 10% actinomycete isolation agar and the axenic strain grown in liquid media for 10 days. The EtOAc extract of the culture broth inhibited the growth of Staphylococcus aureus ATCC 25913. Using antimicrobial assay-guided fractionation of the EtOAc extract and RP-HPLC, the active compound was identified as zincphyrin by high-resolution electrospray ionization mass spectrometry (HR-ESI-MS) and comparison of the 13C NMR data with published values [21]. Compounds 1-4 were readily detected by UV and MS in HPLC chromatograms and could be purified from the organic extract. Similar product ions were observed for compounds 1-4 in HR-MS/MS spectra (Figures S1-S7). Targeted analysis for these product ions in the total MS/MS data of the organic extract led to the identification of two additional pyrazines 5 and 6 that were purified by HPLC.
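The targeted search for shared product ions described above can be illustrated with a minimal sketch. It assumes MS/MS spectra are available as simple lists of fragment m/z values; the `Spectrum` class, the function names, the 10 ppm tolerance, the two-ion threshold and the diagnostic m/z values are all hypothetical placeholders, not values or software from this study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spectrum:
    """One MS/MS spectrum: precursor m/z plus its fragment m/z values."""
    precursor_mz: float
    fragments: List[float]


def ppm_error(observed: float, target: float) -> float:
    """Signed mass error of an observed m/z relative to a target, in ppm."""
    return (observed - target) / target * 1e6


def has_ion(spectrum: Spectrum, target_mz: float, tol_ppm: float = 10.0) -> bool:
    """True if any fragment lies within tol_ppm of the target m/z."""
    return any(abs(ppm_error(mz, target_mz)) <= tol_ppm for mz in spectrum.fragments)


def find_related(spectra, diagnostic_ions, tol_ppm=10.0, min_hits=2):
    """Return spectra containing at least min_hits of the diagnostic ions."""
    related = []
    for spec in spectra:
        hits = sum(has_ion(spec, ion, tol_ppm) for ion in diagnostic_ions)
        if hits >= min_hits:
            related.append(spec)
    return related


# HYPOTHETICAL diagnostic ions and spectra, for illustration only.
DIAGNOSTIC_IONS = [107.0491, 91.0542]
SPECTRA = [
    Spectrum(300.0, [91.0543, 107.0490, 150.0]),  # both ions within tolerance
    Spectrum(400.0, [91.0600, 120.0]),            # no ion within tolerance
    Spectrum(350.0, [107.0492, 91.0541]),         # both ions within tolerance
]

candidates = find_related(SPECTRA, DIAGNOSTIC_IONS)
print([s.precursor_mz for s in candidates])
```

Precursors flagged this way are only candidates; in the workflow described here they were subsequently purified by HPLC and characterized.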
Compounds 5 and 6 also had the same molecular formula, C 15

Analysis of the GA3-008 Genome for Pyrazine Biosynthesis
Because the biosynthetic origins of 2,6-disubstituted pyrazines have never been determined for actinomycetes and this substitution pattern is rare, we carried out whole genome sequencing to look for potential biosynthetic gene clusters that could be connected to the lentzeacins. Biosynthetically, compounds 1-6 may be traced back to tyrosine, phenylalanine, and leucine. We used Nanopore MinION long-read sequencing to assemble a draft genome for Lentzea sp. GA3-008 and submitted the 9.19 Mb genome to the program antiSMASH to predict BGCs and potentially identify the biosynthetic pathway responsible for production of the 2,5- and 2,6-disubstituted pyrazines. Forty BGCs of varying types were predicted and included six terpenes, three NRPSs and three NRPS-like proteins, seven ribosomally synthesized peptides (RiPPs), three polyketide synthase-like (PKS), two type-1 polyketide synthases (T1PKS), eight hybrid pathways, one T3PKS, two T2PKSs, and one of each of the CDPS, aminoglycoside, heterocyst glycolipid synthase-like PKS (hglE-KS), and alkaloid BGCs (Figure 3 and Table S1) [23].
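As a quick bookkeeping aid, the per-category counts enumerated in the sentence above can be tallied in a few lines. The dictionary keys are shorthand labels chosen for this sketch. Note that the categories as listed in the text sum to 39, so Table S1 remains the reference for the full set of 40 predicted BGCs.

```python
# Counts transcribed from the enumeration in the text; keys are shorthand labels.
bgc_counts = {
    "terpene": 6,
    "NRPS": 3,
    "NRPS-like": 3,
    "RiPP": 7,
    "PKS-like": 3,
    "T1PKS": 2,
    "hybrid": 8,
    "T3PKS": 1,
    "T2PKS": 2,
    "CDPS": 1,
    "aminoglycoside": 1,
    "hglE-KS": 1,
    "alkaloid": 1,
}

total_listed = sum(bgc_counts.values())
print(f"BGCs enumerated in the text: {total_listed}")

# Cluster types discussed in the text as candidates for pyrazine biosynthesis.
candidate_types = ["CDPS", "NRPS-like"]
print({name: bgc_counts[name] for name in candidate_types})
```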
From the antiSMASH output, there were two types of BGCs that could correspond to synthesis of the pyrazines, namely a CDPS and a monomodular NRPS. CDPSs recognize and reroute specific aa-tRNAs from ribosomal peptide synthesis to catalyze formation of cyclic dipeptides [24]. The specificity of CDPSs from diverse genera for aa-tRNAs has been thoroughly characterized and reviewed by Jacques et al. [24]. Thus, specificity of a new CDPS can be predicted through comparison of residues in the aa-tRNA binding pockets, P1 and P2, and phylogenetic clustering [9]. The P1 and P2 binding pockets in the GA3-008 CDPS (gene KOEHFPPE_07267) have the respective amino acid sequences CGFPGMFF and AWVRQVR that are identical (for P1) or highly similar (for P2) to known CDPSs with specificity for Cys (Figure 4). If lentzeacins were to be synthesized by a CDPS, the binding pockets would be expected to recognize Phe, Tyr, or Ile. Therefore, KOEHFPPE_07267 is inconsistent with biosynthesis of lentzeacins.
Molecules 2021, 26, x FOR PEER REVIEW 6 of 12
Figure 4. KOEHFPPE_07267 from GA3-008 is highlighted in red. The P1 and P2 motifs correspond to the amino acid residues in aminoacyl-tRNA binding pockets one and two, respectively [9]. Activity data demonstrating amino acid specificity are derived from Jacques et al. [24].
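The pocket-motif comparison described above can be sketched as a simple per-position identity score. The query motifs are those reported in the text for KOEHFPPE_07267; the reference entries and their names are hypothetical stand-ins for characterized CDPSs, and the phylogenetic clustering used alongside the motif comparison is not reproduced here.

```python
def motif_identity(query: str, reference: str) -> float:
    """Fraction of identical residues between two equal-length pocket motifs."""
    if len(query) != len(reference):
        raise ValueError("motifs must have the same length")
    matches = sum(q == r for q, r in zip(query, reference))
    return matches / len(query)


# P1 and P2 motifs reported in the text for the GA3-008 CDPS.
QUERY = {"P1": "CGFPGMFF", "P2": "AWVRQVR"}

# HYPOTHETICAL reference motifs, invented for illustration only.
REFERENCES = {
    "hypothetical_Cys_CDPS": {"P1": "CGFPGMFF", "P2": "AWVRQIR"},
    "hypothetical_Phe_CDPS": {"P1": "LGVLMFYL", "P2": "AYTNEMF"},
}


def rank_references(query, references):
    """Rank reference CDPSs by summed P1 + P2 identity to the query."""
    scored = []
    for name, pockets in references.items():
        p1 = motif_identity(query["P1"], pockets["P1"])
        p2 = motif_identity(query["P2"], pockets["P2"])
        scored.append((name, p1, p2))
    return sorted(scored, key=lambda row: row[1] + row[2], reverse=True)


for name, p1, p2 in rank_references(QUERY, REFERENCES):
    print(f"{name}: P1 identity {p1:.2f}, P2 identity {p2:.2f}")
```

A top-ranked reference with a known substrate gives a first guess at specificity; identity alone ignores conservative substitutions, so a substitution-matrix score would be a natural refinement.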
Another route for pyrazine biosynthesis involves an uncommon monomodular NRPS with a terminal reductase domain (TD). Conversion of the amino acid to the reactive amino aldehyde allows for cyclization to form pyrazines [25-28]. The genome of Lentzea sp. GA3-008 harbors a single monomodular NRPS-like gene (KOEHFPPE_08075) possessing a terminal reductase domain (Figure S45). KOEHFPPE_08075 shares 50% amino acid identity with a previously characterized carboxylic acid reductase (CAR) in Nocardia iowensis (PDB ID: 5msc) [29]. CAR enzymes are NRPS-like and functionally related to monomodular NRPSs with TD domains. Both reduce carboxylic acids to the corresponding aldehydes; however, the adenylation domains in CARs can have broad substrate selectivity [29,30]. AntiSMASH 5/NRPSPredictor2 could not predict specificity for the adenylation domain in KOEHFPPE_08075 (Table S2). KOEHFPPE_08076, a gene adjacent to the monomodular NRPS, is a homologue of Tyr-sensitive phospho-2-dehydro-3-deoxyheptonate aldolase (commonly annotated as aroF). aroF (EC 2.5.1.54) catalyzes the first step in the shikimate pathway for biosynthesis of Phe, Tyr and Trp. If KOEHFPPE_08075 is involved in lentzeacin biosynthesis, KOEHFPPE_08076 could provide control over the supply of precursor aromatic amino acid substrates. KOEHFPPE_08076 is the second copy of aroF in the Lentzea sp. GA3-008 genome. The other aroF copy is located together with enzymes of the shikimate pathway, as seen in other Lentzea sp. genomes (Figure S46). In BLAST searches against the NCBI non-redundant nucleotide sequence database, we did not find any other Lentzea sp. genome harboring a similar BGC. However, a taxonomically related strain, Allokutzneria albata DSM 44149, also a member of the family Pseudonocardiaceae, encoded a near-identical copy of this cluster (Figure S47).

Discussion
Nanopore sequencing and analysis of a draft genome of GA3-008 facilitated a search for the biosynthetic pathway leading to the lentzeacins. While this analysis leaves open the possibility that the lentzeacins are synthesized via a monomodular NRPS-like enzyme, it does not explain the presence of the 2,6-disubstituted compounds that occur together with their 2,5-disubstituted congeners. To the best of our knowledge, the co-occurrence of such a pair of disubstituted piperazines has been observed only in fungal-derived brasiliamides isolated from Penicillium brasilianum [18] and in trace amounts in the myxobacterium Chondromyces crocatus [31]. Biosynthesis of the brasiliamides was originally proposed to occur via a phenylpropanoid pathway involving phenylalanine/tyrosine ammonia lyase (PAL) and p-coumaric acid because two units of [2-13C]-phenylalanine were incorporated in brasiliamide structures [18]. These enzymes were also detected in the Lentzea sp. GA3-008 genome (Figure 3). However, a recent study convincingly demonstrated that the biosynthesis of the diketopiperazine core of brasiliamides is catalyzed by a monomodular NRPS (brs) through stereoselective in vitro activity of a purified BrsA protein on its substrate L-phenylalanine [13]. Together, these analyses leave open the possibility that the 2,5-disubstituted pyrazines may be produced by the monomodular NRPS-like gene described above, which would yield dihydropyrazine 9 followed by spontaneous oxidation to give the major compounds 2, 4 and 6 (Scheme 1a). Alternatively, rearrangement of a reactive dipeptide aldehyde 10 (Scheme 1b) followed by cyclization would yield 2,6-disubstituted dihydropyrazine 11 followed by oxidation to give the minor metabolites 1, 3 and 5. Such a scheme has a precedent in the myxobacterium C. crocatus as demonstrated through incorporation of 15N2 Val in both isomers [19].
Further studies involving labeled amino acids or production and characterization of the product of the single module NRPS KOEHFPPE_08075 should prove useful in determining the biosynthesis of these unusual actinomycete-associated metabolites. Interestingly, of the 40 BGCs identified by antiSMASH 5.0, only two showed 100% similarity to BGCs for known compounds. These included a RiPP natural product citrulassin D and two copies of the terpene geosmin linked to BGCs 19 and 33 (Table S1). The low similarity of more than 30 additional clusters to known BGCs suggests GA3-008 is a promising source of novel secondary or specialized metabolites.

Antimicrobial Assays
In the early stages of this work, we monitored the appearance of antimicrobial activity by culturing GA3-008 in 50 mL volumes of actinomycetes broth vegitone media for 10 days. Each day, 500 µL aliquots were removed and partitioned with 500 µL EtOAc followed by 500 µL n-BuOH. We tested 20 µL aliquots of each organic extract in a solid agar 96-well pin assay where the agar was inoculated with the test strains. Test plates were incubated overnight at 37 °C and zones of inhibition were observed the next day. Antimicrobial activity as a function of time is shown in Figure S49. Antimicrobial screening of the mixtures in each of the Fractions 1-25 was performed using the same solid agar assay. Compounds 1-6 and zincphyrin were tested for antimicrobial activity in liquid broth in duplicate against S. aureus ATCC 29213 following the Clinical and Laboratory Standards Institute guidelines with the modification that the highest concentration tested was 200 µM [35]. Zincphyrin was the only compound that showed antimicrobial activity against S. aureus and it was not pursued further.
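The concentration range of such a broth assay can be laid out with a short sketch. It assumes the two-fold serial dilution scheme commonly used in broth microdilution, starting from the 200 µM top concentration stated above; the number of dilution steps and the growth readout are hypothetical, and no MIC values from this study are implied.

```python
def twofold_series(top_um: float, n_steps: int) -> list:
    """Two-fold dilution series in µM, starting at the top concentration."""
    return [top_um / (2 ** i) for i in range(n_steps)]


def mic(concentrations, growth):
    """Lowest concentration with no growth, scanning down from the top well.

    growth[i] is True when visible growth was seen at concentrations[i].
    Returns None when growth occurs even at the highest concentration.
    """
    result = None
    for conc, grew in sorted(zip(concentrations, growth), reverse=True):
        if grew:
            break
        result = conc
    return result


series = twofold_series(200.0, 8)  # 8 steps is an assumed plate layout
print(series)

# HYPOTHETICAL readout: no growth in the three highest wells.
hypothetical_growth = [False, False, False, True, True, True, True, True]
print(mic(series, hypothetical_growth))
```

Scanning downward from the top well and stopping at the first well with growth avoids reporting a spuriously low value when a single lower well happens to show no growth.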