Activation of Cryptic Antibiotic Biosynthetic Gene Clusters Guided by RNA-seq Data from Both Streptomyces ansochromogenes and ΔwblA

With the increase of drug resistance caused by the improper use and abuse of antibiotics, human beings are facing a global health crisis. Sequencing of Streptomyces genomes revealed the presence of an important reservoir of secondary metabolic gene clusters for previously unsuspected products with potentially valuable bioactivity. It has therefore become necessary to activate these cryptic pathways through various strategies. Here, we used RNA-seq data to perform a comparative transcriptome analysis of Streptomyces ansochromogenes (wild-type, WT) and its global regulatory gene disruption mutant ΔwblA, in which some differentially expressed genes are associated with the abolished nikkomycin biosynthesis and activated tylosin analogue compounds (TACs) production, and also with the oviedomycin production that is induced by the genetic manipulation of two differentially expressed genes (san7324 and san7324L) encoding RsbR. These results provide a significant clue for the discovery of new drug candidates and the activation of cryptic biosynthetic gene clusters.


Introduction
Natural products from microbes play an important role in clinical treatments, animal husbandry and plant crop protection [1,2]. Actinomycetes, especially streptomycetes, are a particularly abundant source of secondary metabolites. The decreased number in clinical antibiotics and the emergence of antibiotic resistance pose new challenges for the discovery of novel antibiotics. Streptomycetes have the potential to produce more new natural products than those previously recognized under laboratory conditions. With the rapid development of genome sequencing technology and analysis tools, more and more cryptic secondary metabolites have been mined and the activated cryptic gene clusters have been identified through various strategies, which provide more opportunities to discover new natural products or drug candidates [3,4].
There are many successful examples for mining of hidden natural products. For instance, (i) Three natural products, streptothricin, geosmin and strevertene A (polyene), were successfully activated by using the reporter-guided strategy [16]; (ii) Multiplex activation strategies were implemented to awaken a cryptic biosynthetic gene cluster, resulting in the discovery of eight aromatic polyketides with two types of frameworks, two pentacyclic isomers and six glycosylated tetracyclines [17]; (iii) In Streptomyces roseosporus, mureidomycin biosynthesis was activated by the introduction of an exogenous regulatory gene ssaA [18].
The WhiB-like (Wbl) family of proteins broadly distribute in Actinobacteria, such as Streptomyces, Corynebacteria and Mycobacteria [19]. Wbls usually play key roles in the differentiation, the biosynthesis of various secondary metabolites, and the involvement in the oxidative stress response [19][20][21]. For instance, wblA plays a crucial role in the repression of daptomycin biosynthesis, which is probably mediated by the regulatory genes atrA, dptR2 and dptR3. Although WblA has been reported to have a DNA-binding HTH domain, there are few examples of direct evidence about its transcriptional regulation from EMSA experiments [22]. Wbl proteins may change their regulatory function in response to O 2 and nitric oxidation (NO) via iron-sulphur clusters essential for nutrient starvation survival [23]. In addition, some Wbls can interact with their partner proteins to modulate antibiotic resistance [24,25]. Streptomyces ansochromogenes 7100 was isolated from the northeast soil of China [26]. It can produce a peptidyl nucleoside compound nikkomycin, a promising antibiotic against phytopathogenic fungi and human pathogens [27]. Our previous work showed that the biosynthesis of novel cryptic tylosin analogue compounds 1,2,3 (TAC1,2,3 or TACs) with much higher activity than tylosin against Streptococcus pneumonia was activated by the disruption of wblA, meanwhile, nikkomycin was abolished [28].
In this study, we describe the comparative transcriptome analysis, in which some differentially expressed genes are related to the gene cluster's activation/inactivation. This would provide an efficient approach for the activation of secondary metabolites guided by RNA-seq data.

Sequencing and AntiSMASH Analysis of S. Ansochromogenes 7100 Genome
In order to obtain a complete genome sequence of S. ansochromogenes 7100, the third generation sequencing was performed on a PacBio RSII platform. A total of 157,549 quality subreads were obtained in genome sequencing with an approximate coverage fold of more than 100× (Table S1, Supplementary Materials).
Then the raw data were filtered and assembled into a nearly complete genome of 9.56 Mb (Table S2) with software Canu. The main features of the S. ansochromogenes 7100 chromosome are shown in Figure 1. The genome of S. ansochromogenes 7100 is a linear chromosome with a repetitive sequence content of 2.37% (Table S3), which is composed of 8235 protein-coding genes (Table S4), 72 tRNAs, 18 rRNA and 50 nc RNAs genes (Table S5). Of the total gene set, 98% were classified into functional categories according to NR, COG, Pfam, KEGG pathway and Gene Ontology database (Table S6). The average G + C content of the genome is 72.41%. The circle 1 (outer circle) is a marker of genome total size with a scale of 5 kb each; circles 2 and 3 (from the outside in) are genes on the forward and reverse strand respectively, which are color-coded by function; circle 4 indicates the distributed position of repeat sequence; circle 5 represents tRNAs and rRNAs; circle 6 shows GC content (the light yellow part indicates that the GC content in this region is higher than the average GC content in the genome, the blue part indicates that the GC content in this region is lower than the average GC content in the genome), circle 7 (inner circle) is GC-skew (Black represents the area where G content is greater than C, and red represents the area where C content is greater than G).
The potential secondary metabolites biosynthetic gene clusters in S. ansochromogenes 7100 were analyzed by the antiSMASH program [10,29]. A total of 41 gene clusters were identified, which include nikkomycin gene cluster, TACs gene cluster and 39 uncharacterized gene clusters that are predicted to be involved in the biosynthesis of polyketides (PKSs), lassopeptide, terpene, lanthipeptide, bacteriocin, non-ribosomal peptides (NRPSs), terpene, butyrolactone, betalactone, siderophore, ectoine, melanin and so on (Table 1). Although only a few secondary metabolites have been identified, the abundant 'cryptic' gene clusters in S. ansochromogenes 7100 implied the potential for discovery of novel bioactive products. Genomic sequence of the S. ansochromogenes 7100 was submitted to the antiSMASH program [30]. A total of 41 gene clusters were identified. Among them, gene clusters in region 1 and region 8 were responsible for the biosynthesis of nikkomycin and tylosin analogue compounds (TAC1,2,3), respectively.

Genome-Wide Differentially Expressed Genes in ∆wblA
wblA is an important regulatory gene usually involved in the sporulation, biosynthesis of secondary metabolites and oxidative stress in Streptomyces [21]. The disruption of wblA in S. ansochromogenes 7100 caused the failure of sporulation and the loss of nikkomycins production [28]. Instead, three novel 16-membered tylosin-like macrolides were accumulated in the resulting mutant strain (∆wblA), arousing much interest in how the regulatory pathway works.
To elucidate the landscape of transcriptomic changes of S. ansochromogenes 7100 and ∆wblA, RNA-seq transcriptional profiles at 24 h and 48 h were analyzed. The production of TACs was not detectable at 24 h but appeared at 48 h with the subsequent expression of the biosynthetic genes. RNAs were isolated from the WT and ∆wblA strains at the two time points for cDNA synthesis, library construction and sequencing. The transcriptome sequencing of four libraries resulted in a total unique map of 76,486,734 sequence reads, which were aligned to the reference genome of S. ansochromogenes 7100 (Table S7). More than 95% of unique mapped reads ratio in each library mapped to the S. ansochromogenes genome. The data reached saturation ( Figure S1).
RNA-seq analysis revealed a large group of differentially expressed genes, as illustrated in the Heatmap (Figure 2a). Among them, the transcriptional levels of 1162 genes up-regulated 2-fold and 2776 genes down-regulated 2-fold in ∆wblA at 24 h, and 1767 genes up-regulated 2-fold and 2349 genes down-regulated 2-fold in ∆wblA at 48 h. Comparative transcriptome analyses of WT and ∆wblA at both 24 h and 48 h identified 2594 transcripts whose differential expressions were equal or more than 2-fold. Among them, 1433 transcripts were equal or more than 5-fold, and significantly, 1147 transcripts were equal or more than 10-fold ( Figure 2b, Table S8). Three blocks were illustrated in the Heatmap (Figure 2a). In block 1, when ∆wblA incubated for 24 h, differentially expressed genes involved in some amino acids and secondary metabolites pathways were sharply decreased compared with those of WT, including the nikkomycin biosynthetic pathway and acarbose, as well as the validamycin biosynthetic pathways (map00525 in KEGG pathway) (Table S9). In block 2, the transcriptional levels of differentially expressed genes increased in the protein biosynthetic pathway, the ribosome biosynthetic pathway (map03010), and the secondary metabolites biosynthetic pathway in ∆wblA at 24 h, such as the TACs biosynthetic pathway, porphyrin and chlorophyll metabolism (map00860). In block 3, when ∆wblA incubated for 48 h, the transcriptional level of differentially expressed genes increased in environmental information processing, signal transducing, quorum sensing pathways and secretion system or transport pathways.

The Silent Nikkomycin Pathway and the Activated Tylosin Analogue Compounds (TACs) Biosynthetic Pathway in ∆wblA
Detailed analysis of the transcriptomics revealed a significant up-regulation of TAC biosynthetic genes (from pks1 to orf11) in ∆wblA (Table 2, Figure 3a). Except for orf5 which has a 2.8-fold increase, the transcriptional levels of the TAC biosynthetic genes all showed a 14-192-fold increase at 24 h ( Table 2, the column of fold change of ∆wblA_24 h against WT_24 h). Moreover, the transcriptional up-regulation of the genes ctg1_2035 or ctg1_2036 encoding methylmalonyl-CoA mutase and ctg1_2557 or ctg1_4516 encoding acetyl-CoA C-acetyltransferase at 48 h is much higher than that at 24 h; it seems that these genes might be of importance for the biosynthesis of tylosin analogue compounds (TACs). Meanwhile, with the production of the TAC compounds at 48 h, the transcriptional levels of the TACs cluster-situated biosynthetic genes increased approximately 1-10 folds in ∆wblA compared with those of wild-type 7100 at 48 h, even though their transcriptional levels are lower than those at 24 h (Table 2). We speculated that the transcriptional fold change of these genes may be the main factor leading to the production of the the TACs. The decrease of fold change of ∆wblA/WT at 48 h versus fold change of ∆wblA/WT at 24 h may be related to the feedback negative regulation. Table 2. Differentially expressed genes in the biosynthesis of tylosin analogue compounds (TACs).

Gene ID
Step or Gene Name  The blue boxes represent the precursors that incorporated into tylactone, and the orange boxes represent the precursors derived from the catabolism of five amino acids. The numbers 1-10 denote differentially expressed genes in 10 reaction steps of deduced KEGG pathway. Green numbers denote down-regulated genes encoding enzymes responsible for these reaction steps, red numbers denote up-regulated genes encoding enzymes responsible for these reaction steps. Dotted arrows represent multiple genes responsible for these reaction steps.
The transcription of the genes responsible for the TAC precursor biosynthesis was also analyzed at both 24 h and 48 h. Acetyl-CoA, methylmalonyl-CoA, ethylmalonyl-CoA and malonyl-CoA were proposed to be involved in the tylactone's assembly (Figure 3b). These precursors or intermediates are derived from primary metabolic pathways as amino acid catabolism, fatty acids catabolism, citrate cycle or pyruvate metabolism. The gene (ctg1_4798) responsible for the biosynthesis of acetyl-CoA from leucine metabolism (step 5 in Figure 3b) was transcriptionally up-regulated (6.5-fold increase at 24 h) in ∆wblA. Meanwhile, the genes (ctg1_4276, ctg1_5640, ctg1_6838 and ctg1_7415) catalyzing the branched metabolic pathway in leucine catabolism (step 4 in Figure 3b) were down-regulated, which was proposed to prompt the intermediates' flow to the formation of acetyl-CoA. The transcriptional levels of the methylmalonyl-CoA biosynthetic genes including ctg1_4919 (step 10 in Figure 3b) and ethylmalonyl-CoA biosynthesis related genes (ctg1_1366, ctg1_2557, ctg1_4516, step 1 in Figure 3b) were all apparently increased to supply more precursors for TAC biosynthesis. Transcriptional analysis of the precursors biosynthetic genes will be helpful for improvement of TACs' production by optimizing the metabolic pathways.
Nikkomycins, a group of peptidyl nucleoside antibiotics, are a potent competitive inhibitor of fungal chitin synthases, since their chemical structure is similar to the natural substrate of chitin synthase, UDPN-acetylglucosamine [31]. The abolishment of nikkomycin production in ∆wblA was also approved by analyzing the transcripts. Nearly all the nikkomycin biosynthetic genes were found to be silent at both 24 h and 48 h in ∆wblA ( Figure S2a,b and Table S9). Since no direct binding was observed between WblA and the related promoters, the mechanism that wblA disruption activated TACs' production and simultaneously abolished nikkomycin biosynthesis is still unknown; more investigations need to be performed.

Production of an Anthracycline Antibiotic Oviedomycin Activated by Disrupting Genes san7324 plus san7324L
In ∆wblA, transcription of an rsbR homologous gene ctg1_705 (named as san7324) in RNA-seq data was found to be dramatically decreased, which was consistent with the RT-PCR results (Figure 4a,b), implying that it was positively regulated by WblA. RsbR encoded by rsbR is a positive regulator modulating sigma factor B activity in Bacillus subtilis [32][33][34]. The disruption of san7324 in S. ansochromogenes 7100 abolished nikkomycin production ( Figure S3). However, unlike that in ∆wblA, the production of tylosin analogues in ∆san7324 was not detected [35]. One hypothesis is that a san7324 homologous gene ctg1_3665 (named as san7324L), which was almost not transcribed in ∆wblA, might play the same function, while san7324 was deficient. In addition, their flanking genes (encoding RsbS, RsbT or sigma factor) also showed low transcriptional levels (Figure 4c,d). Thus, the double mutation of san7324 and san7324L (∆san7324/7324L) was constructed and the HPLC results showed TACs were still not be observed. To our surprise, a new peak with maximum absorbance at 280 nm was detected. This compound exhibited an obvious inhibition zone against Bacillus subtilis (Figure 5a,b). Mass spectrometry (MS) analysis showed a [M + H] + signal at m/z 351.2, which is consistent with oviedomycin ( Figure 5c). Oviedomycin is an anthracycline antibiotic which had been previously activated in ∆adpA [36]. The production of oviedomycin implied that RsbR homologs might negatively regulate oviedomycin biosynthesis.

Understanding on Activation of Cryptic Antibiotic Biosynthetic Gene Clusters Guided by RNA-seq Data
The TACs biosynthesis was activated by the disruption of wblA, but the mechanism of the action of WblA is quite complicated because WblA as a global regulator influences a variety of target genes. Furthermore, there is a serious challenge in obtaining sufficient and active WblA protein (an unstable protein) to perform the protein-DNA interaction experiment. However, the RNA-seq data indicated that some sets of genes involved in several precursors' (Acetyl-CoA, methylmalonyl-CoA etc.) biosynthesis up-regulated in ∆wblA, which may be helpful for the biosynthesis of TACs. In addition to focusing on the role of WblA in secondary metabolites biosynthesis and morphological differentiation, we also noticed that the transcription of wblA (ctg1_3506) itself maintained at a high level in its parent strain, but decreased to 0.35-fold and 0.1-fold in ∆wblA at 24 h and 48 h, respectively. Thus, WblA seems to be a positive regulator for its own transcription, and the detailed mechanism of its action remains to be revealed.
Prior to making a decision on target genes for other products' biosynthesis, we screened the differentially expressed genes in ∆wblA and found the transcription of san7324 and san7324L was substantially down-regulated more than 10-fold in ∆wblA at 24 h, and also their deduced products (enzymes) are probably involved in biosynthetic pathways, therefore the two genes were selected for further experiment. The results demonstrated that oviedomycin production can be triggered by the disruption of san7324 and san7324L, but further investigations on their specific roles need to be conducted in the future. The function of Wbl family proteins has been reported in a number of studies. Some of them had the capacity to form a protein-protein complex and change the structural conformation to regulate its targets, while some Wbl proteins can also modulate sigma factors [24,25,37,38], suggesting that they may play broad roles in multiple pathways. In conclusion, mining of the RNA-seq data provided a useful alternative approach for searching target genes, which may be related to secondary metabolisms. It would be more effective to activate silent gene clusters if combined with other strategies.

HPLC Analysis of Oviedomycin
The cultures were filtered through gauze, extracted with chloroform, dried in vacuo and then re-dissolved in 1 mL of methanol. The detection of oviedomycin was performed by HPLC with a ZORBAX SB-C18 reverse phase column (4.6 mm × 250 mm, 5 µm; Agilent) as described previously [36].

PacBio Sequencing, Assembly and Gene Annotation
The 64 µg of high-quality genomic DNA was used to prepare the SMRTbell library, and then sequenced on a PacBio RS II platform [39].
Prior to assembly, raw reads containing low-quality were filtered. After pre-process, a total of 2.09 Gb of the usable data were retained for the following assembly (Table S1). We used Canu software to assemble the subreads into scaffolds. The final assembly size is 9.56 Mb with scaffold N50 of 9.21M (Table S2). Repeat sequences were identified by Repeat Masker software. The estimated repeat sequences account for 2.37% of the genome assemblies (Table S3). Prodigal was performed to predict genes. In total, 8325 proteinencoding genes were generated for S. ansochromogenes 7100 (Table S4). The tRNA-coding genes were predicted by tRNAscan-SE (v1.3.1) software. rRNA and other ncRNA-coding genes were identified by Infernal 1.1 against Rfam database (http://rfam.xfam.org/, access on 9 September 2021) ( Table S5). Annotation of predicted genes was conducted by BLASTP against NCBI non-redundant protein sequence database (NR), Swissprot and TrEMBL protein databases with E-value < 10 −5 . Gene Ontology (GO) terms were retrieved from NR outputs by Blast2GO software [40]. Pathway analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation service KAAS (Table S6).

RNA Isolation, Library Construction and Sequencing
The mycelia of S. ansochromogenes 7100 and ∆wblA were harvested at 24 h and 48 h from a 50-mL culture by centrifugation at 12,000 rpm for 15 min at 4 • C. The cell pellets were quickly frozen in liquid N 2 and then stored at −80 • C. Total RNAs were isolated using the TRIzol reagent, and DNase I was used to digest the total DNA according to the manufacturer's protocol. RNA degradation or contamination was monitored with 1% agarose gel electrophoresis. RNA purity was checked by using the NanoPhotometer ® spectrophotometer (IMPLEN, Westlake Village, CA, USA). Strand-specific RNA-Seq libraries were constructed and sequenced by the Illumina HiSeq 2000 system, 100 bp paired-end reads were generated and the RNA-seq data were analyzed by Majorbio company.

Quality Control and Reads Mapping to the Reference Genome
The raw data were filtered using a Perl program by removing low-quality sequences, reads with more than 5% of N bases (unknown bases) and reads containing adaptor sequences. Then the clean data were mapped to the genome sequences of S. ansochromogenes 7100 by the Bowtie2 program [41].

Quantification of Gene Expression Level and Differential Expression Analysis
An abundance of transcripts was estimated using RSEM and transformed into formula FPKM (fragments per kilobase per million mapped reads). Differentially expressed genes (DEGs) were calculated by using R package DESeq [42]. Genes with an adjusted p-value (p-adjust) < 0.001 found by DESeq were assigned as differential expressions, and those with fold change |Log2FC| ≥ 1 were defined to be either up-or down-regulated genes, respectively. The Heatmap was made by R package heatmap.2.

Data Availability
The genomic DNA sequence of S. ansochromogenes 7100 is deposited in the China National Microbiology Data Center (NMDC) with accession numbers NMDC60029072. RNA-seq raw data were deposited in the China National Microbiology Data Center (NMDC) with accession numbers NMDC40009909 (WT_24 h), NMDC40009910 (WT_48 h), NMDC40009911 (∆wblA_24 h), NMDC40009912 (∆wblA_48 h).

Conclusions
RNA-seq analyses demonstrated that high-level transcription of most genes was associated with the production of nikkomycin in WT and TACs in ∆wblA appeared at 24 h, but decreased significantly at 48 h. Interestingly, double mutation of san7324 and san7324L (∆san7324/7324L) led to the production of oviedomycin, which was not produced in wildtype S. ansochromogenes under laboratory conditions. It was suggested that RsbR homologs (San7324/7324L) might negatively regulate oviedomycin biosynthesis. Moreover, the findings would provide an effective approach to dissect the production and biosynthetic pathways of natural secondary metabolites, and also provide insights into the activation of cryptic natural products.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/antibiotics10091097/s1, Figure S1: Saturation curves of sequencing data in RNA-seq, Figure S2: Differentially expressed genes involved in nikkomycin biosynthesis in both WT and ∆wblA, Figure S3: HPLC analysis of nikkomycin production in WT and ∆san7324, Table S1: Statistics of filtered sequencing data, Table S2: Assembly of genome sequencing data, Table S3: Summary of repeat elements identified, Table S4: Statistics of prediction of the proteins, Table S5: Non-coding RNAs, Table S6: Annotation of genes, Table S7: Genome mapping statistics of RNA-seq data, Table S8: Significantly expressed genes in Venn diagram, Table S9: FPKM value of part of differentially expressed genes.