Interplay between Non-Coding RNA Transcription, Stringent/Relaxed Phenotype and Antibiotic Production in Streptomyces ambofaciens

While in recent years the key role of non-coding RNAs (ncRNAs) in the regulation of gene expression has become increasingly evident, their interaction with the global regulatory circuits is still obscure. Here we analyzed the structure and organization of the transcriptome of Streptomyces ambofaciens, the producer of spiramycin. We identified ncRNAs including 45 small-RNAs (sRNAs) and 119 antisense-RNAs (asRNAs I) that appear transcribed from dedicated promoters. Some sRNAs and asRNAs are unprecedented in Streptomyces and were predicted to target mRNAs encoding proteins involved in transcription, translation, ribosomal structure and biogenesis, and regulation of morphological and biochemical differentiation. We then compared ncRNA expression in three strains: (i) the wild-type strain; (ii) an isogenic pirA-defective mutant with central carbon metabolism imbalance, “relaxed” phenotype, and repression of antibiotic production; and (iii) a pirA-derivative strain harboring a “stringent” RNA polymerase that suppresses pirA-associated phenotypes. Data indicated that the expression of most ncRNAs was correlated to the stringent/relaxed phenotype suggesting novel effector mechanisms of the stringent response.


Introduction
In recent years, the key role that small non-coding RNAs (ncRNAs) play in the regulation of gene expression in bacteria is becoming increasingly evident. ncRNA-mediated gene regulation involving riboswitches, trans-encoded small RNAs (herein referred to as sRNAs), or cis-encoded antisense RNAs (herein referred to as asRNAs) modulates many essential physiological and stress responses [1][2][3][4]. Among ncRNAs, sRNAs and asRNAs mostly act on target mRNAs via base-pairing leading to positive or negative modulation of the target gene expression, providing the bacterial cell with a very simple, cheap and effective gene regulation mechanism that is alternative to more complex and expensive mechanisms based on protein-nucleic acid interactions [5,6].
sRNAs are encoded elsewhere in the genome with respect to gene coding for their target mRNAs and are usually able to act on multiple targets although with a limited base pairing that often requires RNA chaperone proteins [5][6][7]. asRNAs and their target mutant in more detail, focusing on some aspects related to its relaxed phenotype including ncRNA expression. To this purpose we introduced into this strain the Nonomuraea gerenzanensis rpoB(R) gene, coding for a mutant type of RNA polymerase beta-chain [36]. The product of rpoB(R) is characterized by five amino acid substitutions located within or close to the so-called rifampin resistance clusters that play a key role in mimicking a "stringent" phenotype [36,37], by activating the secondary metabolism. Indeed, rpoB(R) markedly activated antibiotic biosynthesis in the wild-type Streptomyces lividans strain 1326 and also in strain KO-421, a relaxed (rel) mutant unable to produce ppGpp [36]. The rationale of transcriptomic approach of the present study was to better understand the interplay between ncRNA expression and control of morphological and biochemical differentiation.

S. ambofaciens Transcriptome: Structure and Organization
The transcriptome of S. ambofaciens ATCC 23877 linear chromosome was redefined from RNAseq data of bacteria growing in YS broth at different time points (48 h, 72 h, 96 h, and 120 h) based on Rockhopper software [38][39][40] predictions. The original 7229 genes located on the linear chromosome were structured in 5587 transcriptional units (TUs). Furthermore, 4433 TUs were monocistronic while 1154 were polycistronic, with a maximum of 23 genes included in a single unit (Table 1, and Supplementary Table S1). Only 53% of the originally defined genes (2944) retained the exactly original annotation while, for the other genes, either new 5 -UTR, 3 -UTR, or both, sequences, or polycistronic arrangements were individuated. The new annotation increased the portion of linear chromosome nucleotides included in the transcriptome from 87% to 90.8%. This 3.8% increase leads to a doubling of the mapped reads falling into annotated transcripts, which increase from 45% to 94% on average on S. ambofaciens ATCC 23877 strain. This gain is even more impressive looking at the 48 h timepoint (Supplementary Table S2). Both polycistronic and monocistronic transcription units appeared to be uniformly distributed along the linear chromosome of S. ambofaciens as shown in Figure 1 that illustrates the transcription units as a function of their lengths and locations in the genome map. The median and length range of these transcription units are reported in Table 1. Some monocistronic and polycistronic transcription units largely exceeding the average length were located near the chromosome "right arm" (Figure 1). Other large monocistronic transcription units mapped to a specific "core" region of the S. ambofaciens chromosome where the spiramycin gene cluster and additional gene clusters for secondary metabolites are located. Computational analysis also reveals 162 new transcripts putatively originated from dedicated promoters. We classified them into sRNAs (Supplementary Table S3) and asRNAs (herein referred as asRNAs I) according to the fact that, based on our re-annotation, they overlap on the opposite strand of another gene ( Figure 1, and Supplementary Table S4).
Furthermore, we identified asRNAs resulting from divergent transcription (herein referred to as asRNAs II) (Supplementary Table S5) and asRNAs resulting from the overlap of the terminal region of convergently oriented genes (cutoRNAs, herein referred to as asRNAs III) (Supplementary Table S6 Figure 2. Abundances and median plus length ranges of these three classes of asRNAs are reported in Table 1, while Figure 1 illustrates the asRNAs I, II, and III as a function of their lengths and locations in the genome map. For asRNA II and asRNA III, only the total overlaps between the two involved genes were considered. Most sRNAs and asRNAs I and II were mapped in the "core" region of the S. ambofaciens linear chromosome, at variance with asRNAs III that were uniformly distributed along the chromosome. The high number of asRNA III (also known as cutoRNAs) is noteworthy, a typical feature of streptomycetes that have genomes with high GC content and weak transcription termination [20]. We need to specify that our classification is based on only the transcriptional landscape and that open reading frames (ORFs) were not investigated among the newly identified transcripts; doing this a different classification could originate. The presence of some asRNAs I, II, and III with lengths largely exceeding the average lengths (Table 1) is also remarkable. based on our re-annotation, they overlap on the opposite strand of another gene ( Figure  1, and Supplementary Table S4). Furthermore, we identified asRNAs resulting from divergent transcription (herein referred to as asRNAs II) (Supplementary Table S5) and asRNAs resulting from the overlap of the terminal region of convergently oriented genes (cutoRNAs, herein referred to as asRNAs III) (Supplementary Table S6 Figure 2. Abundances and median plus length ranges of these three classes of asRNAs are reported in Table 1, while Figure 1 illustrates the asRNAs I, II, and III as a function of their lengths and locations in the genome map. For asRNA II and asRNA III, only the total overlaps between the two involved genes were considered. Most sRNAs and asRNAs I and II were mapped RNAseq data showed that 11 (out of 49, i.e., about 25%) sRNAs and 39 (out of 119, i.e., about 33%) asRNAs I exhibited clear differential expression during the growth of the S. ambofaciens ATCC 23877 strain (Figure 3; Supplementary Table S7). In particular, most of them (7 sRNAs and 31 asRNAs I) were upregulated in at least one of the later time points  (72 h, 96 h, or 120 h) when compared to the 48-h expression, while showing an increasing  trend for all the later time points in all cases. Conversely, only a few of them (4 sRNAs and  8 asRNAs I) were more expressed at 48 h than at later time points. genomes with high GC content and weak transcription termination [20]. We need to specify that our classification is based on only the transcriptional landscape and that open reading frames (ORFs) were not investigated among the newly identified transcripts; doing this a different classification could originate. The presence of some asRNAs I, II, and III with lengths largely exceeding the average lengths (Table 1) is also remarkable. RNAseq data showed that 11 (out of 49, i.e., about 25%) sRNAs and 39 (out of 119, i.e., about 33%) asRNAs I exhibited clear differential expression during the growth of the S. ambofaciens ATCC 23877 strain ( Figure 3; Supplementary Table S7). In particular, most of them (7 sRNAs and 31 asRNAs I) were upregulated in at least one of the later time points (72 h, 96 h, or 120 h) when compared to the 48-h expression, while showing an increasing trend for all the later time points in all cases. Conversely, only a few of them (4 sRNAs and 8 asRNAs I) were more expressed at 48 h than at later time points.   Table S7). Log2Fold changes of sRNA (a) and asRNA I (b) resulting differentially expressed in at least one timepoint were reported. The sRNAs and asRNAs I are arranged on the x-axis according to the sRNA and asRNA I ID numbering reported in Supplementary Tables S3 and S4.

Novel sRNAs and Putative Targets
Only very few sRNAs with known functions have been functionally characterized in Streptomyces. In Streptomyces coelicolor A3(2), Vockenhuber et al. [23] identified 63 ncRNAs including 29 cis-encoded antisense RNAs and confirmed expression for 11 of them which are now included in Rfam database (scr1601, scr2736, scr2952, scr3202, scr3920, scr4115, scr4389, scr4632, scr5676, scr6106, and scr6925) and in prevalence show growth-phase dependent expression. Of these ncRNAs, 5 out of 11 (scr1601, scr2736, scr2952, scr3920, and  Table S7). Log2Fold changes of sRNA (a) and asRNA I (b) resulting differentially expressed in at least one timepoint were reported. The sRNAs and asRNAs I are arranged on the x-axis according to the sRNA and asRNA I ID numbering reported in Supplementary Tables S3 and S4.
The Rfam database (http://rfam.xfam.org/, accessed on 15 February 2020) was first used to search for functional information and the degree of phylogenetic conservation of predicted S. ambofaciens sRNAs (Supplementary Table S8). Computational analysis predicted scr2736 homolog (sRNA #10) among the sRNAs, in addition to other conserved sRNAs including the bacterial small signal recognition particle RNA (SRP RNA) (sRNA #20), and 6C RNA (sRNA #24). 6C RNA is a class of ncRNA present in actinomycetes, which is characterized by a conserved RNA structure having two stem-loop structures each containing six or more cytosine residues. Transcription of the S. coelicolor 6C RNA increases during sporulation [22]. In Mycobacterium tuberculosis, 6C sRNA regulates genes involved in various processes including DNA replication and protein secretion [41].
Trying to understand the reason for the possible misclassification of scr1601, scr4115, scr4677, scr5239, scr5676, and scr3202 homologs in S. ambofaciens we went back to timepoint predictions in search of possible prediction errors (Supplementary Figure S1). Only for two of them (scr5239, scr5676), we found predictions for both UTR and ncRNAs in different days and relevant signals corresponding to the expected ncRNAs length, so we decided to add them to the list of sRNA as sRNA #44 and sRNA #45.
Overall, these results seem to associate most of S. ambofaciens sRNAs with gene functions involved in the control of mycelial growth, morphological differentiation, secondary metabolism, and stress responses. Indeed, one might note sRNA #2 targeting two paralogs coding for diadenosine tetraphosphate (Ap 4 A) hydrolase (RS00910 and RS35245). It may also be worth noting the presence of two sRNAs, sRNA #3 and sRNA #43, targeting a pSAM2-encoded protein (RS19305) harboring a GGDEF domain that is present in diguanylate cyclase (c-di-GMP synthetase) or its enzymatically inactive variants. In Streptomyces c-di-GMP is a central component of the signal transduction network by controlling the activity of the developmental master regulator BldD [53].

Antisense Transcription in S. ambofaciens
Although it was initially considered non-functional transcriptional noise, antisense transcription is increasingly considered important in regulating gene expression [10]. This assumption is corroborated by the high phylogenetic conservation of several asRNAs. In S. ambofaciens we detected 119 asRNAs putatively expressed from a dedicated promoter (asRNAs I) targeting 146 mRNAs (Supplementary Table S4). Two of them were antisense to tRNAs, seven to pseudogenes, three to other ncRNA, and two to mixed feature types. The remaining 133 targeted newly defined gene extensions (UTR and within_operon regions) (19), CDS (83), and a mix of the three (31). Only 21 antisense_ncRNA accounted for less than 90% of their length for the target TUs. Functional enrichment analysis by Clusters of Orthologous Groups (COGs) demonstrated that asRNA-targeted CDSs, with respects to whole-genome CDSs, showed a prevalence in the COGs J (translation, ribosomal structure, and biogenesis), F (nucleotide transport and metabolism), and C (energy production and conversion) ( Figure 4). GlassGO analysis revealed that about 30% of class I asRNAs were specific to S. ambofaciens: 10% of the sequences (12 out of 119) were identified only in S. ambofaciens ATCC 23877 strain while 20% of the sequences (24 out of 119) were identified in both ATCC 23877 and DSM 40697 strains. The remaining 70% of the sequences appeared to be evolutionarily conserved: 56.3% were present in the Streptomyces genus, and 13.4% in both Streptomyces and other genera (Supplementary Table S11). Although this tool does not allow assess whether an ncRNA is expressed or not, the results are accurate (see section 4.7. of the Materials and Methods).
Among asRNA-targeted mRNAs of genes involved in energy production and conversion, we found gltA2 (RS15735) coding for citrate synthase 2, nuoF (RS20780), nuoL2 (RS21040), and nuoM2 (RS21045). nuoF, nuoL2, and nuoM2 code for subunits of the NADH:quinone oxidoreductase multi-protein complex (the respiratory complex I) (Supplementary Table S4). Like other streptomycetes, S. ambofaciens possesses a complete 14subunit encoding nuo operon (nuoA-N), and an additional copy of many nuo genes (nuoA2, nuoB2, nuoD2, nuoH2, nuoI2, nuoK2, nuoL2, nuoM2, and nuoN2). While nuoD2 is separated from the other nuo genes, the other ones, i.e., nuoA2, nuoB2, nuoH2, nuoI2, (YnuoJ) nuoK2, nuoL2, nuoM2, and nuoN2 are clustered together in an operon (nuo2). Interestingly, the asRNAs targeting nuoF, nuoL2, and nuoM2 are evolutionarily conserved in streptomycetes [20]. Two asRNAs that appear to be conserved in Streptomyces, albeit not described so far in this genus, map in the biosynthetic gene cluster for aminobacteriohopanetriol (a C35 GlassGO analysis revealed that about 30% of class I asRNAs were specific to S. ambofaciens: 10% of the sequences (12 out of 119) were identified only in S. ambofaciens ATCC 23877 strain while 20% of the sequences (24 out of 119) were identified in both ATCC 23877 and DSM 40697 strains. The remaining 70% of the sequences appeared to be evolutionarily conserved: 56.3% were present in the Streptomyces genus, and 13.4% in both Streptomyces and other genera (Supplementary Table S11). Although this tool does not allow assess whether an ncRNA is expressed or not, the results are accurate (see Section 4.7 of the Materials and Methods).
We found many putative asRNAs that are transcribed complementary to genes coding for regulatory proteins that oversee morphological and biochemical differentiation. In particular, an asRNA is transcribed complementary to phoR (RS16370) coding for the sensor kinase of the two-component PhoR-PhoP system [54]. Moreover, another asRNA is transcribed complementary to whiB coding for the regulatory protein WhiB that is essential for morphological differentiation in Streptomyces [55,56]. In addition, some asRNAs appear to also target genes coding for proteins involved in the control of secondary metabolite biosynthesis, and, in particular, spiramycin production and resistance. Precisely, asRNAs #88 is transcribed complementary to smrR mRNA (Supplementary Table S4). As mentioned above, smrR codes for a transcriptional activator of the spiramycin gene cluster [57,58]. Another asRNA, asRNAs #87, is transcribed complementary to smrB mRNA (Supplementary Table S4).

Expression of ncRNAs in S. ambofaciens ATCC 23877, and Derivative ΩpirA and ΩpirA rpoB(R) Mutants
With the aim of gaining some insight into the possible biological effects of predicted ncRNAs, we analyzed their expression levels in three S. ambofaciens strains: the wild type ATCC 23877 strain (hereafter indicated as w.t.), an isogenic pirA-defective mutant (here referred to as ΩpirA) that is characterized by central carbon and energy metabolism imbalance, high sensitivity to oxidative injury, and repression of polyketide antibiotic production [35], and a pirA-defective derivative strain harboring a stringent RNA polymerase exhibiting partial suppression of pirA-defective phenotypes (here referred to as ΩpirA rpoB(R)). Specifically, in the ΩpirA strain the integrative plasmid pTYM-18 (a ΦC31 Att/Int system-based vector) inactivates pirA, encoding a redox-sensitive negative modulator of very long-chain acyl-CoA dehydrogenase, which catalyzes the first committed step of the beta-oxidation pathway [35].
RpoB(R) harbors a specific histidine-to-asparagine substitution in the rifampin resistance cluster I, which was believed to be essential for the activation of the secondary metabolism by mimicking a "stringent" phenotype [36,37], and, indeed, we found that rpoB(R) in S. ambofaciens ΩpirA increased spiramycin (in YS medium) (Figure 5a,b) and antimycin (in SFM medium) (Figure 5c,d) production. This effect may be a consequence of rpoB(R)-mediated phenotypic suppression of the "relaxed" phenotype exhibited by ΩpirA strain that was characterized by upregulation of genes involved in translation, ribosomal structure and biogenesis, and amino acid biosynthesis, and downregulations of spoT, encoding the enzyme involved in (p)ppGpp biosynthesis [35]. ΩpirA strain that was characterized by upregulation of genes involved in translation, ribosomal structure and biogenesis, and amino acid biosynthesis, and downregulations of spoT, encoding the enzyme involved in (p)ppGpp biosynthesis [35]. Partial suppression of pirA-associated phenotypes was also evident at the transcriptome level (Supplementary Table S12). In these experiments, w.t., ΩpirA, and ΩpirA rpoB(R) were cultivated in YS broth. In this medium, growth rates during rapid growth phase 1 (RG1), and final biomass values of ΩpirA were slightly lower as compared to w.t., while in ΩpirA rpoB(R) and w.t. these parameters were similar (Figure 5a). This finding was also supported by maSigPro analysis of RNA-Seq time series dataset that identified the expression pattern of 1166 of the most variable genes during the time course subdivided into 9 clusters (Supplementary Table S12). The median profile of each cluster was inferred from the expression patterns and was used to represent the expression profile of the strains (Figure 6). For most of the clusters, the expression profile of ΩpirA rpoB(R) shows an intermediate profile between w.t. and ΩpirA if not a complete rescue of either the expression levels, time course profile, or both, as in cluster 8 and in cluster 4, supporting the experimental data. Partial suppression of pirA-associated phenotypes was also evident at the transcriptome level (Supplementary Table S12). In these experiments, w.t., ΩpirA, and ΩpirA rpoB(R) were cultivated in YS broth. In this medium, growth rates during rapid growth phase 1 (RG1), and final biomass values of ΩpirA were slightly lower as compared to w.t., while in ΩpirA rpoB(R) and w.t. these parameters were similar (Figure 5a). This finding was also supported by maSigPro analysis of RNA-Seq time series dataset that identified the expression pattern of 1166 of the most variable genes during the time course subdivided into 9 clusters (Supplementary Table S12). The median profile of each cluster was inferred from the expression patterns and was used to represent the expression profile of the strains (Figure 6). For most of the clusters, the expression profile of ΩpirA rpoB(R) shows an intermediate profile between w.t. and ΩpirA if not a complete rescue of either the expression levels, time course profile, or both, as in cluster 8 and in cluster 4, supporting the experimental data.  Table S12) and median profile of each cluster was inferred from the expression patterns.
The growth of Streptomyces follows a biphasic curve in which two phases of rapid growth (RG1, RG2), interspersed by a transition phase (T), precede the stationary phase (S). ncRNA levels were compared in w.t., ΩpirA, and ΩpirA rpoB(R) at two corresponding time points: 48 h, RG1 and 120 h, stationary phase. Results demonstrated that 21 (out of 45, i.e., about 47%) sRNAs and 72 (out of 119, i.e., about 70%) asRNAs (asRNAs I) were differentially expressed at least in one strain when compared to the w.t. (Supplementary  Table S13). The plots in Figure 7 show log2fold changes (log2FC) of the mutants compared to w.t. at 48 and 120 h. The data indicated that at 48 h more ncRNAs were differentially expressed in ΩpirA rpoB(R) with respect to 120 h (42 and 16, respectively), while for ΩpirA the number was similar (55 and 52, respectively). At 48 h the cloud was shifted up toward ΩpirA rpoB(R) indicating that, in general, the delta in expression level was higher for this strain, while at 120 h this was not true anymore. At both time points we could notice that the greatest part of differentially expressed genes was not proportional between the two mutants (not within the red lines) but tended to be differentially expressed only in ΩpirA genotype. sRNAs are generally equally distributed between differentially up and downregulated features, apart from ΩpirA rpoB(R) at 120 h which shows only 3 upregulated sRNAs (Supplementary Table S13). asRNAs showed clear trends of downregulation in mutant strains (85% to 97% of the differentially expressed features, Supple-  Table S13). The plots in Figure 7 show log2fold changes (log2FC) of the mutants compared to w.t. at 48 and 120 h. The data indicated that at 48 h more ncRNAs were differentially expressed in ΩpirA rpoB(R) with respect to 120 h (42 and 16, respectively), while for ΩpirA the number was similar (55 and 52, respectively). At 48 h the cloud was shifted up toward ΩpirA rpoB(R) indicating that, in general, the delta in expression level was higher for this strain, while at 120 h this was not true anymore. At both time points we could notice that the greatest part of differentially expressed genes was not propor-tional between the two mutants (not within the red lines) but tended to be differentially expressed only in ΩpirA genotype. sRNAs are generally equally distributed between differentially up and downregulated features, apart from ΩpirA rpoB(R) at 120 h which shows only 3 upregulated sRNAs (Supplementary Table S13). asRNAs showed clear trends of downregulation in mutant strains (85% to 97% of the differentially expressed features, Supplementary Table S13). Furthermore, ΩpirA rpoB(R) had an opposite trend at 48 h with 62% of upregulated sRNAs.  Table S13). Log2fold changes of the two mutants compared to the w.t. strain were reported in the plots as indicated by the labels.
As for asRNAs I targeting genes involved in secondary metabolism, we found that asRNA #119 was less expressed in ΩpirA than in both w.t. and ΩpirA rpoB(R). This asRNA is transcribed complementary to afsA coding for A-factor biosynthesis protein AfsA. Expression of asRNA #76, which was transcribed complementary to whiB, exhibited a similar trend. This asRNA was more expressed in w.t. than in ΩpirA, and even more in ΩpirA rpoB(R). In addition, asRNA #87, which is transcribed complementary to smrB in the spiramycin gene cluster, was also more expressed in w.t. than in ΩpirA, even more in ΩpirA rpoB(R) at earlier time point. smrB encodes for ribosomal protection factor that is involved in spiramycin resistance [60,61]. Therefore, it would be interesting to understand if there is a link between spiramycin production and expression of asRNA #87.

Discussion and Conclusions
In this study, the redefinition of monocistronic and polycistronic transcription units, and other structural elements including 5'-and 3'-UTR and intercistronic regions of the S. ambofaciens ATCC 23877 transcriptome was preliminary to the identification of novel sRNAs and asRNAs. We identified 45 sRNAs and 119 asRNAs (asRNAs I) that are predicted to be transcribed from dedicated promoters. Some sRNAs were previously identified in Streptomyces. Other sRNAs are novel, and bioinformatic analysis predicted several mRNAs encoding proteins involved in the control of mycelial growth, morphological differentiation, secondary metabolism, and stress responses, which may be targeted by these sRNAs. Indeed, most putative sRNA targets code for key proteins involved in transcrip-  Table S13). Log2fold changes of the two mutants compared to the w.t. strain were reported in the plots as indicated by the labels.
As for asRNAs I targeting genes involved in secondary metabolism, we found that asRNA #119 was less expressed in ΩpirA than in both w.t. and ΩpirA rpoB(R). This asRNA is transcribed complementary to afsA coding for A-factor biosynthesis protein AfsA. Expression of asRNA #76, which was transcribed complementary to whiB, exhibited a similar trend. This asRNA was more expressed in w.t. than in ΩpirA, and even more in ΩpirA rpoB(R). In addition, asRNA #87, which is transcribed complementary to smrB in the spiramycin gene cluster, was also more expressed in w.t. than in ΩpirA, even more in ΩpirA rpoB(R) at earlier time point. smrB encodes for ribosomal protection factor that is involved in spiramycin resistance [60,61]. Therefore, it would be interesting to understand if there is a link between spiramycin production and expression of asRNA #87.

Discussion and Conclusions
In this study, the redefinition of monocistronic and polycistronic transcription units, and other structural elements including 5 -and 3 -UTR and intercistronic regions of the S. ambofaciens ATCC 23877 transcriptome was preliminary to the identification of novel sRNAs and asRNAs. We identified 45 sRNAs and 119 asRNAs (asRNAs I) that are predicted to be transcribed from dedicated promoters. Some sRNAs were previously identified in Streptomyces. Other sRNAs are novel, and bioinformatic analysis predicted several mRNAs encoding proteins involved in the control of mycelial growth, morphological differentiation, secondary metabolism, and stress responses, which may be targeted by these sRNAs. Indeed, most putative sRNA targets code for key proteins involved in transcription control including sigma and transcriptional factors.
Among the sRNAs that were predicted to target genes encoding transcriptional factors, it may be worth noting the cases of the sRNAs targeting the pleiotropic Lrp and AcrR paralogs. Specifically, the Lrp RS15615 was predicted to be targeted by 3 sRNAs. Lrp RS15615 is the homolog of SCO3361 in S. coelicolor A3(2). SCO3361 functions as a pleiotropic regulator controlling secondary metabolism and morphological development [62]. In particular, it activates actinorhodin (Act) production by directly binding to actII-ORF4 promoter, and it stimulates the expression of amfC, whiB, and ssgB thus promoting hyphae formation and sporulation [62]. Phenylalanine and cysteine were identified as the effector molecules of SCO3361, with phenylalanine reducing the binding affinity, whereas cysteine increasing it [62].
In this regard, one might note that 3 Hfq-dependent sRNAs, namely, DsrA, MicF, and GcvB, each independently downregulate the lrp transcript in E. coli [63]. MicF and DsrA interact with an overlapping site early in the lrp ORF, while GcvB acts upstream in the long lrp 5 -UTR [63]. In particular, GcvB was responsible for lrp downregulation in response to oxidative stress. Four AcrR paralogs appeared to be, instead, each targeted by a distinct sRNA. AcrR is a one-component allosteric repressor of the genes associated with lipid transport and antibiotic resistance. AcrR contains a C-terminal ligand-binding domain and an N-terminal operator-binding region. When fatty acid ligands bind to the C-terminal domain, a conformational change in the N-terminal domain is triggered, which releases the repressed DNA and initiates transcription [64].
Among the sRNAs that were predicted to target genes involved in the biosynthesis of cellular alarmones, one might note sRNA #2 targeting two paralogs coding for diadenosine tetraphosphate (Ap 4 A) hydrolase (RS00910 and RS35245). Ap 4 A is a product of the back reaction of the amino acid activation catalyzed by some aminoacyl-tRNA [65]. While on one hand, Ap 4 A may be considered an unavoidable and toxic by-product of protein synthesis that has to be cleared from the cell, on the other hand, it can function as a signal molecule and be deeply involved in the regulation of DNA replication, cell division, and stress response [66]. Ap 4 A hydrolases catalyzes the hydrolysis of Ap 4 A into two ADP and are involved in heat shock and oxidative stress responses in bacteria by regulating the intracellular Ap 4 A concentration [67].
Regarding the asRNAs that we detected in our study, some of them are also well conserved in Streptomyces, including the subgroup transcribed complementary to mRNAs coding for several subunits of the respiratory complex I. In particular, the asRNAs targeting nuoF, nuoL2, and nuoM2 are evolutionarily conserved in streptomycetes [20]. The asRNA targeting nuoF (and the contiguous nuoE in some streptomycetes) was hypothesized to play a role as a "checkpoint" during complex I assembly [20], while the asRNAs targeting nuoL2 and nuoM2 were proposed to regulate the differential assembly of NuoL vs. NuoL2 and NuoM vs. NuoM2 into the respiratory complex I [20], similarly to what was observed in several cyanobacteria in which the multiplicity of functions assigned to cyanobacterial complex I (respiration, cyclic electron flow, and CO 2 uptake) relies upon the diversity of the NdhD (NuoM) and NdhF (NuoL) protein isoforms resulting in the occurrence of distinct complex I assemblages [68].
Noteworthy, we found a remarkable number of asRNAs having as putative targets mRNAs of genes involved in translation, ribosomal structure, and biogenesis, in addition, two asRNAs that are transcribed complementary, respectively, to nusA, coding for the transcription elongation factor NusA, and cdnL, coding for an ANTAR-containing protein.
To our knowledge, asRNAs targeting ribosomal protein (r-protein)-encoding genes have not been described so far in Streptomyces. However, a survey of small ncRNAs associated with r-protein operons in the bacterial pathogens Staphylococcus aureus, Vibrio cholerae, Salmonella enterica sv. Typhi, and Mycobacterium tuberculosis reported the presence of a total of 13 asRNAs transcribed complementary to nine r-protein mRNAs [69].
The physiological significance and the molecular mechanism by which these asRNAs can modulate the expression of the r-protein-encoding genes are currently under investigation. In all living organisms, ribosome assembly is subject to extensive feedback regulation to ensure correct ribosome manufacture in response to a variety of environmental and metabolic changes. In bacteria, despite numerous investigations, the mechanisms for the ribosome feedback regulation, and growth rate-dependent regulation of rRNA synthesis in bacteria other than the model organisms Escherichia coli remain to be elucidated.
Studies in E. coli demonstrated that r-protein synthesis is tightly regulated by numerous mechanisms including translational coupling, translation repression, or premature transcription termination [70]. In particular, in E. coli expression of more than half of the r-protein genes are controlled by 12 distinct RNA autogenous regulatory elements by which these proteins inhibit the translation of their own mRNA [71]. However, only two or three of these regulatory elements are widely distributed across many bacteria phyla raising some concerns about the full transferability of the r-protein regulation model established in E. coli to other bacteria, including the streptomycetes [71][72][73]. The discovery of asRNAs targeting so many r-protein-encoding genes in S. ambofaciens may suggest the existence of an additional regulatory level.
In addition to r-protein genes, we detected several asRNAs that are transcribed complementary to the 5 tRNAs were also detected. To our knowledge, this finding is unprecedented in Streptomyces. Moreover, it may be interesting to note that one of the 5 tRNAs recognize a specific codon for proline (CCA) that is less represented in Streptomyces CDSs with respect to other codons specifying the same amino acids (Supplementary Figure S2).
Streptomyces use rare codons to regulate their physiology. The most known example is bldA that codes for tRNA Leu (UAA) that recognizes a rare leucine codon, UUA (about 0.01% of codons). The tRNA Leu (UAA) is unnecessary for vegetative growth but is required for some aspects of secondary metabolism and morphological development [74]. In streptomycetes, about 3% of genes contain a UUA codon, which is mostly associated with either recently acquired genes by horizontal gene transfer, or genes coding for pleiotropic or pathway-specific regulators of secondary metabolism and morphological development [74]. Expression levels of these genes are strictly dependent on tRNA Leu (UAA) levels, which increase during the stringent response [74]. In S. ambofaciens, UUA codons are present in srmR (RS26685) (containing 1 UUA codon) and srmS (RS26595) (containing 3 UUA codons) genes encoding two pathway-specific regulators of the spiramycin biosynthetic cluster. It is presumable that tRNA Pro (UGG) may have a similar regulatory function on secondary metabolism and morphological differentiation. This hypothesis is supported by a considerable higher frequency of this codon in some CDSs for proteins involved in secondary metabolism and morphological development control, including bldG (RS24355) (Supplementary Figure S2). The existence of an asRNA that is transcribed complementary to the tRNA Pro (UGG) gene could add complexity to this putative regulatory level.
The association between antisense transcription and Streptomyces developmental cycle seems to be supported by the evidence of many putative asRNAs that are transcribed complementary to genes coding for regulatory proteins that oversee morphological and biochemical differentiation. In particular, two asRNAs are transcribed complementary to phoR and whiB, respectively. PhoR is the sensor kinase of the two-component PhoR-PhoP system that controls both primary and secondary metabolism in Streptomyces in response to phosphate availability [52], while the regulatory protein WhiB, which belongs to the Wbl family of proteins that are characterized by an (4Fe-4S) iron-sulfur cluster, is essential for sporulation septation. By acting in concert with WhiA, WhiB halts the aerial growth to initiate the septation event at the aerial mycelial tip, and chromosome partition into the spores [53,54]. In S. coelicolor A3(2) WhiA and WhiB function cooperatively to control the expression of a common set of genes organized in about 240 transcriptional units [54]. Both whiA and whiB transcription is subject to negative repression by BldD when this regulatory protein is associated with c-di-GMP, whose levels are sustained high during vegetative growth [75].
Some asRNAs appear to also target genes coding for proteins involved in the control of secondary metabolite biosynthesis, and, in particular, spiramycin production and resistance. An asRNA is transcribed complementary to smrR mRNA. SmrR is a key transcriptional activator of the spiramycin gene cluster [55,56]. SrmR acts by activating, in turn, the pathway-specific transcriptional activator SrmS that controls most of the spiramycin biosynthetic genes [56]. SrmR is characterized by Hsp70 superfamily and GAF superfamily domain at the N-terminus, and PucR-like helix-turn-helix domain (HTH_30) at the C-terminus. The GAF domain is a type of protein domain that was so named after its initial discovery in cGMP-specific phosphodiesterases, adenylyl cyclases, and FhlA (formate hydrogen lyase transcriptional activator). This universal domain is responsible for binding allosteric regulatory molecules such as the second messenger cyclic nucleotides cGMP and cAMP [76], although in some bacterial proteins the GAF domain was shown to contain haem [77] or a non-haem mononuclear iron center [78] enabling them to sense molecular oxygen or nitric oxide, a second messenger gaseous compound whose importance in the physiological transitions that led to morphological differentiation in Streptomyces is increasingly apparent [79]. Another asRNA is transcribed complementary to smrB mRNA smrB gene (RS26680) is adjacent to srmR (RS26685) and transcribed in convergent direction and encodes an ABC-F type ribosomal protection protein [61], SrmB, that is involved in spiramycin resistance [60]. Further work is required to understand if and to which extent these asRNAs, targeting srmR and srmB, may contribute to either regulation of spiramycin production, resistance, or both.
The availability of a catalog of sRNAs and asRNAs in S. ambofaciens led us the opportunity of analyzing their expression in particular genetic backgrounds to gain some insights on the interplay between non-coding RNA transcription, stringent/relaxed phenotype, and antibiotic production. It was previously suggested that the repressed polyketide antibiotic biosynthesis in ΩpirA may be the consequence of a direct effect of the metabolic imbalance due to the lack of PirA, a redox-sensitive modulator of beta-oxidation, leading to a modified carbon flux between glycolysis, Krebs cycle, ethylmalonyl-CoA pathway, and lipid metabolism, with an increased and deregulated flow through the beta-oxidation and compensatory lipid biosynthesis and accumulation of lipid esters [35]. This modified carbon flux may be responsible for changes in the intracellular concentration of short-chain acyl-CoA pools that are precursor monomers for polyketide antibiotic biosynthesis. Here, we provide evidence that the repressed antibiotic biosynthesis in ΩpirA may be also associated with its "relaxed" phenotype that may be an indirect consequence of the modified carbon flux. Indeed, the stringent response, which negatively controls the expression of genes involved in translation, ribosomal structure and biogenesis, amino acid and nucleotide biosynthesis, is also essential to activate morphological and biochemical differentiation in streptomycetes [80,81]. The "relaxed" phenotype of ΩpirA is suppressed in ΩpirA rpoB(R) expressing a "stringent" RNA polymerase [58,59].
Intriguingly, we found a considerable number of asRNAs I having as targets mRNAs involved in translation, ribosomal structure, and biogenesis and showed that most of these asRNAs are differently modulated in w.t., ΩpirA, and ΩpirA rpoB(R), with downregulation in ΩpirA, and upregulation in w.t., and even more in ΩpirA rpoB(R). A summarized scheme depicting the up or downregulated ncRNA involved in above mentioned processes has been reported in Figure 8. These findings open the possibility that the expression of these asRNAs may be stringently controlled, and that, in turn, these asRNAs may be involved in the negative control of genes involved in translation, ribosomal structure, and biogenesis, consistent with RNAseq data, thereby overseeing critical aspects of microbial physiology.

Spiramycin Production Assay
Spiramycin production by S. ambofaciens broth cultures was assessed by high-performance liquid chromatography-electrospray ionization-mass spectrometry (HPLC-ESI-MS) as described [83]. At different time intervals, supernatants were filtered through Phenex-RC membrane (0.45 mm; Phenomenex). Five µL of a solution of erythromycin (1 mg/mL) (Sigma-Aldrich) in 30% v/v aqueous acetonitrile plus 210 µL of acetonitrile were added to 500 µL of filtrated samples; mixture samples were vortexed, centrifuged at 4 • C for 5 min, and then 2 µL of the supernatant were injected for spiramycin determination.
The high-performance liquid chromatography (HPLC)-ESI-MS apparatus consisted of a Surveyor MS quaternary pump coupled to a Finnigan LCQ Deca XP Plus mass spectrometer (ThermoFisher, Monza, Italy) equipped with an ESI source and a quadrupole ion trap analyzer. Xcalibur software (ThermoFisher) was used for instrument control and data analysis. The spectrometer was calibrated externally with a mixture of caffeine, MRFA and Ultramark (ThermoFisher) [84]. The mass spectrometer operated in the positive-ion mode with the following settings: sheath gas, 60 units; auxiliary gas, 20 units; spray voltage, 1.5 kV; capillary temperature, 325 • C; and capillary voltage, 10 V. Mass spectra were recorded in full-scan MS in the m/z range 400-1100. Spiramycin I, II and III were resolved on a BioBasic C-18 analytical column (150 × 2.1 mm, particle size: 5 µm) (Thermo Scientific, Monza, Italy), which was eluted with 5 mM ammonium formate, pH 7.0 (solvent A) and LCMS-grade acetonitrile (solvent B), at a flow rate of 200 µL/min, applying the following gradient of solvent B: 0 min, 15%; 1 min, 15%; 15 min, 70%; 20 min, 70%; 21 min, 15%; and 25 min, 15%. To minimize source fouling during analysis, the eluent was directed to the source only during erythromycin and spiramycin elution by using the divert valve on the mass spectrometer. Calibration curves were made based on peak area ratios of analyte/internal standards vs. analyte concentrations, using 6 point calibration standards in the range of 0.1-5.0 µg/mL (0.1, 0.2, 0.5, 1.0, 2.5, and 5.0 µg/mL). The recoveries of the method, evaluated by spiking the samples with low (200 ng) and high (2 µg) levels of the analyte, were 97 and 93%, respectively.
The method limits of detection (LODs) and limits of quantification (LOQs) were determined using the samples fortified at the lower validation level. LODs and LOQs determined at S/N ratios of 3 and 10 for spiramycin were 20 and 67 ng/mL, respectively.

Antimycin Production Assay
Bacteria were cultivated in SFM broth. At different time points, 1 mL of cultivation broth was collected, lyophilized, and then dissolved in 1 mL methanol. After centrifugation at 10,000× g, the pellet was discarded, and the supernatant was collected for analysis. Antimycin was analyzed by HPLC using a System Gold programmable solvent module 125 (Beckman) equipped with a Nucleosil C8 analytical column (200 × 4.6 mm; particle size: 3 µm, pore dimension: 120 Å) (Macherey & Nagel) maintained at 25 • C, and a UV detector (350 nm). The column was eluted with water (solvent A) and methanol (solvent B), at a flow rate of 1 mL/min, applying the following gradient of solvent B: 0 min, 10%; 20 min, 100%; 34 min, 100%; and 44 min, 10%. Pure antimycin (Sigma Aldrich) was used as standard.

RNA Extraction and RNAseq Experiments
S. ambofaciens strains w.t., ΩpirA, and ΩpirA rpoB(R) were grown in YS medium at 28 • C on a rotary shaker at 200 rpm. Four different time points (48 h, 72 h, 96 h, and 120 h) were collected for each strain in biological duplicates. Total bacterial RNA was extracted from the pellets, ribosomal RNAs were depleted, and sequencing libraries prepared as already described [34]. Each library was then sequenced on a MiSeq Illumina sequencer and 76 bp paired-end reads were produced.

S. ambofaciens Transcriptome Re-definition, ncRNA Prediction and Antisense Transcripts Definition on the Main Chromosome
We adopted Rockhopper [38][39][40] to obtain w.t. S. ambofaciens transcriptome redefinition and ncRNAs prediction using paired-end strand-specific RNA-seq already published [34]. The predictions were carried out at 48 h, 72 h, 96 h, and 120 h timepoints separately and, for each of them, we obtained the annotated genes list (eventually extended for the length of their UTRs) and ncRNAs one. We then pulled together daily predictions and we merged features resulting to overlap on the same strand both for ncRNAs and annotated genes lists, to reduce small inconsistencies in start and end position definition. Finally, we pooled together the two lists giving priority to UTRs prediction in case of inconsistencies, and we evaluated the distances from the nearest element on the same strand for each feature and eventual antisense overlaps using BEDTools functions [85].
This led us to highlight some peculiarities of Rockhopper [38][39][40] predictions that could be improved: (1) The algorithm tends to split long UTRs into one UTR and one or more "predicted ncRNA", usually separated by few or even no nucleotide between them. The same happens to long ncRNAs which are split into multiple close records. Tracks inspection instead shows no difference in coverage between the two features and no gaps.
(2) There were hotspots of ncRNAs predictions in rRNAs regions and on the arms of the main chromosome. These regions result from genome inverted repetitions which are particularly uncertain for predictions as reads are forced to map to one of the possible locations resulting in false antisense transcription predictions.
For this reason, we decided to merge all the predicted features found on the same strand at 15 or fewer nucleotides from each other, to manually collapse ncRNAs located within 500 nucleotides from one other when RNA-coverage tracks show continuous signal, and to exclude ncRNAs predicted from duplicated genes/regions to produce a sound S. ambofaciens transcriptome re-annotation.

RNAseq Expression Analysis
Bowtie 2 (v2.2.6) [86] was used to align the read1 to S. ambofaciens ATCC 23877 genome (GCF_001267885.1) as already described in Pinatel and Peano, 2018 [87]. For comparability reasons, ΩpirA rpoB(R) sequences were treated using the same pipeline and annotation previously available [34,35] and relative control stats are shown in Supplementary Table S1. Raw read counts relative to newly predicted ncRNAs were instead obtained for all three strains using FeatureCounts [88], and R package DESeq2 (v1.14.1) [89] was then used to produce differential expression data for each condition in biological duplicate, normalizing the counts to the total amount of reads mapped to the main chromosome in the less covered sample.

ncRNAs Annotation and Functional Predictions
The Rfam database (http://rfam.xfam.org/) was first used to search for functional information and the degree of phylogenetic conservation of predicted S. ambofaciens sRNAs. The Rfam database is a collection of RNA families (non-coding RNA genes, structured cisregulatory elements, and self-splicing RNAs) by multiple sequence alignments, consensus secondary structures, and covariance models [90,91].
The degree of sRNAs conservation during microbial phylogeny was also analyzed by Global Automatic Small RNA Search go (GLASSgo) web server [46][47][48]. With the same method and tool, we performed a conservation analysis on ClassI asRNA. GlassGo was based on BLAST searches, pairwise identity filtering, and structure-based clustering. This tool does not allow predicting if ncRNA is expressed or not, however according to the data reported by the developers, the positive predictive value (PPV) is high (minimum of 0.85).
To predict the putative targets of each sRNA, we used Interacting RNAs (IntaRNA) web server [49,50]. This tool calculates the RNA-RNA interactions by an energy-based approach. IntaRNA predicts interacting regions between each sRNA and putative target mRNA by incorporating the accessibility of both interaction sites and the presence of a seed interaction; both features are commonly observed in sRNA-mRNA interactions [92].
COG functional classes were obtained from Conserved Domain Database (CDD) domain database https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml (accessed on 18 June 2020) as reported by Tatà et al. [35]. Functional enrichment analysis was performed on class I asRNA by counting the total number of COG classes identified, considering the same class several times if the latter mapped on targets to which more than one class was assigned. Similarly, class II and III asRNAs (comprising two overlapping transcripts) were considered twice. The unclassified genes (without COG classification) were excluded from the analysis.

Gene Expression Analysis
The first read of each pair was aligned to S. ambofaciens genome (GCF_001267885.1_ASM 126788v1_genomic.fna) according to the pipeline previously published [87]. The scripts are also available at https://github.com/epinatel/Bacterial_RNAseq/blob/master/RNAseq_ analysis_pipeline.txt, accessed on 22 June 2021. All the tools' versions correspond to those provided in the paper. Gene level counts obtained by the pipeline, adopting as reference S. ambofaciens annotation table published in [34] (datasheet 1), were normalized with factor sizes estimated by DESeq2 on the entire batch of data (3 genotypes and 4 timepoints). MasigPro analysis was run with default regression parameters (α = 0.05; Q = 0.05) and forward step regression model was adopted. Significant genes lists were obtained by imposing R-square = 0.7, in order to select genes correctly fitting the model as suggested by Conesa et al. (2006) [94].

ncRNAs Differential Expression
The first read of each pair was aligned to S. ambofaciens genome as previously described for gene expression and FatureCounts (Subread v2.0) was adopted to obtain read counts based on the annotation reported in Supplementary Tables S4-S7. The following command line was used: featureCounts -a <ncRNA.saf> -o <ncRNA_counts> -F SAF -t gene -s 1 -T 15 -p -B -C -J -G <GCF_001267885.1_ASM126788v1_genomic.fna> --fracOverlap 0.5 <BAM file list> DESeq2 version 1.26.0 in R 3.6.3 with default parameters was run to obtain differentially expressed ncRNAs, manually defining the size factors according to the ones estimated on the entire transcriptome and already used for standard gene expression analysis.