1. Introduction
Rhizobium leguminosarum bv.
trifolii is a soil bacterium that establishes nitrogen-fixing symbiosis with clover (
Trifolium spp.). However, in the absence of compatible host plants, this micro-organism must often exist for a long period of time in the soil, where it is exposed to various environmental conditions. To adapt to these conditions, rhizobia have developed a wide range of strategies which allow them to survive in the soil. Among these adaptations, composition of the bacterial envelope, synthesis of surface polysaccharides, cell motility, and quorum sensing play the most important roles [
1,
2].
Recently, it was established that a regulatory protein encoded by the
rosR gene is essential for the adaptation of
R. leguminosarum to stress conditions, affecting the expression of many genes related to the synthesis of surface polysaccharides and cell-surface components, secretion of extracellular proteins, motility, and other cellular processes [
3,
4,
5]. This gene encodes a 15.7 kDa global transcriptional regulator, which contains a Cys2-His2-type zinc finger motif responsible for binding to RosR-box motifs located in the promoter regions of the regulated genes [
4,
6]. The amino acid sequence of this motif revealed similarity to the Cys2-His2-type zinc finger motif, which is primarily found in DNA-binding proteins of eukaryotic origin [
7,
8,
9]. In contrast to eukaryotic transcription factors (TFs), which typically contain tandem arrays of such Cys2-His2 zinc fingers, bacterial proteins, including the rhizobial RosR/MucR family, possess only one Cys2-His2-type zinc finger motif located in their C-terminal region.
RosR of
R. leguminosarum shares significant similarity with Ros of
Rhizobium etli and
Agrobacterium tumefaciens, and with MucR of
Sinorhizobium meliloti and
Sinorhizobium fredii [
10,
11,
12,
13,
14]. Mutations in
rosR and
mucR affect rhizobia both in the free-living state and during symbiosis with their host plants. In
S. meliloti, MucR regulates the expression of genes involved in the synthesis of extracellular polysaccharides (EPS) (succinoglycan and galactoglucan), which are required for effective symbiosis with its macrosymbiont, alfalfa [
15]. Thus, in this bacterium, inactivation of
mucR abolishes the synthesis of succinoglycan and reduces cell motility, which in consequence leads to disturbances in symbiosis [
12,
16,
17]. Also in
S. fredii, which shows a broad host range for nodulation, mutation in
mucR1 leads to a severe decrease in EPS biosynthesis, increased cell aggregation, and a drastic reduction in the nitrogen fixation capacity on
Glycine max and
Lotus burttii [
13,
18,
19]. At least in this species, MucR1 is also required for the transcriptional activation of ion transporters engaged in bacterial nitrogen fixation in soybean nodules [
14]. Similarly, the
Rhizobium etli Ros protein is essential for EPS production, as well as for colonization and infection of its host plant,
Phaseolus spp. [
10]. In addition, a
rosR mutant strain of
R. leguminosarum produces a significantly smaller amount of EPS than the wild type, is more sensitive to surface-active compounds, exhibits decreased motility, and induces nodules on clover roots with a long delay; these nodules are ineffective in nitrogen fixation [
4,
5].
Previously, we established that the
R. leguminosarum rosR open reading frame (ORF) possesses a 403 bp long upstream region containing several regulatory motifs [
20,
21]. Transcription of this gene was found to be driven by two promoters of different strengths: a strong distal promoter, P1, and a much weaker proximal promoter, P2 [
6]. It was reported that P1 functions as the main promoter, and in addition to the −35 and −10 hexamers recognized by the RNA polymerase (RNAP) with the σ70 subunit, it contains two other regulatory elements, an Upstream Promoter (UP) element [
22,
23] and a 3–4 bp long extension of the −10 element (TGn-extended −10 element), which in other bacteria are known to play a significant role in the initiation of transcription. We have confirmed that the AT-rich UP element located upstream of the −35 hexamer ensures a high level of expression of this gene from P1 [
20]. The UP elements are recognized by the α subunits of the RNAP, whereas the 3–4 bp long extended −10 elements, which usually contain the dinucleotide TG and are located immediately upstream of the −10 hexamer, are recognized by the σ subunit of this enzyme [
23,
24,
25,
26,
27]. The expression of
rosR driven by P2 is ~10-fold lower than that from P1. Downstream of the TGA stop codon of
rosR, a
rho-independent transcriptional terminator is located, which contains 12 nt long inverted repeats (IR) and forms a very stable stem structure.
It was reported that RosR recognizes and binds to a 22 bp RosR-box motif located downstream of
rosR promoters and negatively regulates transcription of its own gene [
6]. Moreover, binding sites for cAMP receptor protein CRP (cAMP-CRP) and Pho-boxes recognized by an activator PhoB, engaged in a positive regulation of
rosR expression under carbon source and phosphate limitation, respectively, and a LysR motif (T-N11-A)3 recognized by proteins of the LysR family, were identified in the
rosR upstream region [
20,
21]. In addition, a few motifs containing inverted repeats of different lengths (named IR1–IR6) were found in this region. Among them, IR5 contained the longest, 12 bp, inverted repeats. Up to now, only the role of IR4 in
rosR expression was experimentally confirmed [
21]. This motif was engaged in the negative regulation of
rosR transcription, since its deletion or mutations resulted in considerably increased expression of the gene.
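The notion of an inverted-repeat motif used above can be made concrete with a short sketch. The Python code below scans a DNA string for perfect inverted repeats, i.e., a left arm followed, after a short spacer, by its reverse complement. It is only an illustration under stated assumptions: the function names, the arm length and spacer limits, and the toy sequence are hypothetical and are not taken from the rosR upstream region, and this is not the procedure that was used to identify IR1–IR6.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq: str, arm_len: int = 6,
                          min_gap: int = 0, max_gap: int = 10):
    """Find perfect inverted repeats with arms of a fixed length.

    Returns a list of tuples (left_start, right_start, left_arm, gap),
    using 0-based positions. Arm length and spacer limits are
    illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for i in range(n - 2 * arm_len - min_gap + 1):
        left_arm = seq[i:i + arm_len]
        target = reverse_complement(left_arm)
        for gap in range(min_gap, max_gap + 1):
            j = i + arm_len + gap
            if j + arm_len > n:
                break  # right arm would run past the end of the sequence
            if seq[j:j + arm_len] == target:
                hits.append((i, j, left_arm, gap))
    return hits


if __name__ == "__main__":
    # Toy sequence invented for this example (not a rosR sequence).
    toy = "TTGACCGTAAAAACGGTCTT"
    for left, right, arm, gap in find_inverted_repeats(toy):
        print(f"arm {arm} at {left}, partner at {right}, spacer {gap} nt")
```

A real analysis would normally also tolerate mismatches within the arms and rank candidate stems by predicted stability, which this sketch does not attempt.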
In this work, we examined the significance of several motifs identified in the upstream region of R. leguminosarum bv. trifolii rosR in the regulation of its own transcription. For this purpose, a set of rosR-lacZ transcriptional fusions containing mutations in these motifs was constructed. The levels of rosR transcription in different mutant variants in R. leguminosarum were established using both quantitative real-time PCR and β-galactosidase activity assays. The stability of wild type rosR transcripts and those containing mutations in IR motifs located in the 5′-untranslated region of this gene was determined using an RNA decay assay. In addition, the influence of chosen rhizobial regulatory proteins (the histidine kinase ChvG of a two-component regulatory system, and the activator CinR and the repressor PraR, both involved in quorum sensing) on the expression of rosR was studied.
4. Discussion
In this study, we showed that transcription of the
R. leguminosarum rosR gene is subject to complex regulation, in which several
cis-regulatory elements and a previously unidentified
trans-acting factor are engaged. Mutational analysis of regulatory motifs identified in the
rosR upstream region confirmed a significant role of some of these elements in the modulation of transcription and/or transcript stability of this gene. In general, transcription of
rosR was high (see transcriptional activity of
rosR in the wild type fusion pEP1 in
Figure 4 and
Figure 6), and the main promoter P1 was responsible for this effect (see transcriptional activity of
rosR provided by UP and P1 in pEP14 in
Figure 6).
According to the definition of Gottesman, RosR is a global regulator, based on its pleiotropic phenotype and its ability to regulate operons associated with different metabolic pathways [
54]. In fact, RosR directly or indirectly affects a large group of
R. leguminosarum genes (1106), with the majority of them negatively regulated, indicating that RosR functions mainly as a repressor. These genes are associated with the synthesis of cell-surface components, envelope biogenesis, motility, transport and metabolism of carbohydrates and nitrogen sources, and other cellular processes, such as signal transduction and transcription regulation [
4,
55]. The RosR-box motifs identified in the promoter regions of genes directly regulated by RosR shared low similarity with the RosR-box consensus. It is well known that the regulatory effect of individual TFs depends on their concentration and their affinity for binding sites; to function, weak sites (i.e., sites with low sequence similarity to the consensus) require high concentrations of TFs, whereas strong binding sites work at lower TF concentrations [
56]. In contrast to local TFs that tend to have high-affinity sites, global TFs are less specific, bind to a larger collection of sites, and therefore, must be expressed at higher levels [
57,
58]. This is in agreement with our observation of a high level of
rosR transcription. This effect was associated with the activity of the P1 promoter, which, in addition to the two core elements (−35 and −10 hexamers) recognized by RNAP with the σ70 subunit, contains two
cis-regulatory elements (UP and TGn-extended −10) that are present only in a small number of bacterial promoters [
59]. Three domains of RNAP σ70 are responsible for recognizing and binding the −10 hexamer (domain 2), −10 extension (domain 3), and −35 hexamer (domain 4), whereas C-terminal domains of the two RNAP α subunits can interact with the UP element located upstream of the −35 region. UP elements characterized in
E. coli promoters are ~30 bp A/T-rich sequences that contain two regions, distal and proximal, and their presence can stimulate transcription up to 300-fold, depending on the gene studied [
23,
24,
25,
60]. Several studies reported that UP elements enhance the transcription of downstream genes, although so far they have been described only for a small number of bacterial promoters, mainly in
E. coli (such as P1 of the rRNA
rrnB and the
guaB promoter required for the
de novo synthesis of GMP) but also in other bacteria, such as
Bacillus subtilis [
23,
24,
25,
61,
62,
63,
64]. Moreover, the presence of UP in promoters served by RNAP containing σ70 might reduce their dependence on the consensus of the −10 or −35 elements [
27]. To the best of our knowledge,
R. leguminosarum rosR P1 is the first described example of a rhizobial promoter containing UP and TGn-extended −10 elements [
23,
24,
25,
26,
27].
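The relationship between binding-site strength and the TF concentration needed for regulation, mentioned above, can be illustrated with the simplest equilibrium model of a single, non-cooperative binding site, in which fractional occupancy equals [TF] / ([TF] + Kd). The Python sketch below is an assumption-laden illustration only: the model, the function names, and the dissociation constants are hypothetical and do not represent measured RosR parameters.

```python
def occupancy(tf_conc: float, kd: float) -> float:
    """Fractional occupancy of one site under simple equilibrium binding.

    Assumes a single, non-cooperative site; tf_conc and kd share the same
    (arbitrary) concentration units.
    """
    return tf_conc / (tf_conc + kd)


def conc_for_occupancy(theta: float, kd: float) -> float:
    """TF concentration needed to reach a target occupancy theta (0 < theta < 1)."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return theta * kd / (1.0 - theta)


if __name__ == "__main__":
    # Hypothetical dissociation constants in arbitrary units:
    # a strong (consensus-like) site and a 100-fold weaker site.
    strong_kd, weak_kd = 1.0, 100.0
    for label, kd in (("strong", strong_kd), ("weak", weak_kd)):
        needed = conc_for_occupancy(0.9, kd)
        print(f"{label} site: TF needed for 90% occupancy = {needed:.1f}")
```

Under this model, the concentration required for a given occupancy scales linearly with Kd, which is consistent with the argument that a regulator acting on many weak sites needs to be expressed at a high level.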
In this study, we performed mutational analyses of 12 sequence motifs located in the
rosR upstream region, including the UP and TGn-extended −10 elements, to examine their role in the transcription of this gene.
rosR expression is driven by two promoters, the strong P1 promoter and the very weak P2 promoter. The fact that the majority of the motifs studied are located immediately upstream or even inside P1 most probably means that their mutations mainly affect P1-driven
rosR expression. Based on the results obtained for different variants of this region (plasmids pM1–pM12), we observed that the UP present in the
rosR P1 promoter was longer (46 nt) than those characterized in
E. coli (30 nt), and that it was functional in both
R. leguminosarum and
E. coli backgrounds, since mutations of both the 5′- and 3′-regions of this element (pM1 and pM3, respectively) considerably reduced
rosR transcription in both bacteria (3-fold in
E. coli and 10-fold in
R. leguminosarum) (
Figure 4). This confirmed that the UP element plays an essential role in the stimulation of
rosR expression from the P1 promoter. However, alteration of the 5′-region of the IR1 element located in UP (pM2) had the opposite effect, resulting in a ~2-fold increase in
rosR transcription in both tested bacteria (
Figure 3 and
Figure 4). To elucidate whether the changes introduced in the sequence of this motif might have resulted in the appearance of an additional promoter sequence, bioinformatics analyses of the
rosR upstream region containing the changed IR1 5′-end were performed; the analyses did not reveal any such new sequences. Therefore, we propose that the alteration of this sequence may have contributed to a stronger interaction of RNAP α subunits with this regulatory region or may be connected with the loss of either a repressor-binding site or a silencing region, which would normally attenuate
rosR expression from P1. In contrast, mutation of IR2 (pM3 and pM4) had a negative effect on
rosR transcription. Taken together, these results indicated that not only the A/T-rich composition of UP, but also its local structural organization and sequence might be important for transcription initiation, and that IR1 and IR2 exert a negative and positive effect on
rosR expression, respectively. Surprisingly, we did not observe a pronounced effect of the mutation in the TGn-extended −10 element (plasmid pM6) (
Figure 3 and
Figure 4). Depending on the approach used, either a weak negative effect (β-galactosidase assay) or no effect (RT-qPCR) was observed in
R. leguminosarum. The effect of this mutation was more pronounced in
E. coli (a 4-fold decrease of
rosR transcription in pM6 in relation to the wild type pEP1). These data indicated that the 3 bp long −10 extension does not play as important a role in the transcription of this gene in
R. leguminosarum as has been described for some
E. coli genes [
58,
59,
60]. However, a mutation in the IR3 3′-region located just upstream of the −10 extension (pM5) resulted in a strong positive effect in both genetic backgrounds tested, confirming that IR3 also plays a negative role in
rosR transcription. In this case too, in silico analyses excluded the occurrence of a new, additional promoter in this region. Thus, one possible explanation of this phenomenon might be the loss of a site recognized by a repressor, which would be conserved between α- and γ-Proteobacteria. Moreover, the possibility of the loss of a silencing region that normally attenuates
rosR expression from P1 to allow regulation via the downstream promoter P2 cannot be excluded. Another possibility is that the observed elevated
rosR transcription from pM5 is associated with a putative interaction of RNAP σ70 domains with the IR3 3′-region, which might be stronger in the case of its mutated version than in the wild type sequence. Our results suggest that, apart from the −10 extension, a short sequence (7 bp) adjacent to this motif might be engaged in the RNAP σ70–rosR promoter binding, and that at least in some cases, e.g.,
rosR, the −10 extension might be longer than 3 bp. The fact that the mutations in pM5 and pM2 similarly affected
rosR expression in both
R. leguminosarum and
E. coli suggests that such element(s) may be conserved in both bacteria. Moreover, mutations introduced in IR1–IR3 most probably do not affect the
rosR mRNA stability since these motifs are located upstream of the transcription start sites. Similarly, a highly complex mechanism of action was detected for other bacterial global regulatory proteins. For example, fumarate-nitrate reduction regulator Fnr was found to play a dual role in the regulation of
arcA, which encodes an aerobic respiratory control protein in
E. coli, depending on the growth conditions tested (anaerobiosis/aerobiosis) [
65,
66]. This gene possesses a long non-coding upstream region (530 bp) containing five promoters recognized by RNAP σ70, and Fnr can function either as an activator from the distal arcAp1 promoter or as a repressor from the arcAp3 promoter by binding to the same Fnr-box sequence in this region (−284 bp).
In this study, we also established the role of the IR5 and IR6 motifs, and the RosR-box, located downstream of the transcription start sites TS1 and TS2, in the formation and stabilization of
rosR RNA secondary structures. In silico sequence analysis of the wild type and mutated versions of
rosR transcripts indicated that these sequence motifs impacted the RNA secondary structure. Based on the results obtained using plasmids pM7–pM12, we experimentally confirmed that mutations of both 5′- and 3′-parts of IR5 and IR6 negatively affected
rosR transcription in
R. leguminosarum and
E. coli (
Figure 3 and
Figure 4). The greatest reduction was observed for mutations within IR5 (in the rhizobial background). RNA decay analysis performed in
E. coli using plasmids pQM1 and pQM7–pQM12 confirmed that wild type
rosR transcripts were very stable in bacterial cells, and that IR5 located at the 5′-end of the
rosR mRNAs plays the most essential role in their synthesis and protection against degradation (
Figure 5). In contrast, the IR6 motif decreased the stability of the transcript since its inactivation increased the half-life of
rosR mRNA. Thus, our data indicated that IR5 and IR6 play opposite roles in the stability of
rosR transcripts. Moreover, considerably higher amounts of
rosR transcripts were detected in the case of plasmids pQM11 and pQM12, which harbor mutations within the RosR-box, than in the wild type pQM1. This suggested that (i) RosR was effectively synthesized in
E. coli, and (ii) this rhizobial protein was functional in
E. coli (i.e., recognized the wild type RosR-box and negatively regulated the transcription of its own gene). However, this motif did not play such an important role in the stability of
rosR transcripts as the IR5 and IR6 motifs.
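For readers unfamiliar with how a transcript half-life is typically derived from an RNA decay time course, the Python sketch below fits a first-order decay model, N(t) = N0·exp(−k·t), by linear regression on log-transformed abundances and reports t½ = ln 2 / k. This is a generic illustration: the first-order assumption, the function name, and the synthetic data points are assumptions made here and are not the authors' analysis or their measurements.

```python
import math


def estimate_half_life(times_min, abundances):
    """Estimate mRNA half-life (in minutes) assuming first-order decay.

    times_min:  sampling times after transcription is blocked (minutes)
    abundances: relative transcript levels at those times (all > 0)

    Fits ln(abundance) = ln(N0) - k * t by ordinary least squares and
    returns ln(2) / k.
    """
    if len(times_min) != len(abundances) or len(times_min) < 2:
        raise ValueError("need at least two matched time/abundance points")
    log_levels = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(log_levels) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_min, log_levels))
    decay_rate = -sxy / sxx  # k, per minute
    if decay_rate <= 0:
        raise ValueError("abundance does not decrease over time")
    return math.log(2) / decay_rate


if __name__ == "__main__":
    # Synthetic time course generated with a 10 min half-life
    # (invented values, not experimental data).
    times = [0, 5, 10, 20]
    levels = [100 * 0.5 ** (t / 10) for t in times]
    print(f"estimated half-life: {estimate_half_life(times, levels):.2f} min")
```

With such a fit, a mutation that stabilizes a transcript appears as a smaller decay constant k and therefore a longer half-life, and a destabilizing mutation as the reverse.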
As reported by Pratte and Thiel, for several genes in the
nif cluster (
nifB1,
nifS1,
nifH1,
nifE1,
nifD1,
nifU1,
nifK1,
nifN1,
nifX1,
hesA1, and
fdxH1) encoding proteins involved in nitrogen fixation in the cyanobacterium
Anabaena variabilis [
47], the half-lives of individual
nif mRNAs are very different, ranging from as high as 33 min (for
nifH1) and ~20 min (for
nifD1,
hesA1 required for efficient nitrogen fixation, and
fdxH1 encoding the [2Fe-2S] ferredoxin that is an electron donor to nitrogenase, a key enzyme in nitrogen fixation), to as low as ~8 min (
nifE1 and
nifU1). In comparison to these data, the
R. leguminosarum rosR transcript is among those with a long lifetime. These authors also showed that the degradation patterns of these mRNAs differed markedly, confirming that this is a specific property of each individual gene’s mRNA. The structural organization of promoters of these
nif genes and their transcript stability proved to be important for their abundance and life-time in the cell. Similarly to our findings for IR5, they reported that the stem-loop structure upstream of
nifH1 controlled the abundance of
nifH1 mRNA through transcript processing and stabilization [
48]. Stem-loops stabilize transcripts when they are at the extreme 5′-end of the transcript, since these double-stranded structures prevent mRNA recognition by 5′-exonuclease [
67,
68].
In this work, we also examined whether some regulatory proteins that play an important role in processes such as quorum sensing (CinR and PraR) or signaling (ChvG) in rhizobia affect
rosR transcription [
29,
31,
33,
53]. Based on the results obtained for the wild type
R. leguminosarum and its
chvG,
cinR, and
praR mutant strains, we found that PraR may act as a
trans-regulatory factor able to repress
rosR expression (
Figure 6). In fact, a sequence with high similarity to the PraR-binding site was identified downstream of the P1 −10 motif. However, further studies are required to demonstrate a direct effect of PraR on the
R. leguminosarum rosR expression. Inhibition of
rosR expression by PraR in
E. coli, which lacks a PraR ortholog, might provide evidence for the direct interaction. We plan to perform these types of studies in the near future. Recently, Frederix and others [
53] characterized a consensus sequence recognized by PraR, and reported that this TF effectively binds to the promoters of
rapA2,
rapB, and
rapC (which encode adhesins),
plyB,
rosR, and its own promoter, and negatively regulates transcription of these genes. PraR is important for biofilm formation both in vitro and on plant roots, i.e., during a step that precedes the initiation of rhizobial infection of legume roots. Mutation in
praR enhanced root biofilms and improved nodulation competitiveness of the bacterium, most probably by increasing the expression of genes coding for proteins involved in bacterial attachment to host root surfaces. All these data indicate that RosR and PraR are important elements of the rhizobial regulatory network, which is required by the bacteria to constantly monitor the extracellular physiological conditions and to respond by modifying their gene expression patterns to adjust their growth [
69,
70].