1. Introduction
Research over the last 15 years has shown that intercellular communication is common in bacteria and regulates gene expression through cell density-dependent signaling, referred to as “quorum sensing” (QS) [
1]. Gram-negative bacteria most commonly use
N-acyl homoserine lactones (AHLs) as signal molecules; AHLs are synthesized by LuxI-family proteins and at high concentration (
i.e., high cell density) they bind to a cognate LuxR-family regulatory protein, which consequently binds target gene promoters. There is a class of LuxR-family proteins that have the typical modular structure of QS LuxRs but do not possess a cognate LuxI AHL synthase; these LuxR proteins have been called orphans or solos [
2,
3]. A sub-family of LuxR solos of Plant Associated Bacteria (PAB) has recently been shown to be part of a novel interkingdom signaling circuit, involved in communication between plants and both pathogenic and beneficial bacteria [
4,
5]. It is likely that this sub-family of LuxRs of this interkingdom signaling circuit shares structural and functional similarities with the archetypical LuxI/R QS systems [
6]. These PAB LuxR solos bind and respond to plant signals and probably have undergone coevolution with the host plant.
Five members of this PAB LuxR solos subfamily have been studied: XccR of
Xanthomonas campestris pv. campestris (Xcc), OryR of
Xanthomonas oryzae pv.
oryzae (Xoo), PsoR of
Pseudomonas fluorescens, XagR of
Xanthomonas axonopodis pv.
glycines (Xag) and NesR of
Sinorhizobium meliloti [
2,
7–
11]. With the exception of NesR, all have been demonstrated to respond to as yet chemically uncharacterized low molecular weight signal molecules synthesized by the host plant, with the final outcome of regulating crucial aspects of plant-bacteria interactions. Namely, OryR of the rice vascular pathogen Xoo is involved in virulence; it responds to plant signals, since the protein is solubilized and activates the expression of the neighboring
pip and of motility genes only in the presence of plant extracts [
4,
7,
9]. XccR of the crucifer pathogen Xcc also responds to an as yet unidentified plant compound and regulates the neighboring
pip gene; the presence of the plant extracts allows XccR to bind to the
pip promoter
in vitro [
8]. XagR of Xag, which causes bacterial leaf pustule on soybean (
Glycine max) is also involved in virulence [
11]. As for XccR in Xcc, XagR in Xag also activates
pip transcription
in planta and temporal studies have indicated that
pip transcription increases gradually after infection, reaching its greatest activity after 72 h, before slowly decreasing. PsoR responds to plant compounds from different plant species and plays a role in biocontrol by rhizospheric
Pseudomonas fluorescens via transcriptional regulation of various anti-microbial-related genes [
10]. NesR of
Sinorhizobium meliloti is important for survival under stress and utilization of various carbon sources; the response to plant compounds has not yet been addressed [
2].
Nevertheless, being closely related to the QS LuxRs, this sub-family of PAB LuxR solos shares the same overall protein architecture, comprising two functional domains. In particular, members of the QS LuxR family are mainly composed of an
N-terminal ligand-binding domain (the regulatory domain) [
12,
13] and a
C-terminal helix-turn-helix DNA-binding domain [
14,
15], joined together by a short linker region. In QS systems, a conformational change is induced upon binding of the regulatory domain to the cognate AHL, most commonly then allowing the recognition of specific promoter regions by the DNA-binding domain and leading to transcriptional activation [
16,
17]. Indeed, binding to the AHL is responsible for stability, correct folding [
16] and most commonly dimerization, which in turn stabilizes the transcription factor, allowing DNA binding [
18].
Surprisingly, conservation of primary structure among LuxR-family proteins is quite low (18%–25%); however, multiple sequence alignments have identified nine highly conserved residues (
Figure 1): six of these residues delineate the cavity of the ligand-binding domain (W57, Y61, D70, P71, W85 and G113, according to TraR numbering) and the remaining three are located within the DNA-binding domain (E178, L182 and G188) [
19–
22]. On this basis, in a recent review Gonzales and Venturi [
4] pointed out that members of the PAB LuxR solos subfamily show substitutions in one or two of these highly conserved amino acids in the regulatory domain, namely W57M and Y61W, thus suggesting an involvement of these residues in the selectivity of this subfamily towards specific host plant signal molecules rather than AHLs.
Considering that the experimental three dimensional structures of several members of the LuxR family [
23–
30] show a well-conserved overall fold, with the regulatory domain composed of five anti-parallel β-sheets flanked by three α-helices on each side, our aim here is to: (i) validate and extend the previous analysis of the LuxR family, based on primary sequence alignment, in order to dissect the structural determinants involved in ligand recognition; and (ii) extend the outcomes of the detailed molecular cartography to the PAB LuxR solos subfamily in order to identify the molecular determinants responsible for the different ligand selectivity of this subfamily.
In the present study we take advantage of the large body of experimental structural data available for several members of the LuxR-family in complex not only with cognate AHLs, but also with unrelated signaling molecules [
23–
30], focusing on structure-based sequence alignment, structural superimposition and a comparative analysis of the contact residues involved in ligand binding; this should allow the identification of the key residues characterizing the ligand-binding sites. Moreover, in the absence of experimentally determined structures of members of the PAB LuxR solos subfamily, the homology model of its prototype, OryR, is expected to provide us with sufficient information to gain insights into the architecture of its ligand-binding site, as well as to elucidate the likely structural basis of the reported different ligand selectivity between the PAB LuxR solos subfamily and the canonical QS LuxR receptors.
2. Results and Discussion
Structure-based multiple sequence alignment of the regulatory domains of all the QS LuxRs for which experimental three-dimensional structural information (obtained by X-ray crystallography or by NMR spectroscopy) is available (
i.e., TraR from
Agrobacterium tumefaciens and from
Sinorhizobium fredii NGR234, LasR and QscR from
Pseudomonas aeruginosa, CviR from
Chromobacterium violaceum and SdiA from
Escherichia coli) was performed by Expresso [
31].
Figure 1 shows the structure-based multiple sequence alignment, which has a main score (the total consistency value) of 71 (100 being full agreement between the alignment and its associated primary library, computed as the first step of the consistency-based protocol used by Expresso), although the overall level of sequence identity or homology is quite low according to the calculated consensus sequence. It is interesting to note that even though the individual scores are 74 for LasR, 74 and 77 for the
A. tumefaciens TraR and for its homolog from
S. fredii NGR234, respectively, 76 for QscR, 56 for SdiA and 71 for CviR, the regions encompassing residues 21–132 (TraR numbering) are characterized by an even higher level of consistency, which has been proposed to reflect a higher level of accuracy [
31].
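The pairwise sequence identity mentioned above can be illustrated with a minimal sketch. It assumes two already aligned sequences of equal length and counts identical residues over the columns where neither sequence has a gap; the function name and the toy fragments are hypothetical and are not taken from Figure 1 or from Expresso.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments (hypothetical, for illustration only).
seq_a = "MKW-YLDA"
seq_b = "MRWGY-DS"
print(round(percent_identity(seq_a, seq_b), 1))
```

Other conventions exist for the denominator (for example, the length of the shorter sequence), so reported identity values depend on the definition chosen.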
The regulatory domains of all the QS LuxR complexes in the PDB database [
32] have been analyzed using PyMOL [
33] and the Protein Interfaces, Surfaces and Assemblies (PISA) interactive tool for the exploration of macromolecule-ligand interfaces at the European Bioinformatics Institute [
34]. The results have been summarized in
Figure 1 and will be discussed using TraR numbering as a reference.
It is interesting to note that most of the residues involved in ligand binding (see
Figure 1, in bold and colored in red) are conserved in all the QS LuxR complexes and seem to be invariant regardless of the chemical nature of the ligand (AHLs, chloro-lactones, triphenyl ligands), as observed in the LasR complexes. This finding supports the strategy of dissecting the cartography of the ligand-binding sites of QS LuxRs in order to gain insight into the structural basis of PAB LuxR solos specificity.
Previous studies have suggested, based on multiple sequence alignment of QS LuxR transcriptional regulators, that six conserved hydrophobic/aromatic residues of the regulatory domain,
i.e., W57, Y61, D70, P71, W85 and G113, delineate the binding site [
19–
22]. The present structure-based multiple sequence alignment validates the above-mentioned six residues, points to three additional conserved residues of the QS LuxR family regulatory domain,
i.e., Y53, A105 and G109 (identified by a star in
Figure 1) and clearly indicates that residues P71, G109 and G113, although located very close to the binding site, are not directly involved in ligand binding. Furthermore, four additional residues, which are not fully conserved but have similar physico-chemical properties (identified by a semicolon in
Figure 1), directly interact with the ligands in all the analyzed complexes, bringing the total number of ligand-contacting residues to 10.
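The distinction between fully conserved positions and positions occupied by residues with similar physico-chemical properties can be sketched as follows. The residue variants are those reported in the text (TraR numbering); the grouping of amino acids into physico-chemical classes is an illustrative assumption and is not the scheme used by Expresso.

```python
# Illustrative physico-chemical classes (an assumption, not the Expresso scheme).
GROUPS = [
    set("AVLIMFWY"),  # hydrophobic / aromatic
    set("ST"),        # small hydroxyl-bearing
    set("DE"),        # acidic
    set("KRH"),       # basic
    set("NQ"),        # amide
]


def classify_column(residues: str) -> str:
    """Label an alignment column as 'conserved', 'similar' or 'variable'."""
    observed = set(residues)
    if len(observed) == 1:
        return "conserved"
    # 'similar' means every observed residue falls into one common class.
    if any(observed <= group for group in GROUPS):
        return "similar"
    return "variable"


# Residue variants at the ten ligand-contacting positions, as listed in the text.
columns = {
    "53": "Y", "57": "W", "61": "Y", "70": "D", "85": "W", "105": "A",
    "73": "VL", "101": "FL", "110": "ILM", "129": "TS",
}
for position, residues in columns.items():
    print(position, classify_column(residues))
```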
Besides the 10 residues,
i.e., Y53, W57, Y61, D70, W85, A105; V/L73, F/L101, I/L/M110 and T/S129, in a number of QS LuxRs complexed with AHLs (
i.e., TraR from
A. tumefaciens in complex with OC8-HSL (PDB_ID 1L3L [
23]), TraR from
S. fredii NGR234 in complex with OC8-HSL (PDB_ID 2Q0O [
25]); LasR from
P. aeruginosa in complex with OC12-HSL (PDB_ID 2UV0 [
26], PDB_ID 3IX3), QscR from
P. aeruginosa in complex with OC12-HSL (PDB_ID 3SZT [
28])), a water molecule is present at the ligand-binding sites, mediating protein-AHL interactions through bridging hydrogen bonds.
From a structural perspective, it is interesting to note that not only the C
α positions but also the side-chain orientations of all the six conserved residues W57, Y61, D70, P71, W85 and G113 (hereafter called Cluster 1, highlighted by a star in
Figure 1 and colored in green in
Figure 2) superimpose in all the analyzed structures rather well (
Figure 2b). However, only W57, Y61, D70 and W85 are directly involved in ligand binding (see
Figures 1 and
2a). Residues P71 and G113 are located close to the binding site (
Figure 2a) and are likely involved in the proper side chain orientation of D70 and W85, respectively. In this respect it is worth noting that in all of the analyzed structures, residues P71 and G113 adopt a
trans and
cis peptide conformation, respectively.
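The cis/trans assignment of a peptide bond can be illustrated by computing the ω dihedral angle defined by the atoms Cα(i−1), C(i−1), N(i) and Cα(i). The sketch below is a generic dihedral calculation in plain Python; the 30° tolerance around 0° (cis) and 180° (trans) is an assumed cutoff, and the coordinates in the usage lines are idealized toy values, not taken from any of the analyzed structures.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def peptide_conformation(ca_prev, c_prev, n, ca, tolerance=30.0):
    """Classify a peptide bond from its omega angle (tolerance is an assumption)."""
    omega = abs(dihedral(ca_prev, c_prev, n, ca))
    if omega <= tolerance:
        return "cis"
    if omega >= 180.0 - tolerance:
        return "trans"
    return "distorted"


# Idealized toy geometries: both C-alpha atoms on the same side, then on opposite sides.
print(peptide_conformation((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(peptide_conformation((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```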
The present analysis reveals that the regulatory domain of the QS LuxR family includes, besides Cluster 1, an additional cluster of residues, namely V72, V73, F101, A105, I110 and T129 (hereafter Cluster 2, colored in cyan in
Figure 2) that is reasonably conserved and also directly involved in ligand binding (see
Figures 1 and
2c). In all of the analyzed structures, the C
α positions and the side chain orientations superimpose rather well, as shown in
Figure 2d.
Beyond these two conserved clusters, the residues A49, Y53, Q58 and F62 (hereafter Cluster 3, colored in orange in
Figure 2), represent a less conserved cluster (see
Figures 1 and
2e) within the regulatory domain of the QS LuxR family. Apart from Y53, which is conserved in a number of members of the LuxR family (
Figure 3), the residues A49, Q58 and F62 are highly substituted. Nevertheless, the C
α positions and the side-chain orientations of these residues superimpose rather well in all of the analyzed structures (
Figure 2f).
To extend this detailed molecular cartography of the regulatory domain of the QS LuxR family to the PAB LuxR solos subfamily [
35], OryR, the prototype of this subfamily, has been modeled and structurally aligned, based on secondary structure prediction, using I-TASSER [
36] (see
Figure 3).
The resulting homology model made it possible to inspect the architecture of its ligand-binding site and to map the residues belonging to the three clusters, pinpointing the molecular determinants that are responsible for the observed differences in the ligand selectivity of this subfamily compared to QS LuxRs.
Mapping Cluster 2 residues on the regulatory domain of OryR shows that residues V72 and T129 are conserved, whereas residue F101 is substituted by L (as in QscR) and I110 is substituted by M (as in CviR). V73 and A105 are instead substituted by Q and L, respectively; these substitutions are rather conserved and specific for the subfamily of PAB LuxR solos (highlighted in cyan in
Figures 3 and
4).
Regarding Cluster 3, the residues A49, Q58 and F62 are highly substituted in the PAB LuxR solos subfamily (highlighted in green in
Figures 3 and
4) similar to what has been found in the QS LuxRs. In contrast, Y53 is highly variable within the PAB LuxR solos subfamily members (
Figure 4), while it is conserved in a number of QS LuxRs (
Figure 3).
Details of the residue types and their frequencies for the residues belonging to each of the three clusters, both in QS LuxRs and in PAB LuxR solos, are summarized in
Table 1.
The three dimensional architecture of the boundaries of the ligand-binding site of the QS LuxRs is outlined in
Table 2.
The contribution of the three clusters to the binding site topology of QS LuxRs and of PAB LuxR solos can be seen in
Figure 5.
The residues belonging to each of the above described three clusters have been mapped on TraR (PDB_ID 1H0M) [
24] and on the homology model of OryR regulatory domains (
Figure 5a,b) in order to obtain the cartography of their respective ligand-binding sites: the resulting comparison (
Figure 5c,d) indicates a tripartite architecture.
Firstly, a shared part (conserved core), delimited by the floor and the distal wall (residues 70, 71, 72, 85, 110, 113, 129), appears to be crucial for the folding mechanism of the regulatory domain in the presence of the signal molecule, regardless of its chemical nature. Secondly, a specific part (specificity patch), mainly delimited by the roof and the nearby regions of the proximal and distal walls (residues 57, 61, 73, 101, 105), is conserved only within the QS LuxRs or within the PAB LuxR solos subfamily members, respectively. It is therefore likely that the selectivity of the LuxR family and of the PAB LuxR solos subfamily towards diverse ligands is modulated by these residues. In all the experimental structures analyzed, these are the residues that interact with the lactone ring of the AHL, whereas the PAB LuxR solos do not bind AHLs. Finally, a variable part (variability patch), delimited by the proximal wall and the nearby regions of the roof and of the floor (residues 49, 53, 58, 62), is less conserved even within the members of the QS LuxR family or of the PAB LuxR solos subfamily. It is interesting to note that in all the experimental structures analyzed, these residues interact with the fatty acyl side chain moieties of the AHLs, which are found to adopt different positions, orientations and conformations. Therefore, they are likely to be responsible for the different selectivity towards molecules belonging to the same family of ligands, or for modulating the degree of “promiscuity” towards members of the same family of compounds.
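The relationship between the three sequence-based clusters and the three topological parts of the binding site can be made explicit with a small lookup. The residue numbers (TraR numbering) are those given in the text; the variable and function names are hypothetical and serve only to illustrate the bookkeeping.

```python
# Residue sets as listed in the text (TraR numbering).
CLUSTERS = {
    "Cluster 1": {57, 61, 70, 71, 85, 113},
    "Cluster 2": {72, 73, 101, 105, 110, 129},
    "Cluster 3": {49, 53, 58, 62},
}
PATCHES = {
    "conserved core": {70, 71, 72, 85, 110, 113, 129},
    "specificity patch": {57, 61, 73, 101, 105},
    "variability patch": {49, 53, 58, 62},
}


def lookup(residue_number, groups):
    """Return the name of the group containing the residue, or None."""
    for name, members in groups.items():
        if residue_number in members:
            return name
    return None


# Every clustered residue is assigned to exactly one topological part.
all_clustered = set().union(*CLUSTERS.values())
all_patched = set().union(*PATCHES.values())
assert all_clustered == all_patched
assert sum(len(members) for members in PATCHES.values()) == len(all_patched)

for residue in sorted(all_clustered):
    print(residue, lookup(residue, CLUSTERS), "->", lookup(residue, PATCHES))
```

The cross-tabulation shows that the conserved core and the specificity patch each combine residues from Clusters 1 and 2, whereas the variability patch coincides with Cluster 3.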
In order to corroborate the results of this analysis, the shape and the physico-chemical properties of the ligand-binding sites were evaluated. As shown in
Figures 6 and
7, comparing the overall shape and the electrostatic and lipophilic potentials, respectively, of the ligand-binding sites of QS LuxRs confirms the tripartite topology previously outlined (mapping the conserved core in yellow, the specificity patch in magenta and the variability patch in orange). Furthermore, the physico-chemical properties mapped on the ligand-binding site of the OryR model reveal an increased negative potential (
Figure 6) and a decreased hydrophobicity (
Figure 7) in comparison to QS LuxRs, which most likely accounts for the differences in the selectivity of the PAB LuxR solos subfamily with respect to QS LuxRs.
An additional validation was carried out by comparing the topochemical preferences of the binding sites of the QS LuxR and PAB LuxR solos prototypes using SITEHOUND [
37], which exploits favorable interactions of three different structural probes (methyl carbon, aromatic carbon and hydroxyl oxygen). This analysis discloses clear differences between the prototypes of the two families (
Figure 8). Indeed, the comparison of the binding sites of OryR and of the QS LuxR prototype shows that the former prefers hydroxyl groups rather than aliphatic and/or aromatic groups, thereby providing further support for the molecular determinants responsible for the differences in selectivity of the PAB LuxR solos towards specific host plant signal molecules rather than towards canonical quorum sensing ligands.
3. Experimental Section
Sequence alignment was performed by Expresso [
31], which exploits structural alignment algorithms such as SAP [
38] or TMalign [
39] to generate structure-based alignments that are used as a template for realigning the original sequences.
All the QS LuxR complexes deposited in the PDB, both those solved by X-ray crystallography,
i.e., TraR from
Agrobacterium tumefaciens in complex with OC8-HSL (PDB_ID 1L3L [
23], PDB_ID 1H0M [
24]), TraR from
Sinorhizobium fredii NGR234 in complex with OC8-HSL (PDB_ID 2Q0O [
25]); LasR from
Pseudomonas aeruginosa in complex with OC12-HSL (PDB_ID 2UV0 [
26], PDB_ID 3IX3), with TP1 (PDB_ID 3IX4 [
27]), with TP3 (PDB_ID 3IX8 [
27]) and with TP4 (PDB_ID 3JPU [
27]); QscR from
Pseudomonas aeruginosa in complex with OC12-HSL (PDB_ID 3SZT [
28]); CviR from
Chromobacterium violaceum in complex with C6-HSL (PDB_ID 3QP1 and PDB_ID 3QP6 [
29]), in complex with OC8-HSL (PDB_ID 3QP2 [
29]), in complex with OC10-HSL (PDB_ID 3QP4 and PDB_ID 3QP8 [
29]) and in complex with an antagonist chlorolactone (PDB_ID 3QP5 [
29]) and by NMR,
i.e., SdiA from
Escherichia coli in complex with OC8-HSL (PDB_ID 2AVX [
30]), have been superimposed using PyMOL [
33] and analyzed using the Protein Interfaces, Surfaces and Assemblies (PISA) web tool at the European Bioinformatics Institute [
34].
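As a rough illustration of how ligand-contacting residues can be listed from atomic coordinates, the sketch below flags every residue that has at least one atom within a distance cutoff of any ligand atom. This is a simplified stand-in and not the interface analysis performed by PISA; the 4.0 Å cutoff, the function name and the toy coordinates are assumptions made for the example.

```python
import math


def contact_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted labels of residues with any atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) tuples.
    ligand_atoms: iterable of (x, y, z) tuples.
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue)
    return sorted(contacts)


# Toy coordinates in angstroms (hypothetical, not from any PDB entry).
protein = [
    ("W57", (0.0, 0.0, 0.0)),
    ("Y61", (3.0, 0.0, 0.0)),
    ("A49", (10.0, 0.0, 0.0)),
]
ligand = [(1.0, 0.0, 0.0)]
print(contact_residues(protein, ligand))
```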
Homology-based protein modelling has been performed on the full-length amino acid sequence of the OryR protein from
Xanthomonas oryzae using five molecular modelling strategies based on different criteria for template selection. SWISS-MODEL [
40] performs a search in a library of experimental protein structures extracted from the PDB: up to five template structures per batch are superposed using an iterative least squares algorithm generating a structural alignment after removing incompatible templates, improved by a heuristic step after the calculation of a local pair-wise alignment of the target sequence to the main template structures [
41]. ModWeb [
42] depends on the large scale protein structure modeling pipeline, ModPipe, which performs a search in a set of non-redundant chains extracted from structures in the PDB and establishes sequence-structure matches using multiple variations of sequence-sequence, profile-sequence, sequence-profile and profile-profile alignment methods [
43–
45]. M4T (Multiple Mapping Method with Multiple Templates) [
46] is based on two major modules, Multiple Templates (MT) and Multiple Mapping Method (MMM) [
47], developed to produce accurate alignments and models by minimizing the errors associated with the first two steps of the modeling procedure (template recognition and alignment). HHpred (Homology detection & structure prediction by HMM-HMM comparison) [
48] implements pairwise comparison of profile hidden Markov models (HMMs) to generate pairwise query-template alignments or multiple alignments of the query with a set of templates selected from the search results. I-TASSER (iterative threading assembly refinement) [
36] generates three-dimensional atomic models from multiple threading alignments and iterative structural assembly simulations.
The five top-scored models generated have been ranked and validated by two protein model quality predictors, ProQ [
49] and AIDE [
50], which have different and often complementary abilities to assess the quality of protein structures; their combined use can therefore increase the reliability of model quality evaluation. The resulting outputs were consistent, identifying the top-scored model (confidence score 0.64) produced by I-TASSER, based on the QscR template (PDB_ID 3SZT [
28]), as the most reliable candidate. Indeed the correctness of the selected model was confirmed by ProQ [
49] (having Predicted LGscore of 4.299 and Predicted MaxSub of 0.437) and its overall best quality was validated by AIDE [
50] (with a Predicted TM-score of 0.69 and a Predicted RMSD of 6.97).
Electrostatic potential calculations were performed with PDB2PQR [
51] and visualized with PyMOL [
33] on the ligand surface in the ligand-binding sites of the QS LuxRs and on the cavity surface of the ligand-binding site of the OryR model. Lipophilic potential representation, based on the hydrophobicity scale derived by Black and Mould [
52], was performed with PyMOL [
33] on the cavity surface of the ligand-binding site of the QS LuxRs and of the OryR model.
The binding site preferences of the QS LuxR and PAB LuxR solos prototypes, TraR and OryR respectively, were analyzed with SITEHOUND [
37] employing three different structural probes (
i.e., methyl carbon, aromatic carbon and hydroxyl oxygen).