4.1. The Problem
The replication of RNA-containing viruses is generally error-prone, and in the case of picornaviruses nearly every act of template copying may be associated with the acquisition of a mutation [
1,
2,
3]. This error-proneness is mainly due to the low fidelity of the viral RNA-dependent RNA polymerases (RdRP) [
26,
27,
28] and the lack (with very few exceptions) of proofreading mechanisms. This infidelity is not an inherently incorrigible property of RdRP, since their fidelity can be markedly enhanced by various point mutations [
29,
30,
31,
32,
33]. Somewhat counterintuitively, an increase in fidelity may result in a decreased viral fitness [
29,
34,
35,
36], and hence the frequency of RdRP-made errors appears to be evolutionarily tuned.
To prevent or diminish the potential harm of replicative infidelity, viruses should possess a significant degree of mutational tolerance. This tolerance is largely due to the degeneracy of codons, the phenotype-neutral character of many amino acid substitutions, the ability of diverse sequences in RNA regulatory elements to maintain analogous mutual orientations, and the functional equivalence of certain nucleotides in these elements [
9]. Though the general importance of these factors for counteracting replicative infidelity is well appreciated, only rather limited information is available on the quantitative aspects of the mutational tolerance of distinct viral functions.
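The contribution of codon degeneracy to this tolerance can be illustrated with a minimal sketch that enumerates every single-nucleotide substitution in a nonanucleotide and classifies its outcome under the standard genetic code. The sequence used here (`ACAGGAAAA`, encoding Thr-Gly-Lys) is a hypothetical example chosen only for illustration and is not taken from any poliovirus genome; the function names are likewise our own. The classification says nothing about viability, which can only be established experimentally.

```python
from itertools import product

# Standard genetic code in UCAG order (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}


def translate(rna):
    """Translate an in-frame RNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def classify_point_mutations(rna):
    """Count synonymous, missense, and nonsense single-nucleotide substitutions."""
    wild_type = translate(rna)
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for pos, old in enumerate(rna):
        for new in BASES:
            if new == old:
                continue
            mutant = translate(rna[:pos] + new + rna[pos + 1:])
            if mutant == wild_type:
                counts["synonymous"] += 1
            elif "*" in mutant:
                counts["nonsense"] += 1
            else:
                counts["missense"] += 1
    return counts


# Hypothetical nonanucleotide encoding Thr-Gly-Lys (an assumption for illustration).
EXAMPLE = "ACAGGAAAA"
counts = classify_point_mutations(EXAMPLE)
print(translate(EXAMPLE), counts)
```

For this assumed sequence, only a minority of the 27 possible substitutions are synonymous, so degeneracy alone cannot account for all of the tolerance; the remainder must come from phenotypically acceptable amino acid replacements.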
4.2. Mutational Tolerance of the TGK Motif and the oriL/3CD Interaction
The interaction between
oriL and 3CD is an essential step of poliovirus genome replication [
13,
17,
18,
20,
37]. This interaction involves the tetraloop of domain
d of
oriL and the TGK tripeptide of the 3C moiety of 3CD [
16,
18,
19,
20,
21,
22,
38]. Notably, the TGK motif is highly conserved in 3C proteins of members of the
Enterovirus C species, though the position of Thr, the most abundant residue, can also be occupied by Val, Ile, or Met [
38]. This study provides insights into quantitative and mechanistic aspects of the mutational tolerance of the genome regions controlling the structure of these ligands.
As summarized in
Figure 8, at least 11 nucleotide mutations out of 27 possible in the TGK-encoding nonanucleotide are compatible with viral viability. Taken together with our previous observation of the mutational robustness of domain
d of
oriL [
20], this means that at least 34 point mutations out of 51 possible in the two-segment, 17-nt-long stretch of RNA (octanucleotide of domain
d and nonanucleotide of the 3C gene) are not lethal. If a second point mutation in the TGK-encoding motif is allowed (such mutations could well be already present in the quasispecies populations), then the number of the viability-compatible substitutions in it would reach at least 19 (
Figure 8) and the whole space of the permitted nucleotide replacements in the 17-nt-long stretch of RNA would rise to at least 42. Additionally, the tripeptide can sustain no fewer than 11 amino acid replacements, six at position 154 and five at position 156.
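The counts quoted above follow from simple bookkeeping, which the short sketch below makes explicit. Each nucleotide position admits three alternative bases, so the nonanucleotide and the octanucleotide offer 27 and 24 single substitutions, respectively. The number of tolerated substitutions in domain d (23) is not stated directly in this section; it is inferred here as the difference between the combined figure (34) and the TGK figure (11). All variable names are our own.

```python
ALTERNATIVES_PER_POSITION = 3  # each nucleotide can change to three other bases

tgk_length = 9        # nonanucleotide encoding the TGK tripeptide
domain_d_length = 8   # octanucleotide of domain d of oriL

tgk_possible = tgk_length * ALTERNATIVES_PER_POSITION
domain_d_possible = domain_d_length * ALTERNATIVES_PER_POSITION
total_possible = tgk_possible + domain_d_possible

# Lower bounds taken from the text.
tgk_viable_single = 11
total_viable_single = 34
tgk_viable_with_second_mutation = 19

# Inferred (assumption): viable substitutions in domain d = combined minus TGK.
domain_d_viable = total_viable_single - tgk_viable_single
total_viable_with_second_mutation = domain_d_viable + tgk_viable_with_second_mutation

# Amino acid replacements tolerated at positions 154 and 156.
aa_replacements = {154: 6, 156: 5}
total_aa_replacements = sum(aa_replacements.values())

print(f"TGK: {tgk_viable_single}/{tgk_possible} single substitutions tolerated")
print(f"17-nt stretch: {total_viable_single}/{total_possible} "
      f"({total_viable_single / total_possible:.0%})")
print(f"With second mutations in TGK: "
      f"{total_viable_with_second_mutation}/{total_possible}")
```

Because all the input figures are lower bounds ("at least"), the resulting fractions should also be read as lower bounds on the tolerated mutational space.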
Although not lethal, the amino acid replacements in 3C had different effects on fitness. A significant proportion of them did not demonstrate, in our in vitro experiments, any marked adverse effects. Other mutations negatively affected the oriL/3CD interaction to different degrees, with some of them bringing the virus to the verge of catastrophe. However, even in the most debilitating cases, the surviving viruses have a resilience tool: the infidelity of RNA replication, which results in the acquisition of reversions or compensatory mutations.
If there exists such a variety of structures of the relevant tripeptide in protein 3C with apparently more or less equal phenotypic impacts, why is TGK so strictly conserved in wild-type polioviruses? It may be speculated that the laboratory assays do not completely reflect the fitness of circulating viruses. It should also be taken into account that, even in tissue culture experiments, the competitive capacity of the relevant mutant viruses has not been assayed.
Though we are focusing here on the direct interaction between the tetraloop of domain
d of
oriL and the TGK motif of 3CD, it should be kept in mind that both the RNA and protein partners of this interaction have several separate functions and that the formation of the
oriL/3CD complex involves several other viral and cellular participants and is significant not only for RNA replication but for translation as well [
39,
40,
41]. The viability-compatible mutations identified in this study may affect some of these activities, but such effects, if any, are evidently not lethal to the virus.
4.3. Possible Mechanistic Features of the oriL/3CD Interaction
Although the significance of the
oriL/3CD interaction for viral RNA replication is well established, detailed information about the mechanistic aspects underlying their mutual affinity is lacking. The results reported here and in our previous paper [
20], though insufficient to suggest a specific molecular model of this interaction, may nevertheless contribute to the development of such models in the future.
In particular, the requirements for distinct amino acids at positions 154–156 of poliovirus 3C have been partially defined. Although all full wild-type poliovirus genomes in the NCBI database have TGK in the corresponding region, the tripeptide could endure numerous modifications (
Figure 8) either without any appreciable loss of fitness or with some debilitating but still viability-compatible effects. Only the central Gly
155 appeared to be indispensable, although, admittedly, no exhaustive attempts to prove this were undertaken. The strong requirement for this residue may be related to its position in the loop between the two β-strands [
42]. Gly is frequently found in loops because it provides high flexibility to peptide chains and is often conserved as a structure determinant [
43]. “Good” residues at position 154, Val and Ile, share with the wild-type Thr a methyl group at the β carbon atom, hinting that this group may be involved in a hydrophobic interaction. The debilitating effect of Ser
154 is in line with this assumption. On the other hand, Cys
154, which was also able to confer a stable wild-type phenotype, has an SH group at the β carbon. It is tempting to assume that this distinction is responsible for the weak interaction of the CGK-containing 3CD with domain
d in the EMSA assay (
Figure 7). The discrepancy between this inefficiency and functional competence in RNA replication (
Figure 6) may be due to the presence of two neighboring Cys residues (see above).
When discussing the phenotypic effects of 3C mutations, additional possibilities to accomplish the
oriL/3C interaction, e.g., via another RNA-binding motif of 3C, KFRDI at positions 82–86 [
22,
44], should be taken into account. Adaptive changes of Pro
88 into Ser, Thr, or Leu observed in several viruses with unfavorable tripeptides at positions 154–156 (
Table 7) may presumably be linked to the proximity of position 88 to Tyr
6 and His
89, involved in
oriL recognition [
45]. Pro
88, being located in a small helix, can affect the orientation of the neighboring His
89, which is known to interact with Tyr
6 of 3C [
42], the distance between their aromatic rings being 3.31 Å (
Figure 9A), which is typical of stacking. These two residues have also been reported to be involved in the
oriL/3C interaction [
45] and are highly conserved in polioviruses [
38]. The close proximity of His
89 to TGK (6.43 Å and 10.66 Å to Lys
156 and Cys
153, respectively) and Tyr
6 to Gly
155 (5.28 Å) points to possible effects of substitutions in CTGK on the mutual orientation of His
89 and Tyr
6, which could be compensated by substitution of Pro
88 by the more conformationally flexible Ser, retaining the His
89/Tyr
6 interaction in RNA recognition.
For full functionality, position 156 could be occupied not only by the wild-type Lys but also by the positively charged Arg, whereas the negatively charged Glu at this position was lethal, suggesting that an electrostatic interaction contributes to the tetraloop/3CD affinity. It may also be noted that the Lys
156Ala replacement was reported to inhibit the capacity of 3CD to stimulate uridylylation of VPg [
42], which is known to depend on the
oriL/3CD interaction [
16]. The lack of a positive charge at position 156 (in mutants with TGS and TGM tripeptides) could be partially compensated by the appearance of such a charge (e.g., in Arg) at position 153, just preceding the relevant tripeptide. Of note, Lys153 has almost the same steric potential to interact with RNA ligands as Lys156, as follows from the comparison of crystal structures of TGK-containing (poliovirus) and KIGQ
156-containing (rhinovirus A2) 3C proteins: Lys
153 of rhinovirus exposes its positive charge to the same surface area as Lys
156 of poliovirus, though this area in the former 3C has a somewhat lower overall positive charge, due to a lesser abundance of basic amino acids [
42,
46] (
Figure 9, compare panels (B) and (C)).
It is not clear whether the debilitating effects of certain “poor” residues in the relevant tripeptide were linked to the disappearance of distinct RNA-protein interactions directly involving these residues or to changes in the protein conformation and solubility. In the latter case, the possibility of dynamic changes of this conformation to modulate its functionality should be considered. It may be worth remembering that the functionally optimal conformation of the tetraloop of domain d of
oriL could be provided by different sequences of the YNMG consensus, and it has been proposed that certain non-YNMG sequences are able to temporarily acquire a YNMG-like conformation as a result of molecular dynamics, thereby acquiring some level of functionality [
20].