2.1. Sequence Analysis Links Evolutionary Patterns in SARS-CoV Spike Proteins with Shared Conserved Hotspots at the Binding Interface
To quantify these observations, we performed sequence alignment analysis and highlighted differences between SARS-CoV and SARS-CoV-2 sequences (
Figure 2). ConSurf approach has shown a robust performance in evolutionary analysis of proteins and the latest version of this method [
38,
39] was used in our study. The evolutionary analysis yielded an overall sequence identity of 78%. The functional domains of the SARS-CoV sequence showed some differences, revealing that the S1 subunit had ≈66% identity, while S2 fusion domain is considerably more conserved with 92% identity. The sequence conservation of S1 RBD yielded ≈73% identity, while the RBM region involved in direct interactions with the ACE2 receptor displayed a more significant variability as similarity between SARS-CoV and SARS-CoV-2 sequences dropped to ≈50% (
Figure 2). This analysis is fully consistent with previous studies [
40,
41,
42] suggesting that a significant sequence variability of the SARS-CoV interacting residues from RBM motif may be accompanied by the corresponding local conformational changes. Structural studies confirmed these assertions revealing small but important changes in the RBM residues of SARS-CoV-RBD and SARS-CoV-2-RBD proteins [
21,
22,
23,
24,
25,
26]. We utilized two different sequence conservation measures Consurf score [
38,
39] and KL sequence conservation score [
43,
44,
45,
46,
47] to identify regions of high conservation and characterize in some details the extent of conservation and variability of the binding interface residues (
Figure 4). We analyzed the results of sequence analysis in the context of structural and functional information, particularly to quantify the conservation patterns of the hotspot residues and uncover evolutionary origins of SARS-CoV-2 binding selectivity with ACE2. Structural studies established that the core of the SARS CoV spike protein is formed by a five-stranded antiparallel β sheet (β1 to β4 and β7), with three short connecting α helices exhibiting significant sequence conservation (
Figure 1). The disulfide bonds C323-C348, C366-C419 (in atom numbering of SARS-CoV-RBD, pdb id 2AJF) are formed by fairly conserved cysteine residues (
Figure 4A,B). Interestingly, one of the bridges C467-C474 residues in the binding region and is anchored by a two-stranded β sheet (β5 and β6) with these cysteine residues showing a moderate variability (
Figure 4A). In particular, we further examined the sequence conservation profile of the RBM region (residues 424–494) that displayed a significant variability. Notably, the RBM region is tyrosine rich (Y436, Y438, Y440, Y442, Y454, Y475, Y481, Y484 in SARS-CoV-RBD). While many of these residues make interactions with ACE2, only two sites Y442 and Y484 are replaced in the SARS-CoV-2-RBD by L455 and Q498, respectively (
Figure 3B,C). The sequence analysis confirmed that the majority of RBM tyrosine residues are conserved. However, Y442 and Y484 positions yielded positive ConSurf conservation scores, indicating that these positions are variable and may contribute to the binding affinity differences between SARS-CoV-RBD and SARS-CoV-2-RBD complexes with ACE2 (
Figure 4A). We specifically highlighted Consurf and KL conservation score values for the RBD residues, including those positions that differ between SARS-CoV-2 and SARS-CoV proteins (
Figure 4A,B). As expected, the substituted interacting residues in the RBM region displayed an appreciable sequence variability. Moreover, the binding interface residues that differ between SARS-CoV-2-RBD and SARS-CoV-RBD form clusters of variable positions (corresponding to high positive Consurf scores and, respectively, low KL scores). Hence, despite structural similarities of the binding interfaces, the individual modifications in the multiple variable positions may lead to a number of moderate interaction changes that could collectively bring about a significant cumulative contribution and contribute to the binding selectivity of the SARS-CoV-2-RBD protein with ACE2 host receptor.
According to our results, sequence analysis using ConSurf and KL evolutionary conservation metrics yielded a consensus by identifying several binding interface residues as conserved hotspots. Indeed, both methods pointed to a group of conserved hotspot residues from the RBM region including K390, V404, Y440, N473, Y475, and Y481 sites in SARS-CoV-RBD (
Figure 4A,B,
Table 1). The strategically located Y440, N473, Y475, and Y481 hotspot residues anchor the RBM region and are shared between SARS-CoV-RBD and SARS-CoV-2-RBD proteins (
Table 1). Some of these residues belong to a conserved segment 472-LNCYWPL-478 in SARS-CoV (486-FNCYFPL-492 in SARS-CoV-2) in which Y475 position (Y489 in SARS-CoV-2) emerged as a central binding energy hotspot (
Figure 4A). It is worth noting that segment 472–486 forms a loop that is not present in certain S-proteins of coronavirus isolated in bats [
40]. This loop extends the interaction area between SARS-CoV-2-RBD and human ACE2 and can contribute to binding interactions. The conserved segment is flanked by F486 in SARS-CoV-2 which is a conservative replacement of L472 in SARS-CoV and is implicated as an important contributor of binding affinity [
14]. The original structural studies of SARS-CoV spike protein in the complex with ACE2 suggested that residues L472, N479, and T487 that are immediately adjacent to the conserved patch could be critical for binding selectivity [
14,
24,
25,
26]. These positions in SARS-CoV-2 correspond to F486, Q493, and N501 residues involved in critical interactions of SARS-CoV-2-RBD with ACE2 (
Figure 3A,D).
The structure of ACE2 consists of the N-terminus (residues 19–102, 290–397, and 417–430) and C-terminus subdomain II (residues 103–289, 398–416, and 431–615) that form the opposite sides of the active site cleft [
48,
49,
50,
51]. The conservation patterns for ACE2 using both metrics revealed a cluster of conserved residues in the subdomain I (
Figure 4C,D). The ACE2 interaction sites with SARS-CoV-RBD proteins involve α-helices of ACE2 (residues 20–52, 56–86, 324–331) and β-strand (residues 341–361). We quantified the interface contacts between SARS-CoV-RBD/SARS-CoV-2RBD and ACE2 using PRODIGY approach [
52,
53] that counts the number of interatomic contacts at the interface of a protein–protein complex within a 5.5 Å distance threshold (
Tables S2 and S3). The list of the intermolecular contacts showed that the total number of contacts made by SARS-CoV-2 RBD with ACE is only moderately greater than the ones formed by SARS-CoV-RBD (
Tables S2 and S3). Among ACE2 residues involved in the interfacial contacts several interacting residues are relatively conserved (F28, Y41, Y83, G326, N330), while some other interfacial sites (T27, D30, K31, H34, D38) are more variable (
Figure 4C,D). The hydrophobic ACE2 residues buried in the central segment of the binding interface tend to be more conserved, where residues F28, L79, Y83 form contacts with Y489, N487, and F486 of SARS-CoV-2 (
Figure 3C,D). Another important conserved ACE2 site corresponds to Y41 position that together with Y505 in SARS-CoV-2-RBD “sandwich” the recognition hotspot bridges K353-D38 (
Figure 3A). At the same time, some other ACE2 residues at the interface periphery could be more variable and tolerant to conserved amino acid substitutions. These results are consistent with deep mutagenesis studies, where a number of ACE2 variants with the increased binding to the RBD were identified, suggesting that sequence of human ACE2 is only suboptimal for binding of SARS-CoV spike proteins [
28]. This experimental study discovered feasibility of hydrophobic substitutions at T27 position, potential of D30E modification, and tolerance to aromatic substitutions at K31 position. Consistent with these revelations, sequence analysis showed that these ACE residues featured positive Consurf scores for T27, D30, and K31 pointing to their evolutionary variability (
Figure 4C,D).
To summarize, the results indicated that several conserved sites could form important recognition hotspots of the SARS-CoV binding interface with ACE2. We also proposed that binding free energy differences between SARS-CoV and SARS-CoV-2 RBD may be influenced by cumulative contributions of large number of variable positions. Although sequence analysis can identify important variable RBM regions implicated in binding, the extent of sequence variability may not necessarily directly correlate with the functional importance of respective residues for binding selectivity. In general, sequence analysis demonstrated that the extensive binding interface between SARS-CoV-RBD and ACE2 proteins can involve contributions of both conserved and more variable positions that are broadly distributed and collectively drive the binding mechanism in which sequence variability and structural plasticity can be important for rendering binding selectivity of SARS-CoV-2-RBD protein.
2.2. Coevolution and Mutual Information Interdependences of the SARS-CoV-RBD and ACE2 Suggest Potential Regulator and Transmitter Sites of Binding and Communication
The important complementary information can be gained by considering coevolutionary dependencies of protein residues in which functional and structural changes are often accompanied by compensatory mutations at the protein–protein interfaces to modulate activity. Coevolving residues can mediate protein recognition in multiprotein complexes and are often spatially close to each other, forming clusters of interacting residues that are located near functionally important sites [
54,
55,
56,
57]. The coevolving residues may be also clustered in mobile regions surrounding structurally stable functional sites and form interaction networks of evolutionarily coupled residues that facilitate protein conformational changes [
58]. Using coevolutionary residue matrices determined by the shared mutual information, we constructed the network of coevolutionary couplings in the SARS-CoV-RBD complexes in which the nodes represented protein residues and links corresponded to coevolutionary dependencies (or extent of shared mutual information) between these residues [
59]. For each residue, we computed cumulative mutual information (cMI) parameter, that evaluates the shared mutual information between a given residue and other protein residues, and proximity-based mutual information parameter (pMI) approximating the amount of mutual information shared by a given residue with the structurally close neighboring nodes [
43,
44]. Using this approach, we characterized mutual information interdependencies between functionally important regulatory sites and identified clusters of residues involved in coevolutionary couplings. The computed sequence-based cMI and structure-based pMI parameters (
Figure 5) provided several important insights that expanded our understanding of evolutionary traits at the binding interface regions. According to our results, the high pMI interfacial residues are largely conserved and are aligned with the central segment of the interface for SARS-CoV-RBD (Y436, Y440, Y475, W476, Y481, Y484) and SARS-CoV-2-RBD (Y449, Y451, Y489, F490, Y495, Q498) (
Figure 5A,C). It is also evident that high pMI sites are enriched by conserved tyrosine residues located at the intermolecular interface. Owing to their conservation, these high pMI positions are located at the epicenter of coevolutionary couplings and are characterized by local structural environment that is enriched by coevolving residues (
Figure 5A,B). According to our previous studies [
59,
60,
61] high pMI positions are often aligned with the structurally stable regulatory centers that dictate which channels of signal transmission are activated in coevolutionary networks. The results of the current analysis suggested that these conserved interface residues may be required for recognition during virus entry into the host receptor and structural stabilization of the binding interface around these key anchor points. Structural positions of these sites in the central segment of the binding interface may provide the necessary “stabilization anchors” that hold SARS-CoV-RBD and SARS-CoV-2 RBD proteins in the recognition mode. We observed that a number of more variable binding residues in the vicinity of the “anchor” sites displayed high cMI values, indicating that the interfacial regions structurally proximal to more conserved central residues may leverage coevolutionary dependencies to facilitate binding with ACE2 (
Figure 5C). Many residues from the recognition loop of the RBM (residues 470–491 in SARS-CoV-2) featured high cMI values (
Figure 5B,
Table 2) showing that this flexible region is enriched by sites sharing mutual information and therefore may be critical for signal transmission and binding. Among high cMI residues in the SARS-CoV-RBD are L472, N479, and T487 (F486, Q493, and N501, respectively, in SARS-CoV-2-RBD) (
Figure 5C,
Table 2). These RBM residues showed appreciable sequence variability and were also aligned with the cMI peaks, (
Figure 5C,
Table 2). Furthermore, these coevolving RBM residues form a network of interactions with Y41, K353, E37, D38, E35, and K31 residues that also displayed high cMI values in ACE2 (
Figure 5D).
Our findings support the assertion that these clusters of coevolving mobile residues are important for binding and could be implicated in the transmission of virus signals. By mapping residues featuring high pMI and cMI values, we observed that structurally stable and conserved residues Y475, W476, L478 in SARS-CoV (Y489, F490, and L492 in SARS-CoV-2) are located in the middle of the binding interface and could form a cluster of high pMI sites (
Figure 6). The proximity-based coevolutionary signature of these sites occupying strategic interface positions may be associated with their functional role as regulators of signaling between interacting proteins in the coevolutionary network.
This specific signature of functional sites could make them indispensable for the integrity of coevolutionary interaction networks, which may cause even minor mutations at these positions to be highly detrimental. We argue that mutations in these conserved tyrosine positions may interfere with both the stability and recognition of the virus protein and drastically change binding for both SARS-CoV spike proteins. We observed that highly coevolving residues in SARS-CoV-RBD tend to occupy the mobile loops of the binding interface and may be involved in the conformational and interaction rearrangements during binding with ACE2 (
Figure 6). These residues are only moderately conserved and could act as flexible carriers of allosteric signals between interacting proteins. Structural mapping highlighted a dense cluster of RBD residues from the mobile interacting loop that is enriched by coevolving residues in SARS-CoV-2-RBD (
Figure 6). A cluster of highly coevolving centers can be also seen in the β-hairpin region of ACE2 (residues 348–360) as residues R357, D355, and K353 feature high cMI values and may be involved in the transmission of the binding signal. Based on our observations, this cluster of highly coevolving interacting residues in the β-hairpin region of ACE2 may be involved in propagation of functional signals across the binding interface and serve as a mediating cluster for virus transmission. In this model, the flexibility of highly coevolving residues sharing a significant degree of mutual information may allow for compensatory mutations in the interfacial regions to secure structural features that control allosteric signaling between interacting proteins in the complex. We argue that the observed interplay of evolutionary and coevolutionary signatures for the binding interface residues may contribute to their distinct functions as regulators and transmitters of allosteric signaling and govern binding selectivity of the SARS-CoV-2 protein.
2.3. Conformational Dynamics of the SARS-CoV and SARS-CoV-2 RBD Proteins: A Comparative Analysis of ACE2-Induced Protein Mobility and Dynamics-Driven Allostery
To explore conformational landscapes of the SARS-CoV interactions with human ACE2 in sufficient details, we performed both atomistic MD simulations (
Figure 7) and coarse-grained (CG) simulations (
Figure 8) of the SARS-CoV and SARS-CoV-2 structures in their unbound and bound forms. All-atom MD simulations of the SARS-CoV structures in different functional states can provide a rigorous quantitative assessment of atomistic fluctuations and interaction details. However, these simulations can become time-consuming when performed for multiple SARS-CoV structures and complexes under various simulation conditions and environments. To complement all-atom MD simulations and compare quantitative and qualitative insights gained from molecular simulations, we also performed multiple CG simulations of the SARS-CoV and SARS-CoV-2 structures and their complexes with ACE2 (
Figure 8). Among objectives of this analysis was to analyze the extent to which CG simulations can capture and reproduce both quantitative and qualitative patterns in the atomistic dynamics and therefore represent a powerful complementary alternative for molecular studies of the SARS-CoV-2 binding and interactions.
Using these simulation approaches applied to the structures of SARS-CoV and SARS-CoV-2 RBD proteins in the unbound and ACE2-bound forms, we performed a detailed comparative analysis of the conformational dynamics profiles and examined salient features of the dynamic conformational landscapes. In MD simulations of the SARS-CoV-RBD with ACE2, the structure of the complex fluctuated within fairly narrow RMSD values until it reached a plateau after about 300 ns and continued to maintain this equilibrium through the remaining simulation time (
Figure S1A). The structural integrity and thermal stability of the SARS-CoV-2-RBD complex with ACE2 during MD simulations was retained with RMSD < 4.5 Å from the starting crystal structure (
Figure S1B). The simulations reached the first plateau with RMSD ≈3.5 Å after ≈300 ns followed by the increased fluctuations at ≈600 ns and subsequent thermal stabilization with RMSD ≈4.5 Å from the crystal structure (
Figure S1B). Atomistic simulations revealed a dynamic nature of the SARS-CoV/SARS-CoV-2 structures that maintained a considerable degree of residual mobility (
Figure 7). CG simulations displayed generally more stable profiles of the SARS spike protein and ACE2 in their unbound forms, pointing to the increased stabilization of the binding interface residues in the complexes (
Figure 8). The conserved core of SARS-CoV-RBD consists of five antiparallel β strands (β1 to β4 and β7), with three connecting α-helices (
Figure 1). Atomistic MD simulations showed that these regions (residues 341–344, 364–367, 383–390, 493–500, 394–403, 431–438 in SARS-CoV-RBD) are stable and structurally virtually indistinguishable between the SARS-CoV and SARS-CoV-2 proteins (
Figure 7A,B).
Nonetheless, both all-atom simulations and CG simulations (
Figure 8A,B) revealed some moderate thermal motions in the core of SARS-CoV-RBD and displayed greater structural rigidity for core regions in the SARS-CoV-2-RBD complex with ACE2. The antiparallel β-sheets (β5 and β6) (residues 439–441 and 478–481 in SARS-CoV-RBD and residues 451–454 and 492–495 in SARS-CoV-2-RBD) that anchor the RBM region to the central core showed a particularly significant stabilization, that was especially noticeable in all-atom MD simulations (
Figure 7A,B). By anchoring the RBM region, these strands ensure proper complementarity of the concave surface with the ACE2 enzyme. The thermal stability of these regions is preserved in both the unbound spike forms and complexes with ACE2 enzyme, suggesting that the structural integrity of these antiparallel β-sheets is a prerequisite for establishing functional interfaces with host receptor. All-atom MD simulations also indicated that stability of the protein core in the unbound and bound forms of SARS spike proteins can be comparable. Moreover, smaller fluctuations were seen for the unbound structure in the loop regions that surround the protein core and are not involved in ACE2 interactions (residues 345–382 in SARS-CoV) (
Figure 7A,B).
The CG simulation profiles showed that the protein core of five antiparallel β strands (β1 to β4 and β7) can be comparable and even more stable in the unbound SARS-CoV form while the RBM region (residues 423–494 in SARS-CoV and residues 435–506 in SARS-CoV-2) is more flexible (
Figure 8A,B). Both atomistic and CG simulations indicated that this region is flexible in the unbound forms of SARS spike proteins, but undergoes stabilization in the ACE2-bound complexes. However, the pattern of the increased stabilization in the RBM region is more revealing in all-atom MD simulations that unambiguously pointed to the greater rigidity of this region in the SARS-CoV-2 RBD complex with ACE2 (
Figure 7A,B). Atomistic simulations also revealed important subtle differences in the mobility of RBM residues that are not immediately apparent from CG simulations, indicating that atomistic resolution of thermal fluctuations and interaction contacts may be critical for quantitative understanding of the dynamic differences between structurally similar SARS-CoV-RBD and SARS-CoV-2 RBD proteins. In the SARS-CoV complex with ACE2, several regions from the RBM (residues 425–440, 480–494) displayed fluctuations that were similar in the unbound and bound protein forms, indicating that the interactions in this region with ACE2 are moderate and could tolerate fluctuations in this region (
Figure 7A). In the complex, binding contacts may only partly reduce flexibility of SARS-CoV at the interfacial loop 455–476 that still retains a considerable degree of residual mobility (
Figure 7A). Interestingly, the conformation of this segment in the SARS-CoV complex is identical to the one adopted in the unbound SARS-CoV form. Hence, the RBM regions in the SARS-CoV complex with ACE2 may be characterized by only a partial reduction of mobility at the binding interface (
Figure 7A). The RBM loops in SARS-CoV-RBD (residues 461–471, 474–476, and 480–491) experienced appreciable fluctuations in the complex with ACE2, particularly loop 480–491 displaying higher fluctuation with respect to the rest of the RBD structure. These RBM regions in the SARS-CoV-2 RBD protein (residues 474–485, 488–490, and 494–505) featured noticeably smaller fluctuations, especially for the ridge loop 494–505 that becomes largely immobilized upon gaining productive stabilizing contacts with the host receptor ACE2 (
Figure 7B). A generally similar trend could be also inferred from CG simulations where the RBM interface residues in the SARS-CoV-RBD experienced thermal fluctuations above the rigidity threshold of 1.0 Å (
Figure 8A) while the corresponding positions in the SARS-CoV-2-RBD complex become immobilized with RMSF fluctuations of 0.7–0.8 Å (
Figure 8B).
Importantly, both coarse-grained and atomistic simulations consistently revealed a stronger stabilization of the RBM residues for SARS-CoV-2 spike protein in the complex with ACE2 (
Figure 7B and
Figure 8B). In this case, the interfacial loop segments (residues 436–455 containing an important motif 444-KVGGNYNY-451 and residues 470–506 that feature the interacting motif 495-YGFQPTNG-502 in SARS-CoV-2-RBD) displayed significantly reduced fluctuations as compared to SARS-CoV Spike RBD (
Figure 7B). As a result of the host receptor-induced stabilization, a significant number of SARS-CoV-2 residues across the entire interface (G446, Y449, F486, Y489, G496, Q498, T500, N501, and G502) become rigidified and are positioned to make strong specific interactions with ACE2. In addition, a portion of the interfacial loop (residues 473–487) that is mobile in the unbound form undergoes a significant stabilization and adopts a different conformation in the SARS-CoV-2-RBD as compared to the SARS-CoV-RBD complex (
Figure 9). These changes position SARS-CoV-2 residues Y473, A475, G476 to establish contacts with ACE2. Moreover, SARS-CoV-2 residues Y489 and F486 pack against F28, L79, M82, and Y83 on ACE2 to form a critical patch of hydrophobic interactions at the interface (
Figure 3D). Importantly, not only these residues showed markedly reduced fluctuations, but the large interfacial stretch of residues across entire binding interface (residues 486-FNCYFPLQSYGFQ-498) including key Q493 and Q498 interacting sites exhibited a stronger stabilization as compared to their respective counterparts in SARS-CoV (
Figure 7A,B). Indeed, the corresponding residue stretch 472-LNCYWPLNDYGFY-484 in the SARS-CoV displayed a greater mobility and is marked by numerous amino acid changes between SARS-CoV-2 and SARS-CoV (F486 vs. L472, F490 vs. W476, Q493 vs. N479, S494 vs. D480, Q498 vs. Y484, respectively). Hence, atomistic simulations established that significant structural similarity between SARS-CoV and SARS-CoV-2 bound complexes with ACE2 may be accompanied by subtle and yet noticeable dynamic changes that may be important for driving binding selectivity preferences of SARS-CoV-2 RBD protein. Hence, atomistic and CG simulations indicated that the RBM residues in the SARS-CoV-2 RBD protein undergo a more significant structural stabilization upon ACE2 binding as evident by marked differences in the fluctuations of the unbound and bound spike protein. In some contrast, the corresponding dynamic changes between the unbound and bound forms of the SARS-CoV-RBD are markedly smaller, suggesting that a considerable degree of protein mobility can be retained in the complex, including some variability at the binding interface.
Conformational dynamics profiles of ACE2 residues were very similar in the unbound and bound forms as evident from both atomistic simulations (
Figure 7C,D) and CG simulations (
Figure 8C,D). The protein core residues displayed significant stability and only small changes were observed for the binding interface residues in ACE2. These segments of ACE2 were stable in both unbound and bound forms (
Figure 7C,D). The two ridges of the RBM interact with the ACE2 receptor by contacting the inter-helical loops on one side and a unique β-hairpin (residues 348–360) on the other side of the interface. The small portion of the β-hairpin turn containing hotspot residue K353 may be relatively flexible in the unbound ACE2 form when structural restraints from viruses are absent. Among important observations revealed by both atomistic and CG simulations is a noticeable redistribution of conformational dynamics in the SARS-CoV and SARS-CoV-2 structures upon complex formation with ACE2. The exchange of conformational mobility is manifested by stabilization of the ACE-2 bound RBM regions and the increase of thermal motions in the rest of the protein. Moreover, such redistribution of thermal fluctuations is more pronounced in SARS-CoV-2-RBD that is coupled with a stronger stabilization of the entire binding interface. These findings imply that binding of SARS-CoV-2 spike with the host receptor may exhibit signs of a dynamically driven allosteric mechanism which is typically exemplified by the lack of structural changes between the unbound and bound forms, coupled with an exchange of conformational mobility between local protein regions [
62,
63,
64,
65]. This hypothesis requires more extensive work and detailed analysis of the collective dynamics and binding-induced changes in the organization and modularity of dynamic interaction networks for the unbound SARS-CoV-RBD and ACE proteins and complexes. This extends beyond the scope of the present study and will be presented elsewhere. In this context, our most recent study of the SARS-CoV-2 spike trimers using perturbation analysis and network modeling presented evidence that the SARS-CoV-2 spike protein can function as an allosteric regulatory engine that fluctuates between dynamically distinct functional states [
37].
To determine dynamic patterns of the key binding interface contacts, we monitored the interfacial residue contact stability during all-atom MD simulations by computing the productive contact time throughout the MD trajectory of the SARS-CoV-RBD complex with ACE2 (
Table S4) and SARS-CoV-2 RBD complex (
Table S5). This analysis revealed the larger number of persistent stabilizing contacts in the SARS-CoV-2 RBD complex with ACE2, where most of the interfacial interactions remained stable throughout the entire simulation period (
Table S5). In some contrast, some interfacial contacts in the SARS-CoV-RBD complex could form, break, and then reestablish during the simulation period (
Table S4). Of particular interest is dynamics of contacts for binding hotspots that involve K31 and K353 residues of ACE2. In the ACE2 enzyme, K31 forms the intramolecular salt bridge with E35 while K353 forms a salt bridge with D38. These intramolecular bridges are preserved in the complex with SARS-CoV-RBD. In the SARS-CoV complex with ACE2, K31 hotspot in ACE2 maintains stable interactions with Y442 and Y475 of SARS-CoV-RBD (
Table S4). Strikingly, the corresponding positions in the SARS-CoV-2 RBD (L455 and Y489, respectively) are unable to establish persistent stable contacts with K31 even though simulations registered a close structural overlap for corresponding sites in the SARS-CoV-2 and SARS-CoV-RBD. The loss of the intramolecular salt bridge K31-E35 broken in the structure of the SARS-CoV-2 complex with ACE2 is counterbalanced by new contacts made by E484, Q493, F456, and F490. Several of these contacts are more persistent in their duration (E484-K31 and Q493-K31), while interaction contacts with hydrophobic residues F456 and F490 are more dynamic (
Table S4). Indeed, the interaction between Q493 of SARS-CoV-2 and K31 hotspot of ACE2 was reported in the experimental studies [
15]. These results are consistent with recent MD simulations of the SARS-CoV-RBD complexes [
66].
We also report the residue solvent-accessible surface (SASA) in the SARS-CoV-RBD and SARS-CoV-2 RBD complexes that were obtained by averaging the SASA computations over the length of MD simulation trajectories (
Figure S2). This analysis helped to evaluate the average buriedness of the interfacial residues in the course of simulations. Among deeply buried residues with small SASA in the SARS-CoV complex are binding interface positions Y440, Y442, L443, F460, Y481, Y484 residues (
Figure S2A). At the same time, a number of residues in this region can be partially exposed. Consistent with dynamics profiles, the antiparallel β-sheets β5 and β6 (residues 439–441 and 478–481) in the central segment of the interface showed relatively small SASA ≈15–20 Å
2, although K439 and R441 featured larger SASA values of 69.7 and 61.9 Å
2, respectively (
Figure S2A). In the central region of the interface residues V404, Y442, F460 are deeply buried, while several other positions (W476 and N479) showed appreciable solvent accessibility. These antiparallel β-sheet regions in the SARS-CoV-2 (residues 451–454 and 492–495) are well buried at the interface with SASA values ≈5–10 Å
2 (
Figure S2C). The RBM residues in the central segment (K417, L456, F456, Y473, F490, and Q493) showed markedly smaller SASA values than their counterparts in SARS-CoV, particularly for F456 and Q493 that are completely buried (
Figure S2A,C). These observations reinforced conformational dynamics analysis, suggesting that the entire binding interface is considerably more rigidified in the SARS-CoV-2 complex owing to structural restrictions to maintain multiple favorable contacts with ACE2. Of particular notice is a comparison of residues L472, N479, and T487 (SARS-CoV) with corresponding positions F486, Q493, and N501 residues in SARS-CoV-2 that showed dynamic differences and are implicated in critical interactions that determine binding preferences of the SARS-CoV-2 complex. In the SARS-CoV complex, N479 residue is fairly exposed with large SASA value of 90.6 Å
2, whereas the respective site Q493 in SARS-CoV-2 is completely buried with SASA of 0.5 Å
2. Other residues in this region have similar SASA values, being either completely buried (T487 → N501) or showing an appreciable and similar solvent accessibility of ≈70 Å
2 and dynamic mobility (L472 → F486). Although this analysis indicated the importance of N479 → Q493 modification and a unique role of Q493 as a strong driver of binding selectivity, we also noticed a broad range of dynamic and solvent exposure variations among RBM residues. These results provided an additional support to our hypothesis that binding affinity of SARS-CoV-2 with ACE2 may arise from a cumulative contribution of many interfacial residues with different dynamic and solvent exposure profiles.
In summary, the central finding of the conformational dynamics analysis is the evidence of a much broader and consistent stabilization across the entire RBM interface for the SARS-CoV-2-RBD protein. According to these results, subtle but important differences in mobility of the binding interfaces for SARS-CoV-2 may not be confined to several spatially localized binding hotspots, but instead appeared to be more broadly distributed across many RBM residues. This may be contrasted with SARS-CoV-RBD dynamics patterns in which ACE2 binding may induce thermal stability and strong interaction contacts in the specific hotspots, while other interface patches could retain considerable mobility and exhibit weaker intermolecular contacts. Based on these observations, we suggested that the stronger binding affinity and the associated virus transmission may arise from cumulative dynamic and energetic changes of many SARS-CoV-2 residues at the binding interface with ACE2 enzyme.
2.4. Mutational Scanning and Energetic Analysis of the Binding Interfaces: A Cooperative Effect of Multiple Residues Drives SARS-CoV-2 Binding with Host Receptor
Based on the detailed analysis of the conformational dynamics, we conjectured that the energetic characterization of the binding interface residues using ensemble-averaged estimates could better quantify binding differences between SARS-CoV-RBD and SARS-CoV-2 RBD. We employed the equilibrium ensembles generated from all-atom MD simulations of the SARS-CoV and SARS-CoV-2 complexes with ACE2 to perform a systematic alanine scanning of the protein residues in the complexes (
Figure 10). In particular, we examined the binding free energy changes of the interface residues to determine the hotspot residues and identify differences in the binding interactions between SARS-CoV and SARS-CoV-2 spike proteins. Sequence conservation analysis and conformational dynamics pointed to potentially critical residues at the binding interface that are different between SARS-CoV and SARS-CoV-2 (
Figure 3,
Table S1). Strikingly, these residues are stretched across the entire interface and include the following important modifications between SARS-CoV and SARS-CoV-2 at the N terminus (R426 → N439, Y484 → Q498, T487 → N501), in the central bridge of the interface (V404 → K417, Y442 → L455, L443 → F456, F460 → Y473, W476 → F490, and N479 → Q493), and in the C terminus (L472 → F486) (
Figure 2). While previous studies have highlighted the importance of these positions, the mechanistic role and relative contributions of these residues to protein stability and binding selectivity remain debatable and require further investigation. Additionally, there are several strategically located conserved positions in the middle of the binding interface of SARS-CoV and SARS-CoV-2 (Y440 → Y453 and Y475 → Y489) and N-terminus segment (Y491 → Y505) that are likely to contribute to both protein stabilization and binding affinity of spike proteins.
To aid in a comparative analysis of the binding free energy changes, we performed alanine scanning using dynamic ensembles of multiple SARS-CoV-2 structures bound with ACE2 (
Figure 10), including the crystal structure of SARS-CoV-2 spike RBD bound with ACE2 (pdb id 6M0J) [
24], the crystal structure of SARS-CoV-2 chimeric RBD bound with human ACE2 (pdb id 6VW1) [
25], and the structure of ACE2 with B
0AT1 and RBD of SARS-CoV-2 (pdb id 6M17) [
26]. First, we found that the strongest binding energy hotspots are conserved between SARS-CoV and SARS-CoV-2 proteins and are located in the central segment of the interface (Y440 → Y453 and Y475 → Y489) and N-terminus segment (Y491 → Y505) (
Figure 10). These residues are virtually superimposable in the crystal structures and form a very similar number of favorable hydrophobic contacts with ACE2 (
Figure 3A,C). Indeed, a large and comparable loss of the binding energy was found for these residues in both SARS-CoV and SARS-CoV-2 (2.82/2.85 kcal/mol for Y475A/Y489A; 1.31/1.91 kcal/mol for Y440A/Y453A; 1.78/1.78 kcal/mol for Y491A/Y505A, respectively) (
Figure 10A,B). Importantly, these conserved binding energy hotspots also correspond to the regulatory sites in coevolutionary residue networks that may control signal propagation across binding interfaces. Other important binding residues in the SARS-CoV RBD showing a significant energy loss upon alanine modifications included Y442, N473, and Y484 positions (
Figure 10A). In these positions, Y442A change of 2.15 kcal/mol can be compared with a smaller loss for L455A of 1.62 kcal/mol, while L472A with 0.9 kcal/mol change can be contrasted to a larger loss for F486A of 1.68 kcal/mol (
Figure 10A,B). Interestingly, in several other important positions, binding free energy losses caused by alanine mutation can decrease in going from the SARS-CoV-RBD to SARS-CoV-2 RBD. For instance, the energy change upon N473A mutation of 1.6 kcal/mol can be opposed to the N487A loss of 1.13 kcal/mol, and Y484A mutational change of 1.99 kcal/mol can be compared to Q498A energy loss of 1.75 kcal/mol (
Figure 10A,B). The computed binding energy changes for the SARS-CoV-2 RBD structures consistently predicted residues Q493 and N501 as the strongest energetic hotspots (
Figure 10B–D). The important difference in the binding energies resulted from amino acid modification of N479 in SARS-CoV to Q493 in SARS-CoV-2. The binding free energy loss of 2.52 kcal/mol for Q493A mutation can be contrasted to a much smaller change of 0.82 kcal/mol for N479A (
Figure 10A,B). The error bars for the binding free energy changes were relatively small (≈0.05–0.15 kcal/mol). Another key difference may arise from T487 → N501 substitution, where the binding energy loss of 2.48 kcal/mol for N501A is markedly larger than the change of 1.32 kcal/mol for T487A mutation in SARS-CoV (
Figure 10A,B). More importantly, the binding energy hotspots of SARS-CoV-2 can be better supported by structurally proximal residues that collectively contribute to the broader binding interface as compared to SARS-CoV-RBD. The key hotspot Q493 in the central segment of the interface is supported by residues K417, Y453, L455, and F456 that contribute significantly to the binding energy of SARS-CoV-2 (
Figure 10B–D). Another notable difference is between F456 in SARS-CoV-2 and corresponding L443 in SARS-CoV. The binding energy difference of 2.18 kcal/mol for F456A may be contrasted to only 1.06 kcal/mol for L443A mutation in SARS-CoV. The more favorable interactions could be also seen for the residue stretch 501-NGVGY-505 surrounding N501 hotspot in SARS-CoV-2 as compared to corresponding region 487-TGIGY-491 in SARS-CoV (
Figure 10A,B). The important conclusion of our analysis is that the binding energy contributions appear to be distributed more broadly and densely across the SARS-CoV-2 RBD binding interface. In some contrast, the binding energy of the SARS-CoV RBD may be determined by several sparsely localized patches of the interface residues.
Our results also provided some useful insights to the binding mechanism based on rearrangements of two virus-binding hotspots defined by a salt bridge between K31 and E35 and a salt bridge between K353 and D38 in the SARS-CoV-RBD (
Figure 11A,B). According to the performed energetic analysis, the network of SARS-CoV-2 RBM interactions formed by the middle segment of the interface (K417, Y453, L455, F456, and Q493) with K31 and E35 is vastly different and considerably stronger than the corresponding interactions made in the SARS-CoV complex (
Figure 10). The loss of salt bridge K31-E35 broken in the SARS-CoV-2 RBM complex is counterbalanced not only by the hydrogen bonding interactions with Q493 but is also reinforced through an extensive contact network with other RBM residues (
Figure 11C,D). The energetic analysis revealed that the binding energy hotspots on ACE2 corresponded to residues T27, F28, K31, H34, Y41, and Y83 displaying consistently larger mutation-induced free energy changes in the SARS-CoV-2 RBM complex (
Figure 10B–D). Hence, the virus-binding hotspot on ACE2 formed by K31, H34, and E35 may be an important driving force of binding selectivity for SARS-CoV-2 RBD (
Figure 11C,D). Another hotspot on ACE2 is K353-D38 salt bridge that is surrounded by hydrophobic walls formed by Y41 and D37 residues [
67]. This hotspot is supported by T487 and Y491 in SARS-CoV-RBD (
Figure 11A,B) and by N501 and Y505 in SARS-CoV-2-RBD (
Figure 11C,D). Although this hotspot is important for recognition of spike protein, the binding energy contributions of the corresponding residues in the SARS-CoV-RBD (
Figure 10A) and SARS-CoV-2 RBD (
Figure 10B–D) are relatively moderate and similar, suggesting that this hotspot may be less critical for driving binding selectivity of SARS-CoV-2-RBD.
We also conducted a systematic alanine scanning of the RBD spike protein residues in which FoldX energy function was used to estimate protein stability changes in all studied complexes (
Figure S3). As expected, the protein stability scanning profiles were generally similar for SARS-CoV (
Figure S3A) and SARS-CoV-2 residues (
Figure S3B–D). However, the results also indicated the greater stability of the binding interface residues from RBM, displaying larger individual peaks and also showing stronger stability of the interfacial residues surrounding the key hotspots (
Figure S3B). Collectively, our results pointed to several interesting trends: (a) the strongest binding energy hotspots are conserved between SARS-CoV and SARS-CoV-2 proteins (Y440 → Y453, Y475 → Y489 and Y491 → Y505); (b) the largest energetic differences may be attributed to the key energy hotpot residues Q493 and N501 in SARS-CoV-2 RBD; (c) SARS-CoV-2 binding selectivity preferences could be strongly influenced by the virus-binding hotspot K31 and H34 in the middle of the interface through an extensive interaction network with K417, Y453, L455, F456, and Q493; (d) the binding energy contributions are broadly and densely distributed across many interfacial residues of the SARS-CoV-2 RBD supporting the key hotspot positions. The results suggested that the differences in the binding affinities of SARS spike proteins may arise from a cumulative contribution of small energetic changes distributed across the entire interface rather than from localized changes in several binding energy hotspots.