Capturing a Crucial ‘Disorder-to-Order Transition’ at the Heart of the Coronavirus Molecular Pathology—Triggered by Highly Persistent, Interchangeable Salt-Bridges

The COVID-19 origin debate has greatly been influenced by genome comparison studies of late, revealing the emergence of the Furin-like cleavage site at the S1/S2 junction of the SARS-CoV-2 Spike (FLCSSpike) containing its 681PRRAR685 motif, absent in other related respiratory viruses. Being the rate-limiting (i.e., the slowest) step, the host Furin cleavage is instrumental in the abrupt increase in transmissibility in COVID-19, compared to earlier onsets of respiratory viral diseases. In such a context, the current paper entraps a ‘disorder-to-order transition’ of the FLCSSpike (concomitant to an entropy arrest) upon binding to Furin. The interaction clearly seems to be optimized for a more efficient proteolytic cleavage in SARS-CoV-2. The study further shows the formation of dynamically interchangeable and persistent networks of salt-bridges at the Spike–Furin interface in SARS-CoV-2 involving the three arginines (R682, R683, R685) of the FLCSSpike with several anionic residues (E230, E236, D259, D264, D306) coming from Furin, strategically distributed around its catalytic triad. Multiplicity and structural degeneracy of plausible salt-bridge network archetypes seem to be the other key characteristic features of the Spike–Furin binding in SARS-CoV-2, allowing the system to breathe—a trademark of protein disorder transitions. Interestingly, with respect to the homologous interaction in SARS-CoV (2002/2003) taken as a baseline, the Spike–Furin binding events, generally, in the coronavirus lineage, seems to have preference for ionic bond formation, even with a lesser number of cationic residues at their potentially polybasic FLCSSpike patches. The interaction energies are suggestive of characteristic metastabilities attributed to Spike–Furin interactions, generally to the coronavirus lineage, which appears to be favorable for proteolytic cleavages targeted at flexible protein loops. The current findings not only offer novel mechanistic insights into the coronavirus molecular pathology and evolution, but also add substantially to the existing theories of proteolytic cleavages.


Introduction
There has been a dramatic shift [1] in the latest COVID-19 research from its early chapters (2019-20 → 2020- 21), brought about by epidemiological and evolutionary (genome comparison) studies of late [2,3], reporting the presence of an arginine-rich polybasic Furin-like cleavage site at the Spike-S1/S2 junction of SARS-CoV-2 (FLCS Spike ), absent teins. Most structures of Spike glycoproteins solved to date are in their pre-fusion state, which is its active state, relevant for both the FLCS Spike -Furin and the RBD Spike -hACE2 interactions. The pre-fusion forms, aimed to address different specific research queries related to the coronavirus molecular pathology, are either mutated or abrogated at the FLCS Spike region (i.e., the Spike-S1/S2 junction) or otherwise engineered with stabilizing mutations [2,24,42], aimed at obtaining overall structural information. However, even with these stabilizing modulations, all cryo-EM coronavirus Spike structures lack experimental (primary) data, and, hence, atomic coordinates for their FLCS Spike patches (missing stretches of 10-12 amino acids [43]) harboring the polybasic activation loop ( 681 PRRAR 685 ) in SARS-CoV-2 Spike, and, equivalent homologous pentapeptide sequence motifs in other coronavirus Spikes. The pre-fusion structure of SARS-CoV-2 Spike (PDB ID: 6XR8) [24] could decipher the structure of 25 ordered peptides downstream of S2 hitherto unreported including residues of N terminus, several peripheral loops and glycans [24]. Even this most insightful high resolution structural study, which could reveal the distinct conformational states (pre-and post-fusion forms) of the SARS-CoV-2 Spike, was unable to ascertain the conformation of the surface exposed disordered FLCS Spike loop harboring the Spike-S1/S2 junction [24]. The region is further clearly and categorically declared as a "surface exposed disordered loop" [24]. The presence of such 'disordered activation loops' (also called 'natively unstructured loops' [44] or, less technically, 'flexible' loops/regions) at proteolytic sites is common to many protease families (e.g., Caspases, Furins, Rho-activated kinases etc.) and in spite of decade(s)-long controversies are still assumed to serve as key structural and kinetic determinants of protease substrates [44]. The loops, often making the substrates more susceptible to protease binding and cleavage [33], proteasomal degradation [45], phospho-regulation and consequent priming of viral pathogenesis [46] are believed to have been evolved often from sensitive and fragile globular-disorder intermediates such as coiled-coil assemblies [47]. Having said that, there is little experimental structural information that is available on the cleavage loops for our current subjects, the coronavirus Spikes.
The loop-disorder is further supported by extensive (coarse-grained) multi-microsecond molecular dynamic (MD) simulations of the representative SARS-CoV-2 Spike structures (6VXX, closed state; 6VYB: partially open state, ectodomain) [40] with modeled FLCS Spike loops (residue ranges: 676-690) [48] that show violation of ergodic hypothesis [40], hinting towards a flexible, albeit biased, conformation of the activation loop for its function. In other words, during the entire course of these long MD simulations, the loops remained largely unstructured [40], sampling several outwardly extended conformations making them attractive and accessible cleavage sites for Furin and other pro-protein convertases. Thus, the loop disorder does not appear to be short-termed, but rather dynamically sustained in the free form of the Spike. As an alternative and arguably a complementary approach, the same paper further models an ensemble of loops ab initio [40] via RosettaRemodel [49] from the closed state Spike structure (6VXX). Subsequent to the modeling, followed by clustering and sorting based on energy, the lowest energy conformation was retained and further refined by kinetic closure [50]. In the conclusive remarks [40], the authors comment that the ab initio modeling indicated that there might be formation of short helices near the cleavage site of the loop; however, these observations are not really supported by any data and/or geometric analysis, such as the Ramachandran plot. The very fact that the FLCS Spike (in its 681 PRRAR 685 ) contains pockets of heavily localized positive charge clouds (a common cause of protein disorder [51]) makes it naturally and intrinsically prone to structural disorder. This makes the analysis even more interesting and non-trivial, and further implies that there needs to be a 'disorder-to-order transition' [51] of the said loop (i.e., the FLCS Spike region) upon interaction (or binding) with Furin. That is because the binding needs to be strong enough to sustain the disordered substrate (FLCS Spike ) jammed into metastable intermediate conformation(s) that support the efficient cleavage of the Spike-S1/S2 junction by the Furin catalytic triad. Recent experimental studies primarily based on functional assays have referred to the 'ostensibly unresolved' FLCS Spike as a 'protruding out loop-like structure' in the exterior [35] of the SARS-CoV-2 Spike, away from the RBD Spike . Others [52] have completely neglected the plausible conformational variation (entropy) of the missing patch(es) in their molecular docking and dynamics studies. Thus, little has been genuinely explored structurally on the Spike-Furin interaction. Given this background, here, in this paper, we present a rigorous structural dynamics study with strong theoretical rationales to penetrate deeper into the Spike-Furin interaction in SARS-CoV-2 in an atomistic detail. To the best of our knowledge, we are the first group to report on the key 'disorder-to-order transition' occurring at the SARS-CoV-2 Spike-Furin interface. Based on present knowledge and understanding, the revealed transition, together with the disorder, intrinsic to the CoV-2 FLCS Spike [24], should find its place at the very heart of the coronavirus molecular pathology. We further demonstrate that the involved 'disorder-toorder transition' in CoV-2 is triggered and sustained by highly persistent interchangeable salt-bridge dynamics, which is often characteristic to protein disorder transitions [53,54]. We also present a novel application of the legendary Ramachandran plot [55] in probing the said 'transition' at the CoV-2 Spike-Furin interface. The conclusions should also hold true for lately evolving CoV-2 variants [30,32].

Details and Rationales of Experimental Structural Templates Used for Docking and Dynamics
Following earlier studies on the use of the SARS-CoV-2 Spike glycoprotein [17,56,57], the cryo-EM structure of its pre-fusion form (PDB ID: 6XR8, 2.9 Å, 25,995 protein atoms [24]) was used as its representative structure (receptor). For the host protease, Furin (ligand), we first made a thorough structural survey of its domains and enzyme active sites, especially that of the catalytic domain and triad). There are only a few experimental (X-ray) structures of Furin to be found presently in the Protein Data Bank [43] (PDB ID: 1P8J, 5JXH) which are of equivalent resolutions with insignificant structural deviations (C α -RMSD: 0.31 Å) upon alignment. More importantly, the region spanning catalytic residues in both are well conserved. 1P8J, (resolution: 2.6 Å), the first Furin structure ever to be solved for Furin [34], is undoubtedly the most studied and best characterized Furin structure, and is also insightful in terms of proteolytic cleavage mechanisms [52,[58][59][60][61]. It is for these reasons that 1P8J was chosen as the representative host protease (Furin) structure in the current study. Again, 1P8J has 8 identical chains (pairwise RMSD < 0.2 Å) in its asymmetric unit, combining to as many as 11 bio-assemblies. The majority of the bio-assemblies (bioassemblies: 3-10) are monomeric, which is the interactive molecular species (or, functionally effective bio-assembly) involved in the Spike-Furin interaction [52]. This made us further retain the first chain (chain A) of 1P8J alone.
Further, to set an appropriate baseline for the Spike-Furin interaction in SARS-CoV-2, a second round of docking was performed by taking the same ligand (Furin: 1P8J) and docking it onto the representative homologous Spike structure (PDB ID: 7AKJ) [62] of SARS-CoV (2002. This cryo EM structure was solved in the process of exploring 'the convalescent patient sera option' for the (what seemed then) possible prevention and treatment of COVID-19, as a natural extension of SARS [63]. The neutralizing antibody Fab fragments (chains: D-H & L in 7AKJ), used to pull down the SARS-CoV Spike, binds at the top of the Spike canopy or the Spike S1 subunit involved in receptor binding [62] which is situated way too far from the Spike-S1/S2 interface (harboring its FLCS Spike ) to all for any realistic interference with the Spike-Furin binding. The Spike-Furin guided docking in SARS-CoV (to be discussed in Section 2.3.3) was followed by repeating all subsequent baseline structural dynamics calculations for the Spike-Furin complex. To set this appropriate baseline is indispensable, particularly for the quantitative interpretation of equilibrium thermodynamic parameters of the Spike-Furin binding, due to be discussed in later sections (Section 2.6).

Dataset of Coronavirus Spikes
For an evolutionary analysis of sequence-based disorder predictions, a second dataset was compiled, consisting of 44 experimentally-solved (exclusively cryo-EM) structures (Table S1, Supplementary Materials) of the pre-fusion form CoV/CoV-2 Spike culled at a resolution of 'not worse than 3 Å' from the PDB. These coronavirus Spike structures are solved to serve different specific research objectives at various pH and other varying physico-chemical conditions [19,62,64,65]. None of these structures were found with any experimental (primary) data for their FLCS Spike patch (located at their Spike-S1/S2 junctions) resulting in missing atomic coordinates (remarked under 'REMARK 465 in the corresponding PDB files) for the said patch (10-12 amino acids).

Modeling of Missing Disordered Loops in Spike
To explore the interaction dynamics between SARS-CoV-2 Spike (PDB ID: 6XR8) and Furin (1P8J), first we needed to have 'all-atom' atomic models for both partners. Furin (1P8J) is much smaller (468 residues) compared to the Spike homo-trimer (3 identical chains, with 1149 residues per chain) and contains no missing stretches in its X-ray structure except for the terminal of most residues. The Spike trimeric structure, however, being a large glycoprotein, characteristically contains missing stretches of residues (of roughly similar lengths) localized at strategic positions, adding up to missing coordinates for 42 amino acids in 6XR8. As discussed in the previous subsection (Section 2.1.2), to have such missing stretches of residues, especially at the Spike-S1/S2 junctions, is common to all Spikes (Table S1) irrespective of the particular coronavirus species. These missing stretches map to flexible [40], as well as disordered [24], loop protrusions resulting from the extended internal packing involved in the trimerization of the monomeric S units. The missing disordered stretches in the representative SARS-CoV-2 Spike pre-fusion structure (6XR8) were then modeled by MODELLER (v.10.1) [66] using its full-length (proteomic [67]) sequence (https://www.rcsb.org/fasta/entry/6XR8/display, (accessed on 15 November 2021)) obtained from the Protein Data Bank [43]. To account for the loop disorder in the modeled missing stretches (in 6XR8), an ensemble modeling approach was adapted. To that end, the 'automodel' module of MODELLER was implemented in an iterative cycle of 500 runs, producing that many conformationally non-redundant 'all-atom' trimeric Spike atomic models, only varying among themselves at the modeled missing stretches.
Missing residues for the baseline structure of SARS-CoV Spike (7AKJ) were also built in a similar fashion using MODELLER, though opting for a much more reduced space of conformational sampling (50 runs of 'automodel'). The lowest energy model among these (ranked by the all-atom Rosetta energy function, availed through Rosetta@home [68]) was retained as the representative 'all-atom' SARS-CoV Spike structure to be used for all subsequent baseline calculations. This sampling may be considered adequate to represent the mildly varying (reduced) conformational space of the missing FLCS Spike patch ( 664 SLLRSTS 670 , https://www.rcsb.org/fasta/entry/7AKJ/display, (accessed on 15 November 2021)) in 7AKJ, which is much shorter than its homologous missing patch in 6XR8 ( 677 QTNSPRRARSVA 689 ). The FLCS Spike in SARS-CoV is further composed mostly of either small-polar (serines, threonines) or hydrophobic (leucines) side chains with constrained rotameric variations. Noticeably, the single charged residue (R667) situated amidst the 7 residue missing patch in 7AKJ is conserved (as R685) in its evolutionarily descendant sequence in 6XR8 ( Figure S1, Supplementary Materials). course grain sampling and fine grain refinements. First, a grid-based rigid-body docking is performed between the receptor (static, fixed at the center of a cubic box) and the ligand (dynamic, placed on a movable grid) in PIPER [70] using its fast Fourier transform (FFT) correlation approach, sampling billions of conformations. The PIPER computed interaction energies (for poses sampled at each grid point) are produced in the form of correlation functions that make the scoring compatible with FFTs, rendering high numerical efficiency of sampling. The latest PIPER-derived scoring function is an upgraded variant of the original internal energy function (https://www.vajdalab.org/protein-protein-docking, (accessed on 15 November 2021)) which is based on the sum of terms representing shape complementarity, electrostatic, and desolvation contributions. A large number of unrealistic poses are then filtered out by PIPER and the 1000 lowest-energy poses are retained. These are then undertaken to RMSD -based structural clustering (using a relaxed 10 Å cutoff for iRMSD [71]), wherein the largest clusters [69] are retained, representing the most likely poses. While the largest clusters do not necessarily contain the most near-native poses, the success rate is generally quite high across families of protein complexes [69]. Usually ten or fewer clusters (~10-12 pose/cluster) are retained at this stage, adding up to 100-120 docked poses per run. The final refinement is then performed on these selected poses by rigorous rounds of energy minimization. The blind 'ensemble docking' performed between Furin (ligand) and each Spike model (receptor) does not impose any additional active-site or contact residue constraints. This resultant initial pool of Spike-Furin docked poses under each docked ensemble (i.e., for each Spike model) contained in them the ligand docked onto widely varying sites spread all over the trimeric Spike receptor, which were then pulled down into one unsorted set. Since the 'desired interface' would necessarily involve the FLCS Spike , the pentapeptide motif ( 681 PRRAR 685 ) was then used as 'contact residue filter' to discard obviously and/or trivially incorrect docked poses. To that end, buried surface areas (see Section 2.3.4.1) and shape complementarities (see Section 2.3.4.2) estimated at the desired docked interfaces were used to filter the 'plausible' docked poses in two successive rounds. These filtered 'plausible' poses were then further re-ranked by a carefully designed scoring function (see Section 2.3.4.1) optimally combining the steric effects of the two high-level structural descriptors. Thus, in effect, the blind docking could be made to work in a guided manner. The top ranked docked pose (RR1 CoV-2 ) obtained was further taken into long MD simulations (see Section 2.4) and subsequent analyses of its structural dynamics.

Cross-Validation by Guided Docking in 'Zdock + IRaPPA Re-Ranking'
The cluspro docking was further cross-validated by Zdock with its combined feature of IRaPPA re-ranking [72] implemented. To that end, the Spike chains (receptors) were extracted from the top (re-)ranked (see Section 2.3.1) cluspro docked pose, RR1 CoV-2 , and, Furin (1P8J, ligand) was docked onto it with the three arginines (R682, R683, R685, from the first of the three symmetry-related identical Spike chains) pertaining to the 681 PRRAR 685 motif (in FLCS Spike ) specified as contact residues on the receptor molecule. No contact residues were specified from the ligand molecule. The returned top ranked model (ZR1 CoV-2 ) was retained and further simulated (see Section 2.4) for all subsequent structural dynamics analyses. In continuation to the earlier discussion (in Section 2.1.1) on setting up appropriate baselines to interpret the Spike-Furin interaction in SARS-CoV-2, Furin (1P8J, ligand) was docked onto the SARS-CoV Spike (7AKJ, receptor) in Zdock (+IRaPPA re-ranking) in yet another independent docking exercise. To that end, a direct guided mode of docking (similar to ZR1 CoV-2 ) was adapted, wherein the whole FLCS Spike patch (residues originally missing in the experimental structure of 7AKJ) was specified as plausible contact residues on the Spike (receptor). This missing FLCS Spike patch, built by MODELLER (see Section 2.2) maps to residues 664-670 ( 664 SLLRSTS 670 ) and those pertaining to the first of the three symmetry-related identical Spike chains were specified as the Spike contact residues for Furin. Again, (in the same spirit to that of ZR1 CoV-2 ) no contact residues were specified from Furin. The returned top ranked model (ZR1 CoV ) was retained, simulated (see Section 2.4) and used as a baseline for all subsequent structural dynamics calculations, probing for the absence of the 'PRRAR' motif in FLCS Spike . For initial filtering of (Spike-Furin) docked poses (ClusPro 2.0) from the atomic accessible surface areas (ASA) were calculated by NACCESS [73], traditionally following the Lee and Richards algorithm [74] for all heavy atoms pertaining to each partner protein molecule (Furin and Spike) in their bound and unbound forms. For the unbound form, the two partner molecules in the bound docked-pose were artificially physically separated and each of them was considered independently, in isolation. The atomic accessibilities were then summed up for their source residues. For each (ith) residue belonging to a docked pose, two ASA(i) values were obtained, one for its bound form (ASA bound (i)) and the other for its unbound form (ASA unbound (i)). A residue falling in the interface would, thus, undergo a net non-zero change (∆ASA(i) = 0) in its two ASA(i) values. In other words, these would be the residues at the interface that gets buried upon complexation and are characterized by a net non-zero buried surface area (BSA(i) > 0).
For the case of Spike-Furin docking (SARS-CoV-2), BSA was used as an initial filter to select those interfaces alone that harbors the FLCS Spike loop. In the process, the obviously incorrect poses with Furin docked elsewhere in the Spike were discarded. This was achieved by monitoring BSA PRRAR , the summed up BSA for the residues belonging to the pentapeptide 681 PRRAR 685 motif. The poses that possessed BSA PRRAR > 0 were then filtered and retained from the initial pool of ab initio docked poses.
BSA PRRAR for this BSA filtered set of docked poses was further normalized by the total ∆ASA (summed over all residues pertaining to the docked pose) to render the normalized buried surface area for the docked pentapeptide surface patch (nBSA PRRAR ). The normalization can be expressed by the following equation.
where, although, 'resi' stands for all residues pertaining to the protein chain ('subscripted' to the corresponding ∆ASA sum-over term), it is the interfacial residues (∆ASA(i) = 0) that alone actually contribute to the denominator. The use of nBSA defined at protein-protein interfaces [75] can be considered analogous to that of the surface overlap parameter [76,77], which has been used extensively in tandem with shape complementarity to study packing within protein interiors. Wherever applicable, burial (bur) of solvent accessibility for a protein residue (X) was computed (following standard methods [76][77][78][79]) by taking the ratio of its ASA when embedded in the protein to that when in a Gly-X-Gly tripeptide fragment with its fully extended conformation.
The analysis of residue-wise burial was particularly intended to survey the Furin structure, in order to access its intrinsic propensity to bind to FLCSSpike-like disordered and/or flexible loops that harbor patches of highly localized and dense positive charge clouds (e.g., the 'PRRAR' pentapeptide motif).

Shape Complementarity
For a chosen docked pose that has passed the initial BSA filter (Section 2.3.4.1), the shape complementarity [80,81] at its interface was computed by the shape correlation (Sc) statistic originally proposed, formulated and programmed (as the sc program, part of the CCP4 package [82]) by Lawrence and Colman [80]. Sc is a correlation function defined in the range of −1 (perfect anti-complementarity) to 1 (perfect complementarity). It elegantly combines both the alignment as well as the proximity of interacting surfaces and is essentially local in nature (resulting from Van der Wall's packing). The higher the Sc, the tighter is the chain packing at the interface. Well-packed protein-protein interfaces (irrespective of their biological origin and size) usually hit a thin optimal range of Sc values (~0.55-0.75) [17,80].
For the case of Spike-Furin docking (SARS-CoV-2), Sc was computed for the BSA filtered (see Section 2.3.4.1) interfaces alone, which refers to those docked poses that harbor the FLCS Spike loop in its interface. Furthermore, for Sc, we narrowed down our observation window to the docked FLCS Spike loop alone, which was necessary and sufficient for the cause of scoring and re-ranking the initially selected (plausible) docked poses. To that end, Sc of the corresponding surface patch (Sc FLCS ) was computed against its docked surface patches (combined) coming from Furin. Sc is local in nature and can be directly computed for and segregated among pairs of interacting surface patches in multi-body interactions [76,77]. To render an accurate Sc FLCS for each chosen docked pose, the sc (CCP4) [82] input file was made to retain coordinates of the corresponding docked patch (resi. 677-688) alone for the interacting Spike chain, appended with coordinates for all atoms pertaining to the docked Furin molecule.

The Sdock Score
The S dock score, designed for the purpose of re-ranking of BSA filtered (plausible) ab initio docked poses was computed by the following equation.
where, w 1 (=5.78) appropriately weighted the two components (nBSA, Sc) of the score. In effect, the S dock score elegantly combined the surface fit and overlap at the Spike-Furin interface. The BSA filtered (plausible) poses were then scored and re-ranked by S dock . The design of S dock can be considered analogous to deriving the magnitude of the resultant of two mutually orthogonal vectors in 2D Euclidean space.

Molecular Dynamic Simulations
The top ranked docked SARS-CoV-2 Spike human Furin complex (RR1 CoV-2 ) consisting of 60224 atoms (inclusive of hydrogens) were used as the initial structural template for an explicit water all atom molecular dynamics (MD) simulation run. All MD simulations were performed using Gromacs v2021.3 [83] with OPLS-AA force field [84]. Force-field parameters for the surface-bound glycans of SARS-CoV-2 Spike (6XR8) were used directly from a recent earlier study on the same protein from this laboratory [17], while those for SARS-CoV Spike (7AKJ) were built in an identical manner using the glycoprotein builder at GLYCAM-Web (www.glycam.org, (accessed on 15 November 2021)) [85]. Cubic box of edge dimension 225.1 Å, solvated by a total (N sol ) of 348608 TIP3P water molecules, was used to solvate the protein complex with an application of periodic boundary conditions of 10 Å from the edge of the box. The system was then charge-neutralized (N ion ) with 21 Na + ions by replacing the TIP3P waters. The size of the hydrated system, thus, amounted to 899539 (protein+non-protein) atoms. Bond lengths were constrained by the LINCS algorithm [86] and all long-range electrostatic interactions were determined using the smooth particle mesh Ewald (PME) method [87]. Energy minimization was performed with the steepest descent algorithm until convergence (~1000 steps) with a maximum number of steps set to 50,000. All simulations were performed at 300K. Temperature equilibration was conducted by the isochoric-isothermal NVT ensemble (constant number of particles, volume, and temperature) with a Berendsen thermostat [88] for 100 ps. The system was then subjected to pressure equilibration in the NPT ensemble (constant number of particles, pressure, and temperature) for 100 ps using the Parrinello-Rahman protocol [89], maintaining a pressure of 1 bar. Coordinates were written subsequent to necessary corrections for periodic boundary conditions (PBC) using the GROMACS command 'gmx trjconv' using its -pbc option. Backbone RMSDs were computed using the GROMACS command 'gmx rms' and monitored throughout the trajectory. For RR1 CoV-2 , the production run was set to 300 ns, which may be considered sufficiently long, as the system showed convergence. For crossvalidation purposes, a second MD simulation was set up with ZR1 CoV-2 as the template (see Section 2.3.2) and run for 100 ns with identical cubic box dimensions, N sol and N ion . To set up appropriate baselines, yet another third independent simulation was set up and run for 100 ns using ZR1 CoV as the template (see Section 2.3.3) with cubic box of edge dimension 190.6 Å, solvated by 214,418 water molecules and charge-neutralized by 45 Na + ions. All three simulation trajectories attained equilibrium, ensuring seamless downstream calculations. For subsequent analyses of structural dynamics, structural snapshots were extracted at a regular interval of 10 ps from all three trajectories, leading to 30,000 snapshots for RR1 CoV-2 (the subject), 10,000 snapshots for ZR1 CoV-2 (the subject for cross-validation) as well as ZR1 CoV (the baseline). All simulations were performed on a local workstation with Gromacs v2021.3 [83], with CUDA acceleration v11.2 powered by an NVIDIA RTX 3080 GPU with 8704 CUDA compute cores, resulting in an average output simulation trajectory of~8.4 ns/day.

Identifying Salt-Bridges at the Spike-Furin Interface
For each selected Spike-Furin interface, which could either belong to the static ensemble of top ranked docked poses (see Section 2.3) or timeframes/snapshots pertaining to MD simulation trajectories (see Section 2.4) produced from top ranked selected docked poses, first, its interfacial inter-residue contact map was extracted. An interfacial inter-residue contact at the said interface was defined and detected when two heavy atoms, one coming from a Spike-and one coming from a Furin-residue was found within 4.0 Å of each-other.
Further, from this interfacial contact map, ionic bonds/salt-bridges were identified following standard definitions and computational techniques [53,54,90,91]. To that end, those inter-residue contacts were assembled and characterized as salt-bridges/ionic bonds, where two oppositely charged side chain heavy atoms, a nitrogen (N + ) and an oxygen (O -), coming from two different amino acid residues from the two molecular partners (Spike and Furin) were found within 4.0 Å of each-other. In a salt-bridge thus formed, the positively charged nitrogen refers to side chain amino groups (-NH 3 + /=NH 2 + ) of lysines/arginines/doubly protonated histidines (His+) while the negatively charged oxygen refers to side chain carboxylic groups (-COO − ) of glutamates/aspartates. Following standard analytical methods [53,54], first, all unique salt-bridges occurring at least once in the MD simulation trajectories were identified and accumulated. The dynamic persistence (pers) of each unique (non-redundant) salt-bridge was then calculated as the ratio of the number of structural snapshots to which the salt-bridge was found to form with respect to the total number of snapshots sampled (at regular intervals) in the trajectory. As has been already mentioned (Section 2.4), an interval of 10 ps was chosen, leading to 30,000 snapshots for the 300 ns trajectory (RR1 CoV-2 ) and 10,000 snapshots for the 100 ns trajectories (ZR1 CoV-2 : cross-validation, ZR1 CoV : baseline). Likewise, for the static ensemble compiled of the top-ranked 100 docked poses (RR1 CoV-2 to RR100 CoV-2 ), a static equivalent of persistence, namely, occurrence (occ) was procured for each salt-bridge occurring at least once in the ensemble using an equivalent ratio to that of the dynamic persistence (pers). Occurrence (occ) for each salt-bridge was defined as the ratio of docked poses to which the salt-bridge had occurred with respect to the total number of selected docked poses (=100) in the ensemble. Even a single occurrence of a salt-bridge in an ensemble was considered accountable in this analysis. Normalized frequency distributions of salt-bridge persistence (and occurrence) were plotted for the corresponding ensembles for further analyses.

Average Contact Intensities of Salt-Bridges
To take care of the variable degrees of intensities (effectively, ionic strengths) of atomic contacts for salt-bridges, contact intensities (CI) were defined and computed for each saltbridge as the actual number of inter-atomic contacts involved in a salt-bridge. In other words, CI is the number of ion-pairs to be found within 4Å between the two interacting side chains in a salt-bridge. Considering all unique combinations of possible salt-bridges (Arg ↔ Glu, Lys ↔Asp etc.), CI can vary from 1 to 4. Time series averages, defined as the average contact intensity (ACI) of these salt-bridges were then computed for each non-redundant salt-bridge from the MD simulation trajectories pertaining to each subject under test (RR1 CoV-2 , ZR1 CoV-2 , ZR1 CoV ). Together, persistence (pers) and average contact intensity (ACI) can be considered as 'ensemble descriptors of salt-bridges'. To account for their cumulative contribution in terms of salt-bridge strength and sustenance, a weighted persistence term (wpers(i)) was further defined for each ith non-redundant salt-bridge in an ensemble, as the direct product of pers(i) and ACI(i).
By definition, wpers would have a theoretical range of (0, 4). Furthermore, in order to draw a direct comparison between cumulative contact intensities (CCI) of the ionic bond networks formed across the different Spike-Furin interfaces in SARS-CoV (ZR1 CoV ) and SARS-CoV-2 (RR1 CoV-2 , ZR1 CoV-2 ), the following sum-over measure was defined, designed and implemented.
where, pers(i) and ACI(i) were defined as before (in Equation (5)) and N is the total number of non-redundant salt-bridges found at least once in a dynamic ensemble. In the current context, CCI can be considered as a measure of structural degeneracy [92] that is made to function as a global network descriptor raising a limiting threshold at the coronavirus Spike-Furin interfaces, allowing, absorbing and accommodating different alternative ionic bond network architectures as long as they are befitting to the task of catalyzing the Spike-S1/S2 cleavage.

Calculation of Structure-Based Equilibrium Thermodynamic Parameters (∆H, ∆S, ∆G) for the Spike-Furin Binding
As a mean to probe a highly likely event of 'enthalpy entropy compensation' associated implicitly with the the Spike-Furin interaction, structure-based equilibrium thermodynamic parameters (∆H, ∆S, ∆G) were calculated for the selected representative structure (RR1 CoV-2 ) along its entire MD simulation trajectory (300 ns) using the standalone (C++ with boost library) version (v.4) of FoldX (http://foldxsuite.crg.eu/, (accessed on 15 November 2021)) [93,94]. FoldX has its energy terms carefully parameterized by actual experimental data from protein engineering studies [93], which, together with its high computational speed, are definite edges over the traditional MM(PB/GB)SA approaches [95]. It is for these reasons that FoldX is slowly but surely taking over traditional approaches in structurebased thermodynamic calculations, particularly in the domain of protein engineering and stability analysis [96,97]. FoldX is built on a 'fragment-based strategy' that exploits the power of fragment libraries [98] in the same direction to that of the most compelling 'fragment assembly simulated annealing' approach in protein structure prediction attributed to David Baker and Rosetta [68]. Along with net free-energy changes (∆G binding/folding ) the advanced empirical force field of FoldX also returns a plethora of different favorable or disfavored transition enthalpic as well as entropic energy terms for proteins (folding) and PPI complexes (binding) directly from their high-resolution 3D coordinates (using full atomic description). To address the plausible 'enthalpy entropy compensation' in the current context, as enthalpic terms we included the favorable van der Waals (∆H vdwF ) and electrostatic (∆H electro , ∆H elec-kn ) contributions to free energy, as well as the disfavored van der Waals clashes (∆H vdw-clash , ∆H vdw-clash-backbone ); while, to account for the entropic costs, we included the entropic energies for backbone (T∆S mc ) and side chain (T∆S sc ) conformational changes. The choice of the terms was guided by well-just reviews and discerning follow-up studies on FoldX [96,99]. The enthalpic terms were further summed up according to the nature of forces giving rise to each.
To that end, structural snapshots were sampled at 10 ps intervals from the 300 ns MD simulation trajectory (RR1 CoV-2 ), resulting in 30,000 timestamps (or, structural snapshots). Then, for each snapshot, FoldX was run using the command AnalyseComplex with the complexWithDNA parameter set to 'false' and the relevant enthalpic (∆H vdw , ∆H elec ), entropic (T∆S mc , T∆S sc ) and free energy terms (∆G binding ), as detailed above, were computed for each run, indexed appropriately and stored. Time averages (denoted by angular braces '<>' throughout the paper) were computed for each individual term, along with its standard deviations (SD). For a second analysis, focusing purely on the 'entropy arrest' [100][101][102] presumably implicit to the Spike-Furin binding, conformational entropies for backbone (subscripted as 'mc') and side chains (subscripted as 'sc') were recorded (for each timestamp) independently for the FoldX-separated (unbound) receptor (∆S The Ramachandran plot (RP) [55] can be effectively used to probe transitions between disordered and relatively ordered protein structural elements. To achieve this, first, dynamic conformational ensembles need to be assembled, representative of two protein states, say, an unbound (highly disordered) and a bound (relatively ordered) state. Since, RP is essentially based on local steric clashes, the analysis can further be locally restricted to a 'contiguous region of interest' (or, a concerned structural patch), wherein residues are supposed to undergo a two-state transition (say, disorder-to-order). In the current study, this 'contiguous region of interest' is the FLCS Spike loop and the two protein states are the unbound (disordered) and Furin-bound (presumably more ordered) states of the SARS-CoV-2 Spike. The Ramachandran angles (Φ, ψ) can then be computed for residues comprising the concerned structural patch and plotted in an RP for each atomic model/frame in an ensemble state. The individual RPs can then be overlaid for 'disordered' or 'relatively ordered' states. In order to identify whether a connected structural patch (especially relevant for protein loops) supports a finite set of (restrained) structural conformations, the 2-dimensional Euclidean distance in the Φ-ψ vector space of RP was computed for each internal residue comprising the concerned structural patch. The distance between Ramachandran angles of ith and jth residues in Φ-ψ space can be interpreted in terms of the extent of local conformational mismatches between ith and jth residues and may be formally represented as follows: The maximum value of δ(i,j) represents the maximum spread of (Φ, ψ) within the concerned structural patch. It naturally follows that the concerned patch is structurally relatively more ordered when this spread is comparatively less and vice-versa. For some statistical reference, say that <δ> 9d is the 9th decile of δ(i,j). This indicates that 90% of the residues are separated by some distance less than <δ> 9d in the {Φ, ψ} space. Consequently, higher structural order is reflected through lesser values of <δ> 9d , which increases with the increasing disorder in the structure. We took three statistics from the distribution of these distances δ(i,j) obtained for each state: (i) the median (50 percentile, <δ> median ), (ii) the 3rd quartile (75 percentile, <δ> 3q ) and (iii) the 9th decile (90 percentile, <δ> 9d ), which are adequate to collectively render a comparison across states. Use of such a combined statistics instead of the maximum value of δ(i,j) also implicitly avoids possible outlier effects.
It is also necessarily important to estimate the local coherence of structural conformation within the concerned structural patch in general. Such a measure (metric) could be informative in terms of the local tendencies within (say) a protein loop to attain certain restrained local structural conformations. To estimate local structural coherence within the concerned structural patch, the average Euclidean distance (along with the standard deviation) of (Φ, ψ) points between consecutive residues comprising the concerned structural patch was designed and computed in the following way: where |δc| and σ(δ c ) are defined for each atomic model/frame falling within an ensemble (state) and the 'contiguous region of interest' (or concerned structural patch) is N-residues long. Relative lower values of |δ c | represents higher local structural coherence and σ(δ c ) presents a measure of dispersion in local structural coherence within a protein loop. Hence, these ordered parameters may be computed to collectively render a comparison of local structural coherence across states.

Quantifying a Change between Two N-Binned Frequency Distributions and Assessing Its Statistical Significance in Terms of χ 2
An χ 2 test (wherever applicable) was conducted to discriminate between two frequency distributions (say, that of an unbound and a bound state of a protein region spanning different contoured regions the RP) with the χ 2 -statistic being computed (for an N-bin model; df (degree of freedom) = N − 1) by the following equation.
where E(i) represents the frequency 'under the null hypothesis' expected for the ith bin, while O(i) denotes the actually-observed frequency for that same (ith) bin.

Structural Insight into the Furin Cleavage Mechanisms
From a structural, biochemical, as well biophysical perspective, it is crucial to unravel the reaction mechanisms and the involved enzyme kinetics of the Furin cleavage demanding quantum chemical and/or QM/MM (quantum mechanics/molecular mechanics) studies. To that end, a preceding step would be to explore the nature of binding involved in the Spike-Furin interactions via the disordered FLCS Spike loops in SARS-CoV and SARS-CoV-2, and then to compare them. To address this, here we adapted a combined approach of ensemble molecular docking and dynamic simulations followed by conformational analyses. As briefed in the Introduction, the coronavirus Spike structures are devoid of experimental atomic coordinates for the FLCS Spike , patch which is further revealed to be a "surface exposed disordered loop" [24] for the pre-fusion structure of SARS-CoV-2 Spike. There are as many as 44 experimentally-solved (exclusively cryo-EM) structures (Table S1) currently to be found in the PDB (see Section 2.1.2, Materials and Methods) for the pre-fusion form CoV/CoV-2 Spike, culled at a resolution of not worse than 3 Å. These coronavirus Spike structures are solved to serve different specific research objectives at various pH [103] and other varying physico-chemical conditions [19,62,64,65], therefore, often requiring stabilizing (engineered) mutations at the FLCS Spike patches [2,42]. It is almost intriguing that, even with stabilizing modulations, there's not a single cryo-EM structure that has any experimental (primary) data for its FLCS Spike patch. As a result, atomic coordinates of the 681 PRRAR 685 motif in the SARS-CoV-2 Spike (and, equivalent homologous sequence motifs from other coronavirus Spike), along with short flanking regions at both ends (adding up to a stretch of 10-12 amino acids), are missing experimentally [43] and, hence, require computational modeling ( Figure S2, Supplementary Materials). To have such disordered loops appears quite characteristic of the SARS-CoV-2 Spike trimer, which contains a total of four missing stretches of roughly similar lengths at strategic positions, adding up to 42 amino acids in PDB ID: 6XR8. This is, perhaps, reasonable given the extended internal packing involved in the trimerization of the monomeric S units. The highly localized positive charge cloud concentrated over the arginine-rich 681 PRRAR 685 region of the loop further boosts the said probability, as this would instigate electrostatic self-repulsion of the loop adding to its conformational instability. The presence of Proline (P681), the well-known helix breaker, within the SARS-CoV-2 FLCS Spike , plausibly adding to the loop-disorder, has further been suggested to improve the protease active site accessibility for Furin, as well as for other proteases [104].

More the Arginines, More the Disorder' in the FLCS Spike Activation Loops
In order to have a general idea as to how the presence of polybasic sequences (arginines) influence the evolutionarily-manifested disorder in the FLCS Spike , we started the proceedings with an evolutionary analysis of the loop-disorder on compiled coronavirus Spike sequences. There are several AI (artificial intelligence)-trained sequence-based disorder predictors [105][106][107] that return residue-wise disorder probability scores, which are trained primarily on evolutionary sequence data (e.g., mutational co-variance matrices). These sequence-based disorder predictors have their known limits in accuracy [108,109], for not explicitly accounting for the actual three-dimensional structural dynamics of the protein(s)/peptide(s), but, can serve as a good first test of the comparative FLCS Spike loopdisorder among its close evolutionary homologs. A representative set of Spike structures (CoV/CoV-2) were culled (resolution ≤ 3 Å), accumulated (see Section 2.1, Materials and Methods), and their UNIPROT sequences (in FASTA format) derived from proteomics data [67], were extracted from corresponding entries in the Protein Data Bank [43]. The full-length Spike sequences were aligned using MUSCLE [110] and those containing gap(s) at their aligned position(s) homologous to the 681 PRRAR 685 pentapeptide motif (FLCS Spike , SARS-CoV-2) were removed. The final set consisted of all unique and non-redundant pentapeptide sequence motifs (PSGAG, PGSAS, PASVG, PSRAG, PSRAS, PRRAA, PRARR) to be found within the FLCS Spike spanning the entire plethora of coronavirus Spikes. The full-length sequences were then run in the PrDos web-server [106], which combines local sequence information and homology templates using iterative (psi) BLAST. Since the loop-disorder is highly contextual to its neighboring/flanking sequences and to that of the 'highly conserved' trimeric Spike structures (C α -RMSD: 2.3 Å), the default setting of 'template-based' prediction (with the PrDos-FPR (False Positive Rate) set to 5%) was retained as 'turned on'. For all representative full-length Spike sequences, the disorder probabilities of the FLCS Spike regions (i.e., the patches originally missing in the corresponding experimental structures) were unanimously found to increase sharply in the N→C direction around the pentapeptide motifs trending to local maximums ( Figure 1). The regions, consequently, mapped to the ascending halves of the corresponding curve-humps (Figure 1), which should effectively mean 'growing disorders' associated with the FLCS Spike . Interestingly, the mean disorder probabilities show an unmistakable increasing trend (0.45 for PSGAG →→ 0.55 for PRARR) with the successive gradual incorporation of arginines in the pentapeptide sequence motif. PRRAA, PRARR) to be found within the FLCSSpike spanning the entire plethora of coronavirus Spikes. The full-length sequences were then run in the PrDos web-server [106], which combines local sequence information and homology templates using iterative (psi) BLAST. Since the loop-disorder is highly contextual to its neighboring/flanking sequences and to that of the 'highly conserved' trimeric Spike structures (C α -RMSD: 2.3 Å), the default setting of 'template-based' prediction (with the PrDos-FPR ( False Positive Rate) set to 5%) was retained as 'turned on'. For all representative full-length Spike sequences, the disorder probabilities of the FLCSSpike regions (i.e., the patches originally missing in the corresponding experimental structures) were unanimously found to increase sharply in the N→C direction around the pentapeptide motifs trending to local maximums ( Figure  1). The regions, consequently, mapped to the ascending halves of the corresponding curve-humps (Figure 1), which should effectively mean 'growing disorders' associated with the FLCSSpike. Interestingly, the mean disorder probabilities show an unmistakable increasing trend (0.45 for PSGAG →→ 0.55 for PRARR) with the successive gradual incorporation of arginines in the pentapeptide sequence motif.

Filling Up the Voids in the Spike Structures: the FLCSSpike Disordered Ensembles
The disorder, intrinsic to the FLCSSpike, along with the most likely event of 'disorderto-order transition' upon binding to Furin (see Introduction) makes the Spike-Furin interaction an interesting mechanistic chapter both in the context of coronavirus evolution and also in the general framework of proteolytic cleavages [45][46][47]. Particularly intriguing (and, counter-intuitive almost) is the fact that the bait that the SARS-CoV-2 Spike uses to attract their specific and dedicated host-proteases (e.g., Furin for SARS-CoV-2 Spike) are themselves structurally highly nonspecific or conformationally varied by virtue of their

Filling Up the Voids in the Spike Structures: The FLCS Spike Disordered Ensembles
The disorder, intrinsic to the FLCS Spike , along with the most likely event of 'disorder-toorder transition' upon binding to Furin (see Introduction) makes the Spike-Furin interaction an interesting mechanistic chapter both in the context of coronavirus evolution and also in the general framework of proteolytic cleavages [45][46][47]. Particularly intriguing (and, counter-intuitive almost) is the fact that the bait that the SARS-CoV-2 Spike uses to attract their specific and dedicated host-proteases (e.g., Furin for SARS-CoV-2 Spike) are themselves structurally highly nonspecific or conformationally varied by virtue of their intrinsic disorder. Again, in the bound form, the concerned disordered loop (FLCS Spike , SARS-CoV-2) serves as the bait to recruit Furin subsequent to which it needs to be jammed into a much more restricted set of conformations for its efficient (proteolytic) cleavage [34]. The collective dynamics of the Spike-Furin interaction (SARS-CoV-2) would, thus, naturally lead to a conformational selection of the disordered FLCS Spike in its Furin-bound state. To sustain such an energetically costly 'conformational selection', an energy-source is, hence, required to compensate for such a high entropy-loss of the disordered FLCS Spike ensemble. Given the physico-chemical nature of long-and short-range forces sustaining the native bio-molecular environment, such compensating energies would, naturally, be enthalpic. Thus, an enthalpy entropy compensation seems necessary. Thus, the obvious question that follows next would be, 'what might be the source of this enthalpic energy to compensate for such a high entropy-loss?' To address this, first and foremost, the missing disordered stretches in the representative SARS-CoV-2 Spike pre-fusion structure (PDB ID: 6XR8) were ensemble-modeled (see, Section 2.2, Materials and Methods) using its full-length amino acid sequence, derived from proteomic data. Considering that the SARS-CoV-2 FLCS Spike (6XR8) is only a short stretch of 12 amino acid residues ( 677 QTNSPRRARSVA 689 ), 500 uniquely varied loop-conformations were sampled for the missing patch, eventually leading to the many 'all-atom' trimeric SARS-CoV-2 Spike atomic models. Such an ensemble can essentially and adequately represent the conformational variations in the disordered SARS-CoV-2 FLCS Spike (missing) patch. During the entire course of this modeling, the atoms that were already experimentally solved were retained as they were. The average C α -RMS deviations, upon aligning the models pairwise in PyMol (https://pymol.org/2/ (accessed on 15 November 2021)), were found to be appreciably higher  To that end, buried surface areas (BSA) (see Section 2.3.4.1, Materials and Methods) were computed for all residues in each docked pose and the interfacial residues (BSA =0, see Section 2.8) were identified. In addition, the BSA values for residues pertaining to the (i.e., BSA PRRAR >0), the docked pose was considered plausible and worthy to be carried for the next round (i.e., filtered in). All three chains of the Spike trimer (harboring this FLCS Spike ) were treated as equally likely to serve as the docking site. This simple technique ensured that all filtered docked poses (7184 of them) possessed the desired Spike-Furin interface harboring the FLCS Spike [52]. It was interesting to note that, in the top-ranked docked poses (ranked by Cluspro's internal score), for almost all ensemble-docked batches (in 497 out of 500 cases) Furin showed an unmistakable preference to dock at the FLCS Spike (i.e., BSA PRRAR >0) even with this unbiased ab initio blind docking protocol. One common intuition to jam the high-entropy FLCS Spike disordered loop (SARS-CoV-2) into the Furin bound enzyme-substrate complex would be to stabilize the highly localized positive charge cloud electrostatically. It is rather well known that electrostatic interactions play an important role in the dynamic sustenance and transitions associated with protein disorder [111][112][113][114][115][116]. To that end, the suggested electrostatic stabilization of the FLCS Spike would be greatly benefited by the structural proximity of oppositely charged (i.e., anionic) amino acids (coming from Furin) by triggering the formation of Spike-Furin interfacial salt-bridges. The plausibility of the 'salt-bridge hypothesis' is further enhanced by the presence of a surface groove around the Furin [34] catalytic triad (D153, H194, S368) that appears to be befitting to FLCS Spike both in terms of shape and electrostatics ( Figure 2). This groove serves as a potentially attractive docking site evolutionarily for FLCS Spike patches. A detailed independent survey of the two-domain Furin structure (PDB ID: 1P8J, chain A, see Section 2.1.1, Materials and Methods) further reveals that it has an abundance of anionic residues (30 aspartates and 25 glutamates), which are spread rather homogeneously all over its catalytic and P domains (see Introduction). While the majority (58%) of these residues are either partially or completely exposed to the solvent (bur ≥ 0.05) (see Section 2.3.4.1, Materials and Methods), those that are part of the catalytic domain ( Figure 2) are of interest to the given context. Note that even partially exposed anionic residues with lesser exposure (say, 0.05 < bur ≤ 0.15) in vicinity of the triad can, in principle, seldom flip into a more extended and exposed conformation during the course of the Spike-Furin binding (till the Spike-S1/S2 cleavage). Hence, these may also potentially contribute to stabilize the proposed dynamically interchangeable ionic bond networks at the Spike-Furin interface (SARS-CoV-2). To have a closer look into these 'residues of interest', first, a subset of anionic residues in Furin was assembled, where at least one heavy side chain atom from each residue is located (in its crystalline equilibrium state) within a sphere of radius 20 Å centered at the side chain centroid of the Furin catalytic triad (D153, H194, S368). This subset contained 25 residues, most of which (18 out of 25) are 'partially exposed' or 'exposed' (Table S3, Supplementary Materials) and, hence, should be amenable and approachable for formation of interfacial salt-bridges. Again, some of these partially or completely exposed anionic residues (viz., D191, D228, E236, E257, D258, D259, D355, D306; see Table S3) are rather proximal to the catalytic triad (within 15 Å from its side chain centroid). Together, these constitute a non-rigid set of anionic Furin residues that potentially hold the FLCS Spike at the catalytic triad, with the extent of stability required to encompass the cleavage, by virtue of dynamically engaging themselves into ionic bond interactions with the FLCS Spike -arginines coming from 681 PRRAR 685 . The positioning of these anionic side chains also appears to be strategic (Figure 2), such that, cumulatively, the salt-bridges almost never get dissolved during the entire course of the Spike-Furin binding. Such dynamic networks of potentially interchangeable ionic bonds could also be a prime source for the enthalpy required to compensate for the presumably high entropic loss of the FLCS Spike upon binding to Furin. The process would also necessarily accompany a concomitant transition of the FLCS Spike from a disordered (unbound) to a relatively ordered (Furin bound) state. Recent findings also reveal parallel evolution of a Q675H variant [117] of the SARS-CoV-2 Spike wherein the mutated Histidine further imparts hydrogen bond formation at the Spike-Furin interface, and, thereby exposes the FLCS Spike even more to the Furin binding pocket with optimized directionality of the 681 PRRAR 685 arginines. To demarcate the proposed binding pocket, a semi-opaque surface representation (colored according to atom types) is chosen over and above a backbone cartoon. The catalytic triad (D153, H194, S368) is presented as 'balls and sticks' and are labeled in 'white' over and above the cartoon display. Among all exposed and partially exposed anionic residues (see Table S3, Supplementary Materials), the ones that are in close vicinity to the dockable groove surface are also shown in 'balls and sticks' and are further labeled in 'black'. These residues are amenable to form interfacial saltbridges with the FLCSSpike-arginines (in 681PRRAR685) and collectively add to the potential of forming dynamically interchangeable ionic bond networks at the Spike-Furin interface.

Validations and Cross-Validations of the 'Salt-Bridge Hypothesis'
3.6.1. In RR1CoV-2 In order to test the validity of the 'salt-bridge hypothesis', an all-atom explicit water MD simulation (of 300 ns) was run (see Section 2.4, Materials and Methods) for the top ranked Spike-Furin docked pose (RR1CoV-2), and the trajectories were tested for the presence of salt-bridges (see Section 2.5.1, Materials and Methods). Prior to the production run, the structural template (RR1CoV-2) was undertaken rigorous rounds of energy-minimization with temperature equilibration (see Section 2.4, Materials and Methods). Structural snapshots were sampled at 10 ps (regular) intervals leading to 30,000 snapshots (or time-stamps) accounting for the 300 ns long MD simulation trajectory and salt-bridges were extracted from each snapshot from its interfacial contact map (see Section 2.5.1, Materials and Methods). To set a reference, ionic bond networks were also surveyed in the pool of 100 top ranked static docked poses (Table S2). In both ensembles, dynamic and static, the identification of salt-bridges were followed by their individual survey and also by a statistical analysis of dynamic or static 'ensemble descriptors of salt-bridges' (e.g., Figure 2. The proposed dockable surface groove for FLCS Spike in Furin, proximal to its catalytic triad. To demarcate the proposed binding pocket, a semi-opaque surface representation (colored according to atom types) is chosen over and above a backbone cartoon. The catalytic triad (D153, H194, S368) is presented as 'balls and sticks' and are labeled in 'white' over and above the cartoon display. Among all exposed and partially exposed anionic residues (see Table S3, Supplementary Materials), the ones that are in close vicinity to the dockable groove surface are also shown in 'balls and sticks' and are further labeled in 'black'. These residues are amenable to form interfacial salt-bridges with the FLCS Spike -arginines (in 681 PRRAR 685 ) and collectively add to the potential of forming dynamically interchangeable ionic bond networks at the Spike-Furin interface.

Validations and Cross-Validations of the 'Salt-Bridge Hypothesis'
3.6.1. In RR1  In order to test the validity of the 'salt-bridge hypothesis', an all-atom explicit water MD simulation (of 300 ns) was run (see Section 2.4, Materials and Methods) for the top ranked Spike-Furin docked pose (RR1 CoV-2 ), and the trajectories were tested for the presence of salt-bridges (see Section 2.5.1, Materials and Methods). Prior to the production run, the structural template (RR1 CoV-2 ) was undertaken rigorous rounds of energyminimization with temperature equilibration (see Section 2.4, Materials and Methods). Structural snapshots were sampled at 10 ps (regular) intervals leading to 30,000 snapshots (or time-stamps) accounting for the 300 ns long MD simulation trajectory and salt-bridges were extracted from each snapshot from its interfacial contact map (see Section 2.5.1, Materials and Methods). To set a reference, ionic bond networks were also surveyed in the pool of 100 top ranked static docked poses (Table S2). In both ensembles, dynamic and static, the identification of salt-bridges were followed by their individual survey and also by a statistical analysis of dynamic or static 'ensemble descriptors of salt-bridges' (e.g., persistence/occurrence and average contact intensities; see Section 2.5.1, Materials and Methods). Together, these parameters can effectively be used to decipher and interpret the complex nature of salt-bridge dynamics associated with protein disorder transitions [53,54]. One of the trademark features of salt-bridge dynamics associated with disorder transitions is a trade-off or an optimal balance between transience and persistence of salt-bridges, which serves to sustain a disordered state while a shift of this balance leads to disorder transitions [53,54].
To that end, explicit lists of ionic bond interactions independently for the dynamic (Table 1), static ensembles (Table S4, Supplementary Materials) were sorted based on their persistence (dynamic) and occurrence (static) (see Section 2.5.1.1, Materials and Methods). Given that the static ensemble consisted of near-native as well as not so near-native poses ( Figure S4, Supplementary Materials), it was interesting to find that all (or almost all) Spike-Furin interfacial salt-bridges that had occurred (at least once) in the static ensemble (Table S4) were also found (at least ephemerally) in the dynamic ensemble ( Table 1). The persistence/occurrence profiles collectively reveal that the Spike-Furin interaction has a preferential set of ionic bonds in terms of forming dynamically interchangeable salt-bridges which may vary in their occurrence among the plausible docked poses. Since, RR1 CoV-2 presents possibly the most preferred conformation of the FLCS Spike in its Furin bound state (in SARS-CoV-2), the dynamically sustained high persistence arginine salt-bridges (in 681 PRRAR 685 ) found in RR1 CoV-2 are the most plausible, frequently forming Spike-Furin interfacial salt-bridges ( Figure S4) among alternatives.
The first noticeable observation in the dynamic ensemble (pertaining to RR1 CoV-2 ) was that the three arginines (R682, R683, R685) in the 681 PRRAR 685 almost always remain engaged in dynamically interchangeable ionic bonds formed with anionic side chains coming from the host Furin. These amenable anionic side chains (D191, E230, D233, E236, D259, D264, D306, see Table S4) are indeed physically proximal to the catalytic triad (Figure 3), nicely fitting in the FLCS Spike into the proposed dockable surface groove. While most (if not all) of them fall into the non-rigid 'expected' set (Table S3), they should collectively present some correlated movements with FLCS Spike in order to remain conformationally viable for dynamically stable ionic bond networks. Further, as reflected from the distribution of salt-bridge persistence (see Table 1), the three arginines in 681 PRRAR 685 were often involved in multiple and interchangeable ionic bonds. In other words, the same arginine (target) in 681 PRRAR 685 can simultaneously be in contact (at a given instance) with more than one negatively charged residue coming from the host Furin (neighbor). This will lead to the formation of non-disjoint sets of target-neighbor ionic bond pairs (or, salt-bridges) for the same target. In other words, sets of salt-bridges containing distinct anionic partners for a given target arginine ( 681 PRRAR 685 ) would, thus, be non-disjoint. This implies that the persistence values of the arginine salt-bridges (in 681 PRRAR 685 ) may add up to more than unity (see Table 1). In fact, dynamic interchangeability of counter-ions are common characteristic features of salt-bridges formed at disordered protein regions and/or [53,54] disorder-globular interfaces [90,114], wherein the formed ion pairs keep changing their counter-ionic partners [53]. Such collective dynamics results in salt-bridges of varying persistence and multiplicity across the trajectory, including persistent, as well as transient, salt-bridges [53,54,113,115]. Transient and/or unfavorable salt-bridges have been revealed to be functionally optimized in proteins [53,116] and are often found on enzyme surfaces [116] as well as on strategic locations spread around extended disordered protein regions [53]. For the latter case, one of the evolved key mechanisms towards achieving this functional optimization is to make their charged side chains often amenable to proximally approached ordered protein interactors. The transient nature facilitates the exchange of their counter-ionic partners, thereby triggering the switch from intra-to inter-molecular (interfacial) salt-bridges. The multiplicity (or promiscuity [53]) in the choice of the counterionic partner in a salt-bridge can further be chemically and electrochemically rationalized from the bifurcated nature of side chain groups with degenerate charge centers in four out of the five charged amino acids (guanidium + : Arg; carboxylate -: Asp, Glu; imidazol + : His+) in proteins. However, in spite of this observed multiplicity, each of the three arginines in 681 PRRAR 685 seemed to have their own preference for particular anionic partners: E230 for R682 (pers: 0.93), E236 for R683 (pers: 0.85) and D306 for R685 (pers: 0.90) (see Figure 4). These three key anionic residues involved in the highest persistent arginine salt-bridges (in 681 PRRAR 685 ) were further envisaged to form a combined molecular entity (let us call it the 'anionic triad') made up of discrete molecular components (anionic side chains). The anionic triad and the catalytic triad (D153, H194, S368) remained proximal and approachable (average centroid-to-centroid distance between side chain atoms: 13.16 Å, SD: 0.72) throughout the 300 ns trajectory (RR1 CoV-2 ) with the display of correlated movements, similar to an ordered pair ( Figure 5A). To analytically confirm the visually apparent correlated movement, the inter-triad angle (ε), defined as the planer internal angle subtended by the two position vectors originated from the Furin molecular centroid that connect the centroids of the two triads ( Figure 5B), was surveyed across the trajectory. The inter-triad angle was found to be strictly constrained (<ε> = 45.1 • , SD = 4.7) having an apparently bi-modal distribution with two modes at~42.1 • and~52.8 • . Together, these results effectively portray 'correlated movements' of the two triads (catalytic and anionic) which further indicate the dynamically sustained mutual preference of the FLCS Spike and its dockable surface groove on Furin, and that their binding and subsequent cleavage occurs in close vicinity and supervision of the catalytic triad. This, in turn, effectively supports the salt-bridge hypothesis.     Other interfacial salt-bridges barring those involving the three arginine in 681 PRRAR 685 (FLCS Spike ) were also surveyed in the same details. Among these, one salt-bridge, 'E654 Spike ↔ R193 Furin ', (see Table 1) was noticeable both in terms of its high persistence (pers: 0.65) and the opposite trend in the distribution of its charge centers (negative in Spike and positive in Furin, for a change) in contrast to the arginine salt-bridges (in 681 PRRAR 685 ). Other salt-bridges with brief/instantaneous occurrences (pers < 0.1) could be considered ephemeral. The collective interplay of these fleeting salt-bridges triggers a 'transient dynamics' in disordered protein regions that is indispensable in retaining their flexibility [53] and is also pivotal towards imparting a critical behavior in associated disorder transitions among multiple self-similar fractal states [54]. The presence of transient salt-bridges (pers < 0.1) in significant fractions (70.5% in the dynamic ensemble of RR1 CoV-2 , see Table 1) signals for relatively ordered metastable Furin-bound states of the FLCS Spike , which together retain enough flexibility (see Video S1, Supplementary Materials, https://www.youtube.com/watch?v=wsHKpr9gZ9E (accessed on 15 November 2021)) to favor the Spike-S1/S2 proteolytic cleavage [35,37]. It effectively portrays the constrained inter-triad distance (see Section 3.6.1) between the two triads (anionic and catalytic). Panel (B) plots the normalized frequency distribution of the inter-triad angle (ε) (see Section 3.6.1) which is also found to be constrained. The inset of panel B attempts to pictorially define the inter-triad angle. For purpose of discrimination, residues forming the anionic and the catalytic triads are labeled in font colors 'blue' and 'black' while the 681PRRAR685 arginine are labeled in 'yellow'. Together, panel A and B demonstrates 'correlated movements' of the two triads (catalytic and anionic).
Other interfacial salt-bridges barring those involving the three arginine in 681PRRAR685 (FLCSSpike) were also surveyed in the same details. Among these, one saltbridge, 'E654Spike ↔ R193Furin', (see Table 1) was noticeable both in terms of its high persistence (pers: 0.65) and the opposite trend in the distribution of its charge centers (negative in Spike and positive in Furin, for a change) in contrast to the arginine salt-bridges (in 681PRRAR685). Other salt-bridges with brief/instantaneous occurrences (pers < 0.1) could be considered ephemeral. The collective interplay of these fleeting salt-bridges triggers a 'transient dynamics' in disordered protein regions that is indispensable in retaining their flexibility [53] and is also pivotal towards imparting a critical behavior in associated disorder transitions among multiple self-similar fractal states [54]. The presence of transient and the catalytic (red) triads (see Section 3.6.1) and joins them by yellow dashed-lines along the 300 ns MD simulation trajectories obtained from RR1 CoV-2 . It effectively portrays the constrained inter-triad distance (see Section 3.6.1) between the two triads (anionic and catalytic). Panel (B) plots the normalized frequency distribution of the inter-triad angle (ε) (see Section 3.6.1) which is also found to be constrained. The inset of panel B attempts to pictorially define the inter-triad angle. For purpose of discrimination, residues forming the anionic and the catalytic triads are labeled in font colors 'blue' and 'black' while the 681 PRRAR 685 arginine are labeled in 'yellow'. Together, panel A and B demonstrates 'correlated movements' of the two triads (catalytic and anionic).

In ZR1 CoV-2
As a cross-validation of the 'salt-bridge hypothesis', an independent guided docking was performed in Zdock using IRaPPA re-ranking (see Section 2.3.3, Materials and Methods) and the returned top ranked docked pose (ZR1 CoV-2 ) was simulated in yet another independent MD simulation run for 100 ns (see Section 2.4, Materials and Methods). Following on, structural snapshots were sampled at 10 ps intervals (likewise to that of RR1 CoV-2 ) leading to 10,000 snapshots. Salt-bridges were then extracted from each snapshot from its interfacial contact map (see Section 2.5.1, Materials and Methods) and further sorted based on their dynamic persistence. A second sorted list, equivalent to that of RR1 CoV-2 (Table 1), was procured for ZR1 CoV-2 (Table S5, Supplementary Materials). Counter-ionic partners in most high persistence arginine salt-bridges (in 681 PRRAR 685 ) were preserved in both (dynamic) ensembles (RR1 CoV-2 , ZR1 CoV-2 ) pairing either with the same (R682 Spike ↔ E230 Furin , pers: 0.93, 0.99, respectively) or altered partners (R683 Spike ↔ E236 Furin , pers: 0.85 in RR1 CoV-2 ; R685 Spike ↔E236 Furin , pers: 0.97 in ZR1 CoV-2 ). Noticeably, 306-Asp (Furin) in RR1 CoV-2 (R685 Spike ↔ D306 Furin , pers: 0.90) was replaced by 259-Asp (Furin) in ZR1 CoV-2 (R683 Spike ↔D259 Furin , pers: 0.49). This suggests that there might be multiple plausible conformations (docked poses) mapping to unique (i.e., non-degenerate) yet befitting ionic bond network archetypes all of which could enable the Spike-Furin binding. In fact to have such essential and nominal degrees of freedom generally in bio-molecular fitting (including self-fitting or folding) allows the system to breathe and is of no great surprise (at least) in protein science, often directed by satisfying optimized global physico-chemical constraints while retaining their structural degeneracy. The revelation of secondary and super-secondary structural motifs [118], packing motifs within native globular protein interiors [77], composite salt-bridge motifs within proteins and protein complexes [90], as well as alternative packing modes potentially leading to the same native protein fold (or a befitting hydrophobic core) [119,120] are all instances of the phenomena.
The Spike-Furin interfacial salt-bridges (in both RR1 CoV-2 and ZR1 CoV-2 ) generally varied in terms of their contact intensities (CI) (see Section 2.5.1.2, Materials and Methods) while the high persistence salt-bridges (formed near the Furin catalytic triad) were, by and large, densely connected throughout their entire simulation runs (Figures 6 and S5, Supplementary Materials), hitting appreciably high ACI values in most cases (Tables 1 and S5). Most high persistence salt-bridges also frequently retained maximally connected (CI = CI max = 4) closed ionic bond (bipartite) motifs between bifurcated oppositely charged side chain groups in both subjects (insets of Figure 6 and Figure S5). RR1CoV-2 and ZR1CoV-2 also had great resemblance in their frequency distribution profiles for the Spike-Furin interfacial salt-bridge persistence(s) (unweighted as well as weighted: see Section 2.5.1.1, Materials and Methods) [53] (Figure S6, Supplementary Materials). To quantify this resemblance, the entire theoretical range of persistence (pers) [0,1] was partitioned into 20 equally spaced bins, and for each ensemble, the normalized fre- RR1 CoV-2 and ZR1 CoV-2 also had great resemblance in their frequency distribution profiles for the Spike-Furin interfacial salt-bridge persistence(s) (unweighted as well as weighted: see Section 2.5.1.1, Materials and Methods) [53] (Figure S6, Supplementary Materials). To quantify this resemblance, the entire theoretical range of persistence (pers) [0,1] was partitioned into 20 equally spaced bins, and for each ensemble, the normalized frequencies of salt-bridges falling within each persistence-bin (of bin-width: 0.05) were computed. A similar approach was adapted for weighted persistence (wpers) (see Section 2.5.1.1, Materials and Methods) which maps to a theoretical range of [0,4], equaling a bin-width of 0.2 for a 20-bin model. The Pearson's correlation coefficient (r P ) between these obtained normalized frequencies from the two ensembles (RR1 CoV-2 , ZR1 CoV-2 ) was found to be 0.97 for persistence ( Figure S6A) and 0.93 for weighted persistence ( Figure S6B) (p-value < 0.00001 for both which is significant at 99.9% level). The same correlation (Pearson's) was found to be 0.66 (p-value: 0.001445, significant at 99.9% level) between the frequency distribution profiles (RR1 CoV-2 , ZR1 CoV-2 ), which are plotted for average contact intensities (ACI) of ionic bonds formed at the Spike-Furin interface ( Figure S6A, inner set).

In ZR1 CoV , the Baseline
As introduced in Section 2.3.3 (Materials and Methods), ZR1 CoV (the representative Spike-Furin interaction in SARS-CoV, 2002/2003) served as the baseline for the Spike-Furin interaction in SARS-CoV-2. As has been already discussed (see Section 2.2, Materials and Methods), the FLCS Spike patch in ZR1 CoV ( 664 SLLRSTS 670 , originally missing in 7AKJ) is much shorter than its homologous missing patch in 6XR8 ( 677 QTNSPRRARSVA 689 ), mapping to their corresponding degrees of disorder (higher in the later). The relative composition of the two patches and their pairwise alignments ( Figure S1) further support the observation that, indeed, a lesser degree of disorder is expected for the concerned patch in SARS-CoV than that in SARS-CoV-2. Most notably, the third arginine (R685) of 681 PRRAR 685 in CoV-2 FLCS Spike is evolutionarily conserved (as R667) also in CoV FLCS Spike (see Section 2.2) ( Figure S1). A closer look into the pairwise alignments of the two sequences ( Figure S1) also reveals the strategic insertion of a dipeptide unit ( 681 PR 682 ) followed by two non-synonymous replacements (L665→R683, L666→A684) in the disordered activation loop (see Introduction) of the CoV-2 Spike (6XR8) with respect to its ancestral homologous Spike in CoV (7AKJ). With this background, when we had a good look at the MD simulation trajectories of ZR1 CoV , we found something very insightful. The conserved arginine (R667) in 7AKJ was found to cover a lot of space in ZR1 CoV throughout its entire dynamic trajectory (100 ns) together with an accompanying nearby lysine (K672), feeling up for the absence of the other arginines (in reference to 681 PRRAR 685 , CoV-2) leading to the formation of a homologous dynamically persistent network of interfacial salt-bridges in CoV. The conserved arginine, R667 alone seemed to engage as many as three counter-ionic Furin side chains (E230, E257, D258) forming two high persistent (pers: 0.62, 0.95) and one moderately persistent (pers: 0.28) salt-bridges (see Table S6, Supplementary Materials), while the neighboring lysine (K672) was found in pair with D258 (Furin) with a persistence of 0.58. Notably, D258 among the Furin anionic residues, shared persistent salt-bridges simultaneously with K672 and R667 (see Table S6). In contrast to the 681 PRRAR 685 (CoV-2 Spike), here, in context to 664 SLLRSTS 670 (CoV Spike), the absence of the long and electrostatically repelling neighboring arginine side chains offers R667 the physical space to remain substantially flexible to be simultaneously involved in multiple high persistence salt-bridges. The formation of these interfacial salt-bridges are favored by the proximal looping of the flanking lysine (see Figure S7, Supplementary Materials). The two nonadjacent basic residues (R667, K672) together serves to sustain the homologous Spike-Furin interface in CoV by the formation of several dynamically persistent ionic bonds. To that end, if the emergence of the 'PRRAR' motif (in CoV-2 Spike) is to be considered a solution that is optimized for the most efficient Spike-S1/S2 cleavage at the Spike-Furin interface, the interplay of R667 and K672 in context to the homologous FLCS Spike in CoV appears to be analogous to the event of structural relaxation in mutant protein cores. In other words, the way R667 and K672 cover up the physical space in SARS-CoV to sustain the Furin and catalyze the Spike-S1/S2 cleavage appear to resemble with the collective conformational readjustments of neighboring residues, filling up for packing defects and/or cavities/holes introduced upon hydrophobic substitutions/truncation in native protein core(s) [121,122]. Having said that, the total number of non-redundant Spike-Furin interfacial salt-bridges were found to be literally doubled by the incorporation of the additional arginines (PRRAR) in CoV-2 (17: RR1 CoV-2 , 20: ZR1 CoV-2 ) compared to the baseline (9: ZR1 CoV ) in CoV. In addition, the ionic bond networks seemed to be certainly denser and more intense in CoV-2 with a CCI (see Section 2.5.1.2, Materials and Methods) of 7.65, 10.16 in RR1 CoV-2 , ZR1 CoV-2 compared to 5.48 in CoV (ZR1 CoV ). These are signatures of evolutionary optimization (CoV → CoV-2) at the FLCS Spike with regard to Furin binding and cleavage.

Enthalpy Entropy Compensation Involved in the Spike-Furin Interaction
The 'salt-bridge hypothesis' (see Section 3.5) was proposed based on the intuition that the disordered high-entropy FLCS Spike loop must get jammed into a restricted set of 'allowed' conformations into Furin that are favorable for the Spike-S1/S2 cleavage. Such a conformational selection should necessarily accompany electrostatic stabilization of the highly localized positive charge cloud on 681 PRRAR 685 (FLCS Spike ). The Spike-Furin interaction, thus, implicitly speaks in favor of a 'disorder-to-order transition' that needs an enthalpic source to compensate for the high entropic cost (loss) intrinsic to the supposed transition. One prime source for such enthalpic compensation is salt-bridges, for they impart local rigidity in proteins by jamming conformations [90]. To that end, the 'salt-bridge hypothesis' was only found more plausible by the detection of a potentially dockable surface groove to fit in the FLCS Spike near the Furin catalytic triad (see Section 3.5), surrounded by exposed anionic residues that seemingly possess the potential to the form salt-bridges with the FLCS Spike arginines (in 681 PRRAR 685 ). All structural dynamics analyses (see Section 3.6) unanimously and collectively reveal that the FLCS Spike fits nicely and stably into the proposed dockable groove, stabilized by the formation and sustenance of dynamically interchangeable interfacial salt-bridges (validated in RR1 CoV-2 , and, crossvalidated in ZR1 CoV-2 ). Together, these results speak in favor of an 'enthalpy entropy compensation' intrinsic to the transition from the disordered (free) to the relatively ordered (bound) state of the SARS-CoV-2 FLCS Spike . To further confirm this thermodynamic phenomenon, we computed actual structure based all atom thermodynamic parameters by FoldX (see Section 2.6, Materials and Methods) for the respective MD simulation trajectories pertaining to RR1 CoV-2 , ZR1 CoV-2 (300 ns, 100 ns) and compared the relevant transition enthalpic (∆H vdw , ∆H elec ) and entropic (∆S mc , ∆S sc ) terms associated with the Spike-Furin binding/complexation. Both molecules in their integral forms were considered for the FoldX energy calculations. To set up an appropriate baseline, ZR1 CoV was also included in the calculations and comparison. All relevant transition enthalpic and transition entropic terms individually as well as collectively had retained ( Table 2) a counter trend (∆H vdw/elec < 0, ∆S mc/sc > 0) throughout the entire trajectories of all subjects, RR1 CoV-2 (Figure 7), ZR1 CoV-2 , ( Figure S8A, Supplementary Materials) as well as the baseline, ZR1 CoV ( Figure S8B). This is suggestive of enthalpy-entropy compensations accompanying both Spike-Furin binding events (in CoV-2, CoV). However, as is reflected from the relative magnitudes of the ∆H, ∆S terms (Table 2, Figure 7, Figure S8), the binding in CoV-2 (RR1 CoV-2 , ZR1 CoV-2 ) is attributed with higher entropic costs of the event and, therefore, with a concomitant higher degree of enthalpic compensation than that in CoV (ZR1 CoV ). Table 2. Structure-based equilibrium thermodynamic parameters (FoldX) obtained from the MD simulation trajectories of RR1 CoV-2 , ZR1 CoV-2 , ZR1 CoV representing the Spike-Furin binding in SARS-CoV-2 (subject, cross-validation) and SARS-CoV (baseline) respectively. Time-series averages (in kcal mol −1 ) over the corresponding trajectories for each subject (RR1 CoV-2 , ZR1 CoV-2 , ZR1 CoV ) are given for each energy term (including enthalpic, entropic, free-energy terms) with their standard deviations given in parenthesis.  As a second test, the individual main chain and side chain conformational entropies for the receptor (Spike) and the ligand (Furin) were also surveyed in RR1CoV-2, ZR1CoV-2, ZR1CoV (throughout their respective trajectories) and compared between their unbound (∆  Table S7, Supplementary Materials), confirming the 'entropy arrest' (refer to Section 2.6, Materials and Methods) implicit to the Spike-Furin complexation in both systems (CoV-2, CoV). Though, the comparative transition entropy profiles and the time-averages were in the same range of values for both systems, literally all the surveyed terms experienced a rise of about 3-5% in terms of their average trends from the former to the latter complex (Table S7). Perhaps with no great surprise, the most As a second test, the individual main chain and side chain conformational entropies for the receptor (Spike) and the ligand (Furin) were also surveyed in RR1 CoV-2 , ZR1 CoV-2 , ZR1 CoV (throughout their respective trajectories) and compared between their unbound (∆S 'entropy arrest' (refer to Section 2.6, Materials and Methods) implicit to the Spike-Furin complexation in both systems (CoV-2, CoV). Though, the comparative transition entropy profiles and the time-averages were in the same range of values for both systems, literally all the surveyed terms experienced a rise of about 3-5% in terms of their average trends from the former to the latter complex (Table S7). Perhaps with no great surprise, the most prominent rise (CoV→CoV-2) was found for the side chain conformational entropies (5.7%) of the receptor molecule (∆S ceptor sc , i.e., Spike) undergoing the transition, naturally, for the sequence differences pertaining to the FLCS Spike in both.

Subjects
The binding free energy overall was mildly disfavored (i.e., ∆G binding mildly positive) in both Spike-Furin binding events (Table 2) suggesting perhaps to the characteristic formation of metastable and multi-stable interfaces throughout the coronavirus lineage. A strict negative ∆G binding was obtained in 17.1%, 21.5% of the time-frames in the CoV-2 trajectories: RR1 CoV-2 , ZR1 CoV-2 , while the same fraction was found to be merely 1.8% in ZR1 CoV , the baseline (in CoV). The metastabilities (suggesting an 'on-and-off' mode of binding) appear to be of no great surprise and perhaps anticipated given that the Spike-Furin binding works similar to a preface to the cleavage of a desired peptide bond that seems to be favored upon the transient formation of certain energetically favorable intermediate conformations in the bound FLCS Spike . In fact, given that such disordered activation loops (such as that of FLCS Spike ) are known to serve as key structural and kinetic determinants of protease substrates [44], it would be worth exploring (via future studies) across other families of proteases (see Introduction) harboring such cleavage loops, as to whether the metastabilities also holds true in them. Apart from the revealed characteristic metastabilities, the comparative free energy values for the Spike-Furin binding were roughly twice as much in magnitude in CoV (<∆G binding ≥ 6.429 kcal mol −1 , SD = 3.094: ZR1 CoV ) compared to those in CoV-2 (<∆G binding > = 3.687, 3.043 kcal mol −1 , SD = 3.868, 3.843: RR1 CoV-2 , ZR1 CoV-2 ). The corresponding ∆∆G binding values for RR1 CoV-2 , ZR1 CoV-2 were found to be −2.742, −2.561 kcal mol −1 (see Section 2.6, Materials and Methods), which speaks directly in favor of a much more facilitated transition in the evolutionarily later event, signaling for the intended optimization (irrespective of whether natural or not) to have indeed occurred in SARS-CoV-2.
3.8. Using the Ramachandran Plot to Probe the 'Disorder-to-Order Transition' of the SARS-CoV-2 FLCS Spike Loop upon Furin Binding Finally, the paper takes the opportunity to demonstrate a novel use of the legendary Ramachandran plot (RP) [55] in probing the 'disorder-to-order transition' of the SARS-CoV-2 FLCS Spike loop upon Furin binding. The unbound disordered state was taken to be the ensemble of 500 'all-atom' SARS-CoV-2 Spike atomic models (see Section 3.2) built with its experimentally missing patches modeled with uniquely varied loop-conformations (see Section 2.2, Materials and Methods). On the other hand, 500 timeframes sampled at equal temporal interval of 600 ps from the 300 ns MD simulation trajectory of the Spike-Furin complex (simulated from the rank-1 docked pose in the ClusPro blind-docking) were assembled to represent the bound (presumably ordered) state. The Ramachandran backbone torsion angles (Φ, ψ) were then computed for each atomic model under each ensemble (bound, unbound) for all Spike residues and those pertaining to the FLCS Spike loop were extracted and plotted (overlaid) in the RP (Figure 8). The overlaid distributions clearly show more scatter for (Φ, ψ) points in the unbound ( Figure 8A) compared to bound ( Figure 8B) states. In addition, a re-view of the RPs were felt necessarily important with a sense of contiguity for the connected pentapeptide sequence motif (-681 PRRAR 685 -) embedded in the FLCS Spike loop. Such a sense of contiguity is also essential in terms of backbone tracing in protein crystallography [123] and depicting secondary structural elements [124]. To that end, a simple line-drawing of the successively connected residues belonging to the -P 681 -R 682 -R 683 -A 684 -R 685 -pentapeptide motif was performed ( Figure S10, Supplementary Materials), over and above the standard 'scattered points representation' of the RP.
RPs plotted for the Furin-bound state, it appears highly likely that the flexible FLCSSpike loop, or at least a good part of the loop, is in dynamic equilibrium with multiple short transient secondary structural elements (e.g., short helical turns, beta-strands etc.) in its bound state. Given the trends in {Φ, ψ}, rationalized by the comparatively relaxed τ angle in the bound state, this appears especially relevant to the pentapeptide (-681PRRAR685-) motif ( Figure 8B).

Figure 8. Overlaid Ramachrandran Plots for FLCSSpike pertaining to (A) unbound and (B) Furinbound Spike states.
Each plot is overlaid with 100 atomic models belonging to the same state (unbound or bound). Within each individual plot (i.e., pertaining to each atomic model), the {Φ, Ψ} points for residues in the FLCSSpike loop is plotted as black dots encircled by green while those for residues belonging to the -P681-R682-R683-A684-R685-pentapeptide motif are highlighted by red dots encircled by blue.
Subsequent to the visual inspection and comparison, the RP derived parameters (see Section 2.7, Materials and Methods) were computed for the FLCSSpike patch independently for each state (unbound and Furin-bound) to quantify the distribution of (Φ, ψ) points spanning across the two plots ( Figure 8A,B) and to assess whether this difference is of any significance. By definition (see Section 2.7, Materials and Methods), smaller values of |δc| (say, <30º) statistically indicates that the consecutive residues are conformationally alike or close, which effectively leads to a local structural coherence and relative structural order for the FLCSSpike, while a larger value suggests regular structural conflicts and, consequently, structural disorder. Thus, by definition, |δc| also offers an estimate of how much the FLCSSpike is conformationally varied (or, in other words, distributed among varying structural conformations) on average. Lesser values of |δc| indicate greater tendencies (on an average) of the FLCSSpike to attain closely related structural conformations, while as the value increases, structural degeneracy [54] is manifested within the FLCSSpike.
All the RP-derived parameters (see Section 2.7, Materials and Methods) describe the spread of the concerned molecular patch (i.e., deviation in the Φ-ψ space) and all of them unequivocally drop (i.e., shrink) in the bound state (Table S8, Supplementary Materials) compared to the unbound state. The relative decrease from state-1 (Spike, unbound) to state-2 (Spike, Furin-bound) in this two-state transition is 32.5% in <δ>3q and 55.7% in <δ>9d. This indicates that the distance (δ, defined in the Φ-ψ space) by which 90% of points are separated in the two RPs (states) is increased more than 1.5 times in the unbound state, compared to the bound state. Together, the visual and the quantitative analyses clearly For the unbound state, the successive points clearly hovers around extreme ends of the RP resembling a highly multi modal distribution in terms of occupying different regions in RP indicating high structural conflicts or disorder. In comparison, the same successive points clearly gets shrunk into a constrained distribution for the Furin-bound state, directed to an extended 'generously allowed' region [125] of the RP. More interestingly, this extended region almost perfectly maps to the extended bridged territory of the originally proposed allowed regions [55] for beta-sheets and right handed alpha-(as well as 3 10 -) helices upon the relaxation of the bond-angle, tau (τ: N-Cα-C) [126]. Motivated by these very interesting observations, we further computed the τ angle for the FLCS Spike in both the ensembles (unbound and bound). The τ angle in the Furin-bound state (time-averaged) was indeed found to be more relaxed (<τ Furin-bound >: 109.6 • ; SD (standard deviation): 4.0 • ) and trending to its ideal value of 109.5 • for a tetrahedral sp 3 carbon [127], compared to its unbound state where the average value (<τ unbound >: 107.4 • ; SD: 2.3 • ) was somewhat leftshifted from its tetrahedral ideal value. When surveyed for the -681 PRRAR 685 -pentapeptide motif, the two <τ> values were even more separated with a similar ratio of their standard deviations (<τ unbound >: 107.9 • ; SD: 2.4 • ; <τ Furin-bound >: 110.8 • ; SD: 4.2 • ). In both cases (FLCS Spike , -681 PRRAR 685 -), the standard deviations were 1.7 times more in the bound state (i.e., more relaxed τ angles) than in the unbound state. Moreover, from the overlaid RPs plotted for the Furin-bound state, it appears highly likely that the flexible FLCS Spike loop, or at least a good part of the loop, is in dynamic equilibrium with multiple short transient secondary structural elements (e.g., short helical turns, beta-strands etc.) in its bound state. Given the trends in {Φ, ψ}, rationalized by the comparatively relaxed τ angle in the bound state, this appears especially relevant to the pentapeptide (-681 PRRAR 685 -) motif ( Figure 8B).
Subsequent to the visual inspection and comparison, the RP derived parameters (see Section 2.7, Materials and Methods) were computed for the FLCS Spike patch independently for each state (unbound and Furin-bound) to quantify the distribution of (Φ, ψ) points spanning across the two plots ( Figure 8A,B) and to assess whether this difference is of any significance. By definition (see Section 2.7, Materials and Methods), smaller values of |δ c | (say, <30º) statistically indicates that the consecutive residues are conformationally alike or close, which effectively leads to a local structural coherence and relative structural order for the FLCS Spike , while a larger value suggests regular structural conflicts and, consequently, structural disorder. Thus, by definition, |δ c | also offers an estimate of how much the FLCS Spike is conformationally varied (or, in other words, distributed among varying structural conformations) on average. Lesser values of |δ c | indicate greater tendencies (on an average) of the FLCS Spike to attain closely related structural conformations, while as the value increases, structural degeneracy [54] is manifested within the FLCS Spike .
All the RP-derived parameters (see Section 2.7, Materials and Methods) describe the spread of the concerned molecular patch (i.e., deviation in the Φ-ψ space) and all of them unequivocally drop (i.e., shrink) in the bound state (Table S8, Supplementary Materials) compared to the unbound state. The relative decrease from state-1 (Spike, unbound) to state-2 (Spike, Furin-bound) in this two-state transition is 32.5% in <δ> 3q and 55.7% in <δ> 9d . This indicates that the distance (δ, defined in the Φ-ψ space) by which 90% of points are separated in the two RPs (states) is increased more than 1.5 times in the unbound state, compared to the bound state. Together, the visual and the quantitative analyses clearly and directly portray (from actual structural dynamics data) the transition of the unbound disordered FLCS Spike to a relatively ordered Furin-bound state in SARS-CoV-2 Spike.
Lastly, to render a statistical significance to the change in the obtained distributions of (Φ, ψ) points in the RP associated with the two-state transition (unbound → bound) of the FLCS Spike , a χ 2 test was performed. Ten distinct bins corresponding to disjoint regions in the RP were considered. We adapted the Procheck [128] version of the RP to reproduce the Ramachandran (Φ, ψ) contours ( Figure 8) and used the MATLAB inbuilt function 'inpolygon' for the frequency distribution of (Φ, ψ) points into these bins. For reasons of simplicity, the later-extended generously allowed regions [125,128] of the RP were avoided. The 10 bins, thus, represented three allowed regions for regular secondary structural elements (β-sheets, Rα-helices, Lα-helics) (R: Right-handed; L: Left-handed), six partially allowed regions (of largely varying areas) across the plot, and, the entire left-over disallowed region, pulled into the 10 th bin. The null hypothesis was taken to be 'no or little (i.e., insignificant) changes caused in the unbound (Φ, ψ) points (FLCS Spike ) upon binding to Furin (i.e., Expected: unbound; Observed: bound)'. The χ 2 (see Section 2.8, Materials and Methods) value obtained from the differential counts ( Figure 9) of points (unbound → bound) in this 10-bin distribution (Degree of Freedom (df) = 9) was found to be 3650.32, which is~131 times higher than that of the upper-tail critical χ 2 value for df = 9 at 99.9% level of significance (χ 2 0.001 = 27.88). Based on these numbers, the null hypothesis was rejected, which should mean that the 'unbound → bound' change was indeed significant in the FLCS Spike in terms of their relative RP distributions even at the 99.9% level. In other words, the 'disorder→order transition' of the FLCS Spike upon binding to Furin was evident and unmistakable.
The RP has previously been used to probe transitions among α-helix, Π-helix and turns in context to the phosphorylation of smooth muscle myosin [129]. Having said that, the visual impact of simple line-drawing (to portray sequence contiguity), as well as the collective use of RP derived metrices, to the best of our knowledge and belief, together presents yet another novel use of the evergreen and multifaceted Ramachandran plot. obtained from the differential counts ( Figure 9) of points (unbound → bound) in this 10bin distribution ( Degree of Freedom (df) = 9) was found to be 3650.32, which is ~131 times higher than that of the upper-tail critical χ 2 value for df = 9 at 99.9% level of significance (χ 2 0.001 = 27.88). Based on these numbers, the null hypothesis was rejected, which should mean that the 'unbound → bound' change was indeed significant in the FLCSSpike in terms of their relative RP distributions even at the 99.9% level. In other words, the 'disorder→order transition' of the FLCSSpike upon binding to Furin was evident and unmistakable. Figure 9. Distribution of differential counts obtained from binning of the RP for unbound and bound states of FLCSSpike. For binning details please refer to main-text. Bins 1, 2 & 8 represents Ramachandran allowed regions for β-sheets, Rα-helices, Lα-helices, while bins 3 to 7 and 9 maps to six disjoint partially allowed regions and bin-10 stands for the pulled-down disallowed region.
The RP has previously been used to probe transitions among α-helix, Π-helix and turns in context to the phosphorylation of smooth muscle myosin [129]. Having said that, the visual impact of simple line-drawing (to portray sequence contiguity), as well as the Figure 9. Distribution of differential counts obtained from binning of the RP for unbound and bound states of FLCS Spike . For binning details please refer to main-text. Bins 1, 2 & 8 represents Ramachandran allowed regions for β-sheets, Rα-helices, Lα-helices, while bins 3 to 7 and 9 maps to six disjoint partially allowed regions and bin-10 stands for the pulled-down disallowed region.

Conclusions and Perspective
In parallel to the ongoing efforts to find a sustainable therapeutic solution to curb the coronavirus pandemic, debates are also ongoing regarding the origin of SARS-CoV-2, stirred up by genome comparison studies of late, revealing the emergence of the 681 PRRAR 685 motif in the SARS-CoV-2 Spike, absent in other related respiratory viruses. The strategic presence of such polybasic motifs in FLCS Spike -like flexible loops in coronavirus and other related respiratory viral lineages leads to local protein disorder [24], intrinsic to these activation loops; the one in SARS-CoV-2 is believed to play a key role in the drastic increase in viral host cell entry and transmissibility. To the very best of our knowledge, the current study is the first of its kind that entraps a 'disorder-to-order transition' in the SARS-CoV-2 FLCS Spike while it undergoes host Furin binding that is optimized for a more efficient proteolytic cleavage of its S1/S2 junction than that in SARS-CoV. The optimization and the consequent increase in proteolytic cleavage efficiency is unambiguous from all analyses performed (Sections 3.5-3.8) but is perhaps the most clear and direct from the fairly negative ∆∆G binding values returned from the two events (Section 3.7). The study further reveals the key role of dynamically interchangeable, persistent salt-bridges at the Spike-Furin interface, which seems to be an evolutionarily conserved feature of the coronavirus lineage and is substantially enhanced in the case of SARS-CoV-2 due to the presence of the three arginine (R682, R683, R685) in the 681 PRRAR 685 motif amid its FLCS Spike . The host Furin, orchestrated with a preponderance of exposed amenable anionic residues (E230, E236, D259, D264, D306) strategically positioned around its catalytic triad, overwhelmingly favors polybasic disordered substrates such as that of the 681 PRRAR 685 motif (SARS-CoV-2) for binding, cleavage and consequent host cell entry of the virus (Sections 3.5 and 3.6). The resultant Spike-Furin interfacial salt-bridges not only serves as a prominent enthalpy source for the process (compensating for the entropic loss of the FLCS Spike undergoing 'disorder-to-order transition') but also favors the system to retain its characteristic metastabil-ities favorable for proteolytic cleavages targeted at flexible protein loops (Section 3.7). The current study also helps to open up new research avenues across other related protease families harboring such cleavage loops, as to whether these revealed metastabilities also holds true in them. The findings are perfectly consistent with the established theories of salt-bridge dynamics in context to IDPs serving to retain their characteristic structural plasticity by the continuous triggering of phase transitions among their self-similar disordered states [53,54]. Further, from the combined results of salt-bridge and thermodynamic analyses (see Sections 3.6 and 3.7), it strongly appears that the Furin cleavage seeks opportunities for transient formation of favorable intermediate conformations in the bound FLCS Spike to make the final unfailing strike on the desired peptide bond. The probabilities of this strike's success is naturally far greater in SARS-CoV-2 (than in SARS-CoV), since it has the more intense salt-bridge networks formed and sustained in its Spike-Furin interface (for the presence of the 681 PRRAR 685 arginines). These findings further rationalize the substantially greater extent of cleavage (59.6%) of the SARS-CoV-2 Spike (into its S1/S2 products) in the wild-type virion than in its ∆PRRA mutant (14.5%) [35]. In conclusion, over and above offering a novel perspective into the coronavirus molecular evolution, this study also makes the SARS-CoV-2 Spike-Furin interaction mechanistically insightful, adding new dimensions to the existing theories of proteolytic cleavages per se.