Computational Study of Helicase from SARS-CoV-2 in RNA-Free and Engaged Form

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the pandemic that broke out in 2020 and continues to be the cause of massive global upheaval. Coronaviruses are positive-strand RNA viruses with a genome of ~30 kb. The genome is replicated and transcribed by RNA-dependent RNA polymerase together with accessory factors. One of the latter is the protein helicase (NSP13), which is essential for viral replication. The recently solved helicase structure revealed a tertiary structure composed of five domains. Here, we investigated NSP13 from a structural point of view, comparing its RNA-free form with the RNA-engaged form by using atomistic molecular dynamics (MD) simulations at the microsecond timescale. Structural analyses revealed conformational changes that provide insights into the contribution of the different domains, identifying the residues responsible for domain–domain interactions in both observed forms. The RNA-free system appears to be more flexible than the RNA-engaged form. This result underlies the stabilizing role of the nucleic acid and the functional core role of these domains.


Introduction
A novel coronavirus (severe acute respiratory syndrome coronavirus 2, SARS-CoV-2) has been identified as the pathogen responsible for the outbreak of a severe, rapidly developing pneumonia, coronavirus disease 2019 (COVID-19) [1]. The first event of the SARS-CoV-2 emergency occurred when a group of patients at hospitals in Wuhan, China, reported cases of pneumonia with an unknown cause. The symptoms were characterized by fever, pneumonia, cough, and dyspnea. The first known case can be dated to 8 December 2019, while the international state of emergency was announced by the WHO on 30 January 2020. On 11 March in the same year, the World Health Organization (WHO) defined COVID-19 as a pandemic [2]. Up to now, the COVID-19 disease has caused more than 6 million deaths worldwide. The novel coronavirus belongs to the severe acute respiratory syndrome-related coronavirus species, which arises from the genus Betacoronavirus, a genus that infects only mammals, from the family Coronaviridae. The family belongs to the order Nidovirales, which includes viruses with single-stranded RNA genomes of positive polarity [3]. The virus, as far as humans are concerned, is responsible for respiratory infections that manifest themselves as a mild to severe clinical picture [1]. The beta-coronavirus group also includes severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), which emerged in 2002 and 2012, respectively, which caused severe respiratory diseases with high pathogenicity [3].
The coronavirus virion is defined by four structural proteins: spike, envelope, membrane, and nucleocapsid [4,5]. The latter covers the genome, which consists of a positive-exhibits the hallmarks of the superfamily to which it belongs. It consists of five domains. The first is one of the most conserved domains in the Nidovirales order, which is the Nterminal zinc binding domain (ZBD, Residues 1-100) that coordinates three Zn(II) ions; a helical "Stalk" domain (101-150); a β-barrel domain (1B, 151-261); and two "RecA-like core helicase domain" subdomains (namely 1A and 2A, Residues 262-442 and 443-596, respectively), which can be referred to as the catalytic core of the helicase, since it holds the residues known to be responsible for nucleotide binding and hydrolysis [19]. Although NSP13 is a highly conserved protein-in fact, it differs from SARS-CoV by only one amino acid [23]-its exact role in the viral cycle has not yet been delineated; in addition, it is a potential and interesting drug target [23]. The structures of NSP13 in the RNA-free and in the RNA-engaged form are nearly identical in their general folding and non-structural proteins (NSPs) and structural and accessory proteins. NSPs are encoded in ORF1a and ORF1b. Cap-dependent translation begins at ORF1a and produces pp1a, which encompasses NSP1-11, or pp1ab, which encompasses NSP12-16. (B) Ribbon diagram of the RdRp-(NSP13) 2 supercomplex structure in the RNA-engaged form (PDB ID: 7RDY). The NSP12 (RdRp), NSP7, and NSP8 subunits are in light blue, gray, and green, respectively. The two NSP13 subunits are in blue and violet for the RNA-engaged and the RNA-free form, respectively. The template and the product RNA strands are in red and orange, respectively. Zn(II) and Mg(II) ions, as well as ADP molecules are reported as spheres colored according to the atom type. (C) Ribbon diagram of NSP13 in the RNA-free form (PDB ID: 6ZSL). The ribbons are colored in red, orange, yellow, green, and blue to highlight the position of the ZBD, Stalk, 1B, and 1A and 2A domains, respectively. Zn(II) ions are reported as purple spheres.
The high-resolution structure of NSP13 was resolved in 2020 at 1.94 Å resolution in the RNA-and ADP-free form ( Figure 1C) [19]. From a structural point of view, NSP13 exhibits the hallmarks of the superfamily to which it belongs. It consists of five domains. The first is one of the most conserved domains in the Nidovirales order, which is the N-terminal zinc binding domain (ZBD, Residues 1-100) that coordinates three Zn(II) ions; a helical "Stalk" domain (101-150); a β-barrel domain (1B, 151-261); and two "RecA-like core helicase domain" subdomains (namely 1A and 2A, Residues 262-442 and 443-596, respectively), which can be referred to as the catalytic core of the helicase, since it holds the residues known to be responsible for nucleotide binding and hydrolysis [19].
Although NSP13 is a highly conserved protein-in fact, it differs from SARS-CoV by only one amino acid [23]-its exact role in the viral cycle has not yet been delineated; in addition, it is a potential and interesting drug target [23]. The structures of NSP13 in the RNA-free and in the RNA-engaged form are nearly identical in their general folding and in the domains' composition. However, the domains are found in slightly different positions (Figures S1 and S2 in the Supporting Information) because of the presence of the RNA fragment. In particular, in the RNA-engaged form, Domain 1B moves slightly away from the other domains with which it is in contact in the RNA-free form to make room for the single strand of RNA that runs through the entire structure of the protein. Domain 2A is also positioned slightly differently from the other domains in the two forms.
This movement is also attributable to the presence of the RNA fragment and the presence of an ADP molecule in a pocket at the interface between Domains 1A and 2A. These rearrangements cause an overall change in the structure, so that the root mean square deviation (RMSD) between the Cα of the two experimental structures is 0.32 nm.
The aim of this work was to investigate NSP13 from a structural and dynamical point of view by taking advantage of the recent structures of the protein in the RNA-free [19] and in the RNA-engaged form [22], the latter of which was extracted from the RdRp-(NSP13) 2 supercomplex. Therefore, two 10-µs-long sets of atomistic molecular dynamics simulations were implemented to examine the different conformational behaviors of NSP13 separated by the RTC complex in the presence or in the absence of an RNA fragment at the microsecond timescale. In addition to the classical structural observations, it was deemed interesting to explore the specific contact frequencies occurring between the single helicase domains. In other words, emphasis was placed on the residues involved in the mutual interactions of individual domains. The latter analysis was carried out in the aim of obtaining detailed information on the individual residues that change their interaction partners from the RNA-free to the RNA-engaged form of the SARS-CoV-2 helicase.

General Behaviour of the RNA-Free and the RNA-Engaged Systems
The structural stability of the two helicase systems was tested during the simulations by calculating the deviations of each structure from the starting coordinates and the fluctuation of each residue with respect to the average structure after superimposition on the Cα atoms.

Deviations from the Starting Structure
The RMSD for the Cα atoms of the whole protein was calculated and is depicted in Figure 2A. The behavior of the RMSD as a function of the simulation time showed that both the RNA-free and the RNA-engaged systems relax to conformations that differ by around 0.34 and 0.22 nm from the original system, respectively. For both simulations, the plateau of the structural drift was reached within the first microsecond. Notably, the RMSD of the RNA-free form was slightly higher than that of the RNA-engaged form. In other words, it appears that the presence of the RNA fragment and of the ADP molecule provides some additional stability to the NSP13 protein. In addition to the total RMSD, the deviations belonging to each domain were also calculated (see Figure 2B and Figures S2-S5 in the Supporting Information). The largest differences between the RNAfree and the RNA-engaged form were observed in the case of the ZBD and 2A domains. In the case of the ZBD domain ( Figure 2B), the system appeared to show higher stability in the case of the RNA-free system, while the RNA-engaged system showed RMSD values that often oscillated between~0.30 nm and up to more than 0.40 nm throughout the trajectory. This behavior, which could be due to the active contribution of the ZBD domain in RNA processing, was not affected even by excluding the residues at the N-terminal from the RMSD calculation. The coordination geometries of the three Zn(II) ions did not change during both simulations. The 2A domain remained extremely stable in the RNA-engaged form ( Figure 2B), while in the RNA-free form, it experienced higher structural variability. The latter result can be ascribed to the presence of both the RNA fragment, which pushes Domain 1A towards Domain 2A, and the presence of the ADP molecule coordinated by the Mg(II) ion, which stabilises the interaction between Domain 1A and Domain 2A, resulting in a reduction in the fluctuations of the latter.

Backbone Fluctuations
The residue-based root mean square fluctuation (RMSF) of the trajectories was calculated for the Cα atoms to measure the flexibility of the protein chain ( Figure 2C). The data show that there are different RMSF in specific regions of the protein. Residues 50 to 100, which belong to ZBD, possess higher flexibility. Residues 180 to 220 also show significant variation, indicating a certain flexibility of the 1B domain to which they belong. This is an expected result, considering the poorness in the secondary structural elements of the latter domains and the presence of long loops. Remarkably, in the RNA-free form, there are two clearly visible RMSF peaks, one around the region of Residues 335-339 and another near the region of Residues 482-486. These areas belong to two loops located in Domains 1A and 2A that are in direct contact with the RNA strand in the RNA-engaged starting structure. Moreover, the nucleotide binding site is mainly located in a cleft between Domains 1A and 2A. This explains the presence of the RMSD peaks in the structure when the RNA is not present, unlike the system occupied by the nucleic acid, which has considerably lower RMSF values in these regions due to the RNA's stabilizing effect. On the other hand, the loop formed by Residues 204-208 is more flexible in the RNA-engaged form due to a displacement of the 2A domain, of which such a loop is part, due to insertion of the RNA. This movement of the 2A domain causes the breaking of some interactions with Domain 1A stabilizing the 204-208 loop in the RNA-free form (see also Section 2.3.1 below). Moreover, the 415-420 region experiences larger fluctuations in the RNA-engaged form. This is because of the absence of a small α-helix in the RNA-free form (Residues 419-421) that is not present any longer in the RNA-engaged form, possibly due to a conformational rearrangement that is necessary to make room for the nucleic acid. The absence of this helix causes major fluctuations in the 415-420 region, which lacks secondary structural elements and is exposed to the solvent in the RNA-engaged form. Other regions do not show significant RMSF differences within the two systems, apart for the N-terminal residues.  RMSD values for NSP13 in the RNA-free and in the RNA-engaged form are in dark cyan and orchid, respectively. The bold lines were obtained by applying a Savitzky-Golay filter [24] in order to cut out the noise. To the right of each plot, a violin representation of the RMSD distribution is provided. (C) Root mean square fluctuations values as a function of the residue number for NSP13 in the RNAfree (dark cyan) and RNA-engaged (orchid) form. The vertical lines highlight the position of the residues that are within 0.35 nm from the RNA strand in the RNA-engaged starting structure, while the horizontal bars report the position of each domain in the sequence and are colored according to the domain coloration in Figure 1C.

Backbone Fluctuations
The residue-based root mean square fluctuation (RMSF) of the trajectories was calculated for the Cα atoms to measure the flexibility of the protein chain ( Figure 2C). The data show that there are different RMSF in specific regions of the protein. Residues 50 to 100, which belong to ZBD, possess higher flexibility. Residues 180 to 220 also show significant variation, indicating a certain flexibility of the 1B domain to which they belong. This is an expected result, considering the poorness in the secondary structural elements of the latter domains and the presence of long loops. Remarkably, in the RNA-free form, there are two clearly visible RMSF peaks, one around the region of Residues 335-339 and RMSD values for NSP13 in the RNA-free and in the RNA-engaged form are in dark cyan and orchid, respectively. The bold lines were obtained by applying a Savitzky-Golay filter [24] in order to cut out the noise. To the right of each plot, a violin representation of the RMSD distribution is provided. (C) Root mean square fluctuations values as a function of the residue number for NSP13 in the RNA-free (dark cyan) and RNA-engaged (orchid) form. The vertical lines highlight the position of the residues that are within 0.35 nm from the RNA strand in the RNA-engaged starting structure, while the horizontal bars report the position of each domain in the sequence and are colored according to the domain coloration in Figure 1C.

Clustering
A cluster analysis was performed for both the RNA-free and the RNA-engaged system in accordance with the GROMOS algorithm [25]. For each trajectory, 10,000 frames were analyzed and clustered using a cutoff of 0.17 nm. For the RNA-free system, 109 clusters were found ( Figure 3A), with approximately 65% of the frames contained into the first four clusters. The trajectory of the RNA-engaged system was clustered in only 62 clusters instead, with 64% of the frames contained in the first two clusters ( Figure 3B). Thus, it appears that the presence of the nucleic acid induces a lower conformational variability in the structure of NSP13, in line with what has been observed in the RMSD and RMSF analyses. variability in the structure of NSP13, in line with what has been observed in the RMSD and RMSF analyses.
The main difference in the conformation of the representative structure of the first two clusters in both systems ( Figure 3C) is the relative position of the ZBD domain with respect to the other domains, along with other minor conformational changes. A comparison of the most representative structures in the most populated clusters in the RNA-free and RNA-engaged form of NSP13 showed a difference in the relative position of the ZBD domain with respect to the other domains and also a variation in the position of the 1B and 2A domains that can be ascribed to the presence of the single-stranded RNA fragment. Notably, Domain 1B is translated by about 5-7 Å and rotates about 30° clockwise to give space to the nucleic acid fragment ( Figure 3D). The remaining regions of the proteins are almost perfectly superimposable.  Motion correlations between various sub-parts of NSP13 can be characterized by a calculation of the covariance matrices of the amino acids' displacements. The corresponding maps computed using the coordinates of the Cα atoms are shown in Figure 4A and 4B for the RNA-free and RNA-engaged forms, respectively. NSP13 shows a slightly different dynamic behavior in the absence or in the presence of the RNA fragment. In particular, the ZBD domain appears to move as a single entity in the RNA-free form, while it appears to behave as two sub-domains of~50 residues each, with a negative motion correlation, meaning that they move in opposite directions. Moreover, in the RNA-free form, the motion of the ZBD domain is anticorrelated with that of the Stalk, 1A, and 2A domains and is weakly correlated with the 1B domain. The ZBD domain in the RNA-engaged form shows a more complex motion with respect to the other domains because of the division into two sub-domains. The Stalk domain's motion is correlated with that of the 1A domain while it is anticorrelated to that of the 2A domain in both systems. On the other hand, the Stalk domain's motion appears to be anticorrelated to that of Domain 1B in the RNA-free form, while the presence of the RNA fragment induces a strong Stalk-1B motion correlation. The motion of the 1B domain appears to be anticorrelated with respect to both the 1A and 2A domains. Finally, Domain 1A's motion is always anticorrelated with respect to the motion of the 2A domain, even if the presence of the RNA fragment seems to weaken this correlation of the motions.

Conformational Space
The conformational space revealed by the first two covariance eigenvectors was also inspected, and the resulting maps were used to search for significative sub-states of the conformational populations in the two systems. Figure 5A shows how the RNA-free form includes a region characterized by only one of the most visited conformations during the entire simulation (Basin A). The most representative structure in the absence of nucleic acid showed little difference from that identified by RMSD-based clustering ( Figure 3A). On the other hand, a trajectory analysis of the RNA-engaged form showed two important differences from the previous case. First, two basins were noted, one narrow and highly populated (Basin B in Figure 5B), and the second less populated and wider (Basin C). The

Principal Component Analysis
In order to characterize the main motions characterizing NSP13 in the two different forms studied in this work, the two trajectories were further analyzed using principal component analysis (PCA). In general, the dynamic behavior of NSP13 in both states could can be classified into at least 20 collective motions without any one of them being truly predominant over the others. The projection of the first two principal components, PC1 and PC2, on the structures of the RNA-free and RNA-engaged NSP13 was then analyzed. In the case of the RNA-free NSP13 (Figure 4C), the first two PCs accounted for 26% and 14% of the variance. The motion of PC1 could be mainly described as the swing of the ZBD domain with respect to the 2A domain, while PC2 corresponded to a general rearrangement of all the domains without any predominant motion being noticed. In the RNA-engaged NSP13 (Figure 4D), the projection of PC1 (accounting for 19% of the variance) again revealed a swinging motion of the ZBD domain, but this time toward Domains 1B and 1A. Finally, in the case of the PC2 projection of the RNA-engaged form (12% of variance), it was possible to observe how the Stalk domain remained immobile by acting as a fulcrum while the other domains underwent a structural rearrangement mainly involving the ZBD and 1B domains.

Conformational Space
The conformational space revealed by the first two covariance eigenvectors was also inspected, and the resulting maps were used to search for significative sub-states of the conformational populations in the two systems. Figure 5A shows how the RNA-free form includes a region characterized by only one of the most visited conformations during the entire simulation (Basin A). The most representative structure in the absence of nucleic acid showed little difference from that identified by RMSD-based clustering ( Figure 3A). On the other hand, a trajectory analysis of the RNA-engaged form showed two important differences from the previous case. First, two basins were noted, one narrow and highly populated (Basin B in Figure 5B), and the second less populated and wider (Basin C). The structures belonging to Basin C were observed in the first part of the simulation, and then evolved into the typical conformations of Basin B. The main difference between the representative structures of the two basins was in the conformation of the ZBD and 2A domains: the former bound more tightly to the rest of the protein, while the latter moved slightly away from the ZBD domain. The other domains did not undergo any significant changes. The RNA fragment did not undergo significant positional changes. The greater conformational variability (i.e., the greater amplitude of the single basin found) of the RNA-free form of NSP13 compared with the RNA-engaged form was in agreement with what has been seen in previous analyses.

Comparison with Previous Simulations
The results presented in this manuscript are in excellent agreement with what has been obtained previously from a model of NSP13 from SARS-CoV-2 obtained by starting with the homologous protein from SARS-CoV and then inserting an ADP molecule and the nucleic acid fragment by modeling [26]. Our results show that on a five-fold time scale, using experimental structures and a state-of-the-art model for metal ions, very similar results were observed for the dynamics of NSP13's domains. However, if one inserts the RNA fragment a posteriori, it is difficult to correctly model the movement of the 1B domain that is necessary to make room for the nucleic acid.

Interdomain Contact Frequencies
The contacts among the five NSP13 domains were analyzed by calculating interresidue distance matrixes. In particular, the strategy adopted was to calculate the normalized inter-residue contact frequencies, defining two residues as being in contact when the distance between any two heavy atoms was less than 3.5 Å for at least the 50% of the simulation time. The emerging picture is reported in Scheme 1, where the number of interdomain contacts in the RNA-free and RNA-engaged NSP13 simulations is reported. The general emerging picture is that the absence of the RNA fragment closes the nucleic acid cavity between the 1B and 1A domains. Indeed, as noted earlier, Domain 1B is the one that changes orientation the most because of the presence of RNA. On the other hand, the presence of RNA reinforces the contacts between the 1A and 2A domains. The interactions between the pairs of domains (1B-2A and 1A-2A) in the two simulated forms of NSP13 are discussed in detail below, while the interactions between the other pairs of domains are given in the Supplementary Information (Figures S6-S10). Finally, an analysis of the interactions of the NSP13 residues in the RNA-engaged form with the RNA fragment and the ADP molecule was carried out. Of the three right-hand panels, the leftmost one and the rightmost one represent a structure from Basins B and C, respectively. In the central panel, the superimposition between the two latter structures is reported. When only one structure is reported, the ribbons are colored as in Figure 1C; in the superimposition, the structures from Basins B and C are in dark and light green, respectively. The Zn(II) and Mg(II) ions are shown as spheres, while the ADP molecules are shown as in sticks colored according to the atom type.

Comparison with Previous Simulations
The results presented in this manuscript are in excellent agreement with what has been obtained previously from a model of NSP13 from SARS-CoV-2 obtained by starting with the homologous protein from SARS-CoV and then inserting an ADP molecule and the nucleic acid fragment by modeling [26]. Our results show that on a five-fold time scale, using experimental structures and a state-of-the-art model for metal ions, very similar results were observed for the dynamics of NSP13's domains. However, if one inserts the RNA fragment a posteriori, it is difficult to correctly model the movement of the 1B domain that is necessary to make room for the nucleic acid.

Interdomain Contact Frequencies
The contacts among the five NSP13 domains were analyzed by calculating inter- NSP13. The surfaces are colored according to the relative free energy. Representative structures from the more populated regions are reported in the bottom panels for the single highly populated region of RNA-free NSP13 (Basin A, left panel) and RNA-engaged NSP13 (panels on the right). Of the three right-hand panels, the leftmost one and the rightmost one represent a structure from Basins B and C, respectively. In the central panel, the superimposition between the two latter structures is reported. When only one structure is reported, the ribbons are colored as in Figure 1C; in the superimposition, the structures from Basins B and C are in dark and light green, respectively. The Zn(II) and Mg(II) ions are shown as spheres, while the ADP molecules are shown as in sticks colored according to the atom type.

B-2A Interactions
The comparison of the interactions between the 1B and 1A domains ( Figure 6) reflects the conformational displacement of the 1B domain upon RNA binding. Indeed, none of the numerous contacts detected in the RNA-free form was retained in the RNA-engaged form. In the RNA-free form, most of the interactions were H-bonds or salt bridges, while the only long-lasting contact in the RNA-engaged form (Lys202-Asp483) consisted of a salt bridge between the interacting side chains.
hand, the presence of RNA reinforces the contacts between the 1A and 2A domains. The interactions between the pairs of domains (1B-2A and 1A-2A) in the two simulated forms of NSP13 are discussed in detail below, while the interactions between the other pairs of domains are given in the Supplementary Information (Figures S6-S10). Finally, an analysis of the interactions of the NSP13 residues in the RNA-engaged form with the RNA fragment and the ADP molecule was carried out. Scheme 1. Number of contacts between the five NSP13 domains along the trajectory in the RNAfree (left) and RNA-engaged (right) forms. The domains are colored as in Figure 1C, while the thickness of the lines connecting the domains is proportional to the number of contacts. If no line connecting two domains is observed, then no contacts occurred during the simulation.

B-2A Interactions
The comparison of the interactions between the 1B and 1A domains ( Figure 6) reflects the conformational displacement of the 1B domain upon RNA binding. Indeed, none of the numerous contacts detected in the RNA-free form was retained in the RNA-engaged form. In the RNA-free form, most of the interactions were H-bonds or salt bridges, while the only long-lasting contact in the RNA-engaged form (Lys202-Asp483) consisted of a salt bridge between the interacting side chains. Scheme 1. Number of contacts between the five NSP13 domains along the trajectory in the RNA-free (left) and RNA-engaged (right) forms. The domains are colored as in Figure 1C, while the thickness of the lines connecting the domains is proportional to the number of contacts. If no line connecting two domains is observed, then no contacts occurred during the simulation.

A-2A Interactions
An analysis of the contacts between Domains 1A and 2A during the simulations (Figure 7) showed that the presence of RNA did not particularly perturb the interface between the two domains. The list of contacts was about the same in the RNA-free and RNA-engaged forms. However, in the RNA-engaged form, the contacts were in greater numbers, and interactions that were more labile in the RNA-free form became more stable and long-lasting.

RNA-NSP13 Interactions
In the RNA-engaged form of NSP13, the RNA fragment appeared to establish several stable interactions with residues belonging to the Stalk, 1B, 1A, and 2A domains (Figure 8). The list of interactions did not change during the simulation, showing that the RNA fragment was not able to move into the channel at the microsecond timescale. This is consistent with the energy requirements (provided by ATP hydrolysis) required for RNA unwinding, which cannot be simulated by equilibrium molecular dynamics. The first nucleotide was fully exposed to the solvent and did not establish any interactions with NSP13. On the other hand, each of the remaining six nucleotides established one interaction with at least two NSP13 residues.

ADP-NSP13 Interactions
Finally, the interactions of the ADP molecule present in the RNA-engaged form were also inspected. The ADP molecules formed extremely stable interactions (i.e., with a lifetime larger than 90% of the simulation time) with the nearby residues, which were Pro283, Pro284, Gly285, Thr286, Gly287, Lys288, His290, and Lys320 in the 1A domain, together with Arg443, Lys465, Glu540, and Lys569 in the 2A domain. The position and the coordination geometry of the Mg(II) ion was stable throughout the entire 10 µs of the simulation. The ADP molecule was bound in a pocket between Domains 1A and 2A, probably helping to stabilize the interaction between these two domains, in agreement with what was observed in Sections 2.1.2 and 2.3.2 Figure 6. Plot of the interface contact matrix between Domains 1B and 2A in RNA-free (A) and RNAengaged (B) NSP13. Contacts are colored from white to black if they were observed in 0% or 100% of the simulation time, respectively. In Panels (C) (RNA-free) and (D) (RNA-engaged), a detail of the 1B-2A interface taken from the representative structure of the most populated cluster is given. The ribbons are colored as in Figure 1C, while the residues involved in a contact lasting for at least 50% of the simulation time are reported as sticks colored according to the atom type.

A-2A Interactions
An analysis of the contacts between Domains 1A and 2A during the simulations (Figure 7) showed that the presence of RNA did not particularly perturb the interface between the two domains. The list of contacts was about the same in the RNA-free and RNA-engaged forms. However, in the RNA-engaged form, the contacts were in greater Figure 6. Plot of the interface contact matrix between Domains 1B and 2A in RNA-free (A) and RNA-engaged (B) NSP13. Contacts are colored from white to black if they were observed in 0% or 100% of the simulation time, respectively. In Panels (C) (RNA-free) and (D) (RNA-engaged), a detail of the 1B-2A interface taken from the representative structure of the most populated cluster is given. The ribbons are colored as in Figure 1C, while the residues involved in a contact lasting for at least 50% of the simulation time are reported as sticks colored according to the atom type.

RNA-NSP13 Interactions
In the RNA-engaged form of NSP13, the RNA fragment appeared to establish several stable interactions with residues belonging to the Stalk, 1B, 1A, and 2A domains ( Figure  8). The list of interactions did not change during the simulation, showing that the RNA fragment was not able to move into the channel at the microsecond timescale. This is consistent with the energy requirements (provided by ATP hydrolysis) required for RNA unwinding, which cannot be simulated by equilibrium molecular dynamics. The first nucleotide was fully exposed to the solvent and did not establish any interactions with NSP13. On the other hand, each of the remaining six nucleotides established one interaction with at least two NSP13 residues.  Figure 1C, while the NSP13 residues involved in a contact lasting for at least 50% of the simulation time are reported as sticks, and the RNA nucleotides are shown as balls and sticks. All atoms are colored according to the atom type.

RNA-NSP13 Interactions
In the RNA-engaged form of NSP13, the RNA fragment appeared to establish several stable interactions with residues belonging to the Stalk, 1B, 1A, and 2A domains ( Figure  8). The list of interactions did not change during the simulation, showing that the RNA fragment was not able to move into the channel at the microsecond timescale. This is consistent with the energy requirements (provided by ATP hydrolysis) required for RNA unwinding, which cannot be simulated by equilibrium molecular dynamics. The first nucleotide was fully exposed to the solvent and did not establish any interactions with NSP13. On the other hand, each of the remaining six nucleotides established one interaction with at least two NSP13 residues.  Figure 1C, while the NSP13 residues involved in a contact lasting for at least 50% of the simulation time are reported as sticks, and the RNA nucleotides are shown as balls and sticks. All atoms are colored according to the atom type.  Figure 1C, while the NSP13 residues involved in a contact lasting for at least 50% of the simulation time are reported as sticks, and the RNA nucleotides are shown as balls and sticks. All atoms are colored according to the atom type.

Materials and Methods
The model on which the simulations of the SARS-CoV-2 helicase in the RNA-free form were based was the crystallographic structure determined at a resolution of 1.94 Å with two subunits in the asymmetric unit (PDB ID 6ZSL). Generally, the electron density was of high quality, except for the lack of resolution of Residues 95-102, 186-193, 203-206, and 593-601 for Chain A and for Residues 1, 337-339, and 594-601 for Chain B. To correct these unresolved areas, homology modeling was performed using the Modeller 10.0 software [27] so Chain B included the residues not included in Chain A.
The starting structure for the helicase in the RNA-engaged form was the cryo-EMresolved structure with a resolution of 3.10 Å of the SARS-CoV-2 replication-transcription complex bound to the NSP13 helicase-NSP13(2)-RTC-engaged class (PDB ID 7RDY). After this, only the helicase with the bound RNA fragment and the ADP molecule were retained.
The open-source web server H++ [28] was used to predict the protonation state of the proteins, based on the calculated pKa values of titratable groups. For the simulation, the Amber14sb force field [29] and TIP3P water model [30] were used for the protein and water molecules, respectively. For the helicase in the RNA-engaged form, the Mg(II) ion was treated by applying the optimized parameters described by Grotz et al. [31]. GAFF [32] ADP parameters were generated using the Antechamber software [33] included in the Amber 20 suite [34] and converted into GROMACS format by ParmEd. Zn(II) ions were treated by using the non-bonding parameters described by Macchiagodena et al. [35]. According to the same procedure, the Zn(II) binding cysteine and histidine residues were modified in order to account for the polarization effects of the cation.
The resulting structures were then placed in a truncated octahedral water box, where a solvent buffer zone of 10 Å was set by using the tools included in the GROMACS 2020.1 suite [36][37][38]. The systems were neutralized by adding Na + ions by using the genion tool included in GROMACS 2020.1. Analogously, additional Na + and Cl − ions were placed in the water box to achieve physiological ionic strength (150 mM). The systems were energy-minimized with the "steepest descents" algorithms for 1500 steps and then equilibrated at 300 K and 1 atm by performing 1 ns of gradual annealing using GROMACS 2020.1. Next, to equilibrate the solvent around the protein, positional restraints were imposed on the protein, and each system was equilibrated for 50,000 steps with a time step of 2 fs at 300 K. The system's pressure was coupled to a Berendsen [39] barostat, and the temperature was coupled to a V-rescale thermostat [40]. The LINCS algorithm [41] was used for imposing constraints on the hydrogen-containing bonds. Periodic boundary conditions (PBC) were applied and the Particle Mesh Ewald (PME) method [42] was used to calculate the electrostatic interactions. The cutoff values for the real part of the electrostatic interactions and for the van der Waals interactions were set to 10 Å. In the 10-µs-long production runs, the same protocol was adopted, with the exception of the use of the Parrinello-Raman barostat [43]. In a protein, interdomain movements not involving large conformational rearrangements usually occur on the sub-microsecond timescale [44,45]. For this reason, a simulation 10 µs long was deemed sufficient to provide the necessary information to study the dynamic behavior of NSP13's domains in the RNA-free and RNA-engaged forms and the interdomain interactions. All calculations were performed at CINECA on the MARCONI100 accelerated cluster, based on nodes consisting of two IBM Power9 processors and four NVIDIA Volta GPUs per node. The dynamic performance on this architecture was typically 200 ns/day using 1 MARCONI100 node for each simulation.

Conclusions
In this work, extensive all-atom plain MD simulations were used to study the helicase (NSP13) of SARS-CoV-2 in its RNA-free form and RNA-engaged form. What emerged from the analyses was the main contribution of Domains 1A and 2A, the functional cores of the helicase, which make a major contribution to the stability and flexibility of the system of which they are part. Our simulations also highlighted the role of Domain 1B, which is responsible for forming the cavity in which the single RNA strand is lodged. The role of the ZBD domain as a domain supporting the functional activity of the helicase was also observed. The study showed a difference in the stability of ZBD domain between the RNA-free and RNA-engaged forms. As explained above, the ZBD is involved in the interaction with the other NSPs of the RTC. The differences highlighted during the simulations suggested that the presence of RNA can modify the dynamics of this domain, leading to the major stability of the interactions and promoting the activity of the complex itself. These results can be useful, not just to unveil the dynamics of NSP13 in the presence or absence of RNA, but also to define the main residues responsible for the stability of the system, representing putative regions to consider as targets for the development of new drugs. In particular, the NSP13 region defined by the residues found in contact with the RNA fragment ( Figure 8) in the RNA-engaged form, as well as the residues mainly contributing to the interaction between the 1B and 2A domains in the RNA-free form (Glu162, Arg178, Gly203, and Asp204 from the 1B domain, and Lys524, Thr530, Ser517, Arg490, and Lys524 on the 2A domain) can be used in the search for putative binding pockets for the design of potential NSP13 inhibitors.
After these observations, the analyses moved on to a more structural level, investigating the interactions existing among the various domains in both systems under examination. The residues responsible for the domain-domain interactions were identified, outlining a profoundly different picture between NSP13 in the RNA-free and RNA-engaged forms. Looking ahead, it would be inspiring to run a molecular dynamics simulation of the unfolding of the RNA's double helix to observe what the arrangement of the protein domains interacting with the RNA would be.