Binding of GS-461203 and Its Halogen Derivatives to HCV Genotype 2a RNA Polymerase Drug Resistance Mutants

: Hepatitis C Virus (HCV) is reported to develop GS-461203 resistance because of multiple mutations within the RNA-dependent RNA Polymerase (RdRp) of HCV. The lack of a high-resolution structure of these RdRp mutants in complex with GS-461203 hinders efforts to understand the drug resistance. Here we decipher the binding differences of GS-461203 in the wild type and mutated systems T179A or M289L of HCV RdRp Genotype 2a using homology modeling, molecular docking, and molecular dynamics simulation. Key residues responsible for GS-461203 binding were identiﬁed to be Arg48, Arg158, Asp318, Asp319, and Asp220, and that mutations T179A or M289L have caused conformational changes of GS-461203 in the RdRp active site. The afﬁnities of GS-461203 were reduced in T179A system, but it became slightly stronger in the M289L system. Furthermore, we designed two new analogues of GS-461203 which encouragingly induced more stable interactions than GS-461203, and thus resulted in much better binding energies. This present study reveals how a single mutation, T179A or M289L, will modulate GS-461203 binding in HCV RdRp Genotype 2a, while introducing two novel analogues to overcome the drug resistance which may be good candidate for further experimental veriﬁcation.


Introduction
The Hepatitis C virus (HCV) has caused a pandemic which currently infects approximately 180 million people worldwide. HCV has a single-stranded, positive sense RNA genome belonging to genus Hepacivirus of Flaviviridae family [1]. Based on its viral genome sequence, HCV is divided into eight distinct genotypes (GT1-GT8), differing by a maximum nucleotide sequence of 33% between GTs and tenths of subtypes (e.g., 2a-2r) [2][3][4][5], differing by less than 15% of the nucleotide sequence [5][6][7][8]. The GT difference has a direct impact on the antiviral agent chosen for therapy, which highlights the importance of investigating a drug binding mode to a specific GT of HCV [9]. In addition, geographic distribution of GT varies. For instance, subtype 1a is common in the U.S. and in Europe, while subtype 1b is found worldwide [7,10]. The GT2 particularly accounts for 11.0% infections worldwide, which is predominantly found in African and Asian countries, each with 23.7% and 18.6% prevalence, respectively [2,7].
Targeted antiviral agents have dramatically improved the treatment of HCV patients, which includes the drug Sofosbuvir that targets the RNA-dependent RNA polymerase (RdRp) of HCV [11,12]. The RdRp constitutes the catalytic core of the HCV replicase system without any counterpart in mammals, and thus is a prime target for antiviral drug development [13][14][15]. Sofosbuvir is a prodrug which is metabolized into its active form (GS-461203), and is the only approved drug of nucleotide inhibitors which replaced interferon (IFN) for the treatment of HCV [16]. Sofosbuvir is clinically used in all GTs due to the highly conserved RdRp among all GTs and its high barrier of resistance is well known. However, its use in HCV GT2 was reported to cause resistance associated variants (RAV) in the RdRp which impaired GS-461203 activity. The prevalence of RAV in GT2 was reported to be 4.0% [17] with point mutations including T179A or M289L, in addition to S282T, which was common in GS-461203 use in all GTs [17][18][19][20][21]. The prevalence of RAV for T179A and M289L in GT2 were 4.89% and 2.72%, respectively, while the variant of S282T was 0.28% in GT1b [17]. The S282T or M289L substitutions were observed to cause~3-fold increase EC 50 s, while the T179A effect was not significant [18].
Studying drug resistance has been hindered by the unavailability of a high-resolution structure of these RdRp mutant-GS-461203 complexes. Molecular dynamics simulations (MDS) is well appreciated for its ability to provide the binding dynamics, structural and energetic properties at a molecular level within protein-ligand interactions [22]. A previous study revealed the binding and unbinding mechanism of GS-461203 and substrate UTP to HCV RdRp GT2a wild type (WT) and S282T mutation using 200 ns conventional and steering MDS [23]. Here we extended MDS to two more important mutants T179A and M289L in GT2a. More importantly, novel analogues were designed to overcome the drug resistance caused by these mutations.
Here we built the homology models using the PDB structure (PDB ID 4WTG) of Sofosbuvir diphosphate in complex with RdRp of the Hepatitis C Viral GT2a with S15G, E86Q, E87Q, C223H and V312I mutations [24], docked GS-461203 to the homology models, and performed MDS for 2 × 500 ns to evaluate the effects of the single mutations T179 or M289 on GS-461203 binding. The GS-461203 binding behaviors in both WT and mutated HCV RdRp was discussed. Based on the insights, we designed two new GS-461203 analogues and analyzed their binding stabilities to the T179A system using 500 ns MDS. Encouragingly, the two analogues had stronger affinities than GS-461203 toward T179A system, which could be good candidates for further experimental studies.

Ligand Docking
First, we performed molecular docking of the prepared GS-461203 into the RdRp receptor complex in order to evaluate the ability of the Schrodinger Maestro's Extra Precision (XP) Glide Docking methods to reproduce the experimental structure [25,26]. The docked pose of GS-461203 and crystal structure of Sofosbuvir diphosphate was similar ( Figure S1), which indicated that the dock2ing protocol was valid. Subsequently, GS-461203 was docked into T179A or M289L systems. The docked conformation of GS-461203 in WT and mutant systems are similar with little difference seen particularly in oxygen and phosphate atoms. Binding energies predicted by XP-docking were −17.673 kcal/mol, −18.549 kcal/mol, and −17.447 kcal/mol, for the WT, T179A, and M289L, respectively. Figure 1 depicts the docked conformations of GS-461203 in WT and mutants' systems.

The RMSD Values
The RMSD of protein Cα and GS-461203 of each system is depicted in Figures 2 and S2. It was shown that the protein Cα was relatively more stable in WT compared to T179A and M289L, with mean values of 2.108 Å, which was lower than those in T179A (2.631 Å) and M289L (3.109 Å). Meanwhile, GS-461203 had a slight RMSD increase until~250 ns in WT but became stable throughout the rest of the simulation, while it showed a higher RMSD movement since~30 ns in the T179A. In M289L, GS-461203 seems to stabilize after~100ns in the M289L system. During MDS, the mean RMSD of GS-461203 was 1.474 Å, 2.682 Å, and 2.083 Å, in WT, T179A, and M289L, respectively. A similar trend was observed in the individual trajectory that the mean RMSD of protein and ligand in WT system was lower than those in T179A and M289L (Table 1 and Figure 2), which indicated that the protein mutations has impacted the GS-461203 conformation to be more fluctuant.

The RMSD Values
The RMSD of protein Cα and GS-461203 of each system is depicted in Figure 2 and Figure S2. It was shown that the protein Cα was relatively more stable in WT compared to T179A and M289L, with mean values of 2.108 Å , which was lower than those in T179A (2.631 Å ) and M289L (3.109 Å ). Meanwhile, GS-461203 had a slight RMSD increase until ~250 ns in WT but became stable throughout the rest of the simulation, while it showed a higher RMSD movement since ~30 ns in the T179A. In M289L, GS-461203 seems to stabilize after ~100ns in the M289L system. During MDS, the mean RMSD of GS-461203 was 1.474 Å , 2.682 Å , and 2.083 Å , in WT, T179A, and M289L, respectively. A similar trend was observed in the individual trajectory that the mean RMSD of protein and ligand in WT system was lower than those in T179A and M289L (Table 1 and Figure 2), which indicated that the protein mutations has impacted the GS-461203 conformation to be more fluctuant.

The RMSD Values
The RMSD of protein Cα and GS-461203 of each system is depicted in Figure 2 and Figure S2. It was shown that the protein Cα was relatively more stable in WT compared to T179A and M289L, with mean values of 2.108 Å , which was lower than those in T179A (2.631 Å ) and M289L (3.109 Å ). Meanwhile, GS-461203 had a slight RMSD increase until ~250 ns in WT but became stable throughout the rest of the simulation, while it showed a higher RMSD movement since ~30 ns in the T179A. In M289L, GS-461203 seems to stabilize after ~100ns in the M289L system. During MDS, the mean RMSD of GS-461203 was 1.474 Å , 2.682 Å , and 2.083 Å , in WT, T179A, and M289L, respectively. A similar trend was observed in the individual trajectory that the mean RMSD of protein and ligand in WT system was lower than those in T179A and M289L (Table 1 and Figure 2), which indicated that the protein mutations has impacted the GS-461203 conformation to be more fluctuant.

The Protein-Ligand Interaction Analysis
To acquire a protein-ligand interaction during MDS, analysis of the SID was performed. Interaction fraction with corresponding residues and protein-ligand interactions which sustained more than 20% of MDS is depicted in Figures 3, S3 and S4. In the WT system ( Figure 3A), Hbond interactions were observed between the phosphate group of GS-461203, in addition to their interactions through water mediated Hbonds, at residues Arg158 (87%), Arg48 (57%), Arg222 (55%), and Cys223 (34%). The water mediated Hbond interactions was additionally observed between the GS-461203 uridine base with Gly283 (45%), Thr287 (36%) and between Asp225 (39%) with GS-461203 phosphate group. Manganese ions both interacted with 100% of simulation time with oxygen atoms of phosphate group aside from their interactions with Asp318, Asp319, Asp220, and Thr221, each with 100% simulation time, which indicates their role to stabilize GS-461203 binding.
In the T179A system ( Figure 3B), Hbond interactions between GS-461203 with residues Arg158, and Arg48 were observed to increase to 102% and 75% simulation time, respectively. Additionally, the Hbond with Arg222 and Cys223 and the water mediated Hbond interactions between Gly283 and with the GS-461203 uridine base, and between Arg158 and Asp225 with GS-461203 phosphate groups disappeared. Manganese ion interaction with the phosphate group of GS-461203 were also decreased to 51% as compared to WT system, while ionic interactions with Asp318, Asp319, and Asp220 were kept at 100% simulation time. It was clear that the T179A mutation had diminished effects on ligand-protein interactions.

System
Trajectory Mean RMSD of Protein Cα Atoms (Å)

The Protein-Ligand Interaction Analysis
To acquire a protein-ligand interaction during MDS, analysis of the SID was performed. Interaction fraction with corresponding residues and protein-ligand interactions which sustained more than 20% of MDS is depicted in Figures 3, S3 and S4. In the WT system ( Figure 3A), Hbond interactions were observed between the phosphate group of GS-461203, in addition to their interactions through water mediated Hbonds, at residues Arg158 (87%), Arg48 (57%), Arg222 (55%), and Cys223 (34%). The water mediated Hbond interactions was additionally observed between the GS-461203 uridine base with Gly283 (45%), Thr287 (36%) and between Asp225 (39%) with GS-461203 phosphate group. Manganese ions both interacted with 100% of simulation time with oxygen atoms of phosphate group aside from their interactions with Asp318, Asp319, Asp220, and Thr221, each with 100% simulation time, which indicates their role to stabilize GS-461203 binding.
(A)  In the T179A system ( Figure 3B), Hbond interactions between GS-461203 with residues Arg158, and Arg48 were observed to increase to 102% and 75% simulation time, respectively. Additionally, the Hbond with Arg222 and Cys223 and the water mediated Hbond interactions between Gly283 and with the GS-461203 uridine base, and between In the M289L system ( Figure 3C), Hbond interactions between GS-461203 and Arg48, and GS-461203 and Cys223 were also observed with increased percentages of 112% and 44% simulation times, respectively. Alternatively, those with Arg158 were decreased to 57% simulation time. In addition, water-mediated Hbond interactions between the GS-461203 uridine base with Gly283 and Thr287 were observed to slightly increase each with 42% and 31% simulation time, respectively. Additional Hbond interactions were observed with Phe224 (30%) and Lys155 (34%), which did not exist in WT system. The Hbond with Arg222 disappeared in the M289L system. Just as in the WT and T179A systems, Thr221 contributed to Mn ions stabilization. Manganese ions were in close proximity with residues Asp318, Asp319, and Asp220 which were observed to stabilize the phosphate group of GS-461203. It is clearly seen that the binding modes of GS-461203 has changed in the two mutant systems.

Cluster Analysis: The Effect of Mutation on Whole Protein Structure
To acquire the dominant conformations of the protein-ligand complex, we performed clustering analysis on the four systems. The protein alignment between T179A and WT systems from the most populated cluster showed that the T179A mutant has the RMSD and TM-score values of 2.76 Å and 0.90268, respectively, while those between M289L and WT systems resulted in the RMSD and TM-score values of 3.95 Å and 0.82731, respectively. It is clear that the M289L movement was larger than that of T179A, each with reference to WT. Figure 4 displays a comparison of protein structures between WT with each T179A and M289L taken from most populated clusters.
. Pharm. 2022, 90, x FOR PEER REVIEW 6 of 20 42% and 31% simulation time, respectively. Additional Hbond interactions were observed with Phe224 (30%) and Lys155 (34%), which did not exist in WT system. The Hbond with Arg222 disappeared in the M289L system. Just as in the WT and T179A systems, Thr221 contributed to Mn ions stabilization. Manganese ions were in close proximity with residues Asp318, Asp319, and Asp220 which were observed to stabilize the phosphate group of GS-461203. It is clearly seen that the binding modes of GS-461203 has changed in the two mutant systems.

Cluster Analysis: The Effect of Mutation on Whole Protein Structure
To acquire the dominant conformations of the protein-ligand complex, we performed clustering analysis on the four systems. The protein alignment between T179A and WT systems from the most populated cluster showed that the T179A mutant has the RMSD and TM-score values of 2.76 Å and 0.90268, respectively, while those between M289L and WT systems resulted in the RMSD and TM-score values of 3.95 Å and 0.82731, respectively. It is clear that the M289L movement was larger than that of T179A, each with reference to WT. Figure 4 displays a comparison of protein structures between WT with each T179A and M289L taken from most populated clusters. Further, we analyzed the ligand binding sites from best cluster of each system ( Figure  S5) and those from the least populated clusters ( Figure S6). We found out that the GS-461203 maintained its interactions with key residues in WT, T179A, and M289L system. The residues D318, D319, D220 was in close distance with both Manganese ions which further interacted with phosphate group of GS-461203 as also found in experimental [24]. The residues R48 and R158 was also in close distance to the phosphate groups of GS-461203. However, the interactions of fluorine atom of GS-461203 with N291 was more intense in M289L system as indicated by the distance of 2.73 Å , which was much lower than those in WT (5.36 Å ) and T179A (5.95 Å ). In addition, the phosphate group of GS-461203 in M289L system was involved in additional interactions with K155 and F224 with distances of 2.67 Å and 3.76 Å , respectively, which was absent in WT and T179A system. Figure 5 shows the active site conformation of WT, T179A, and M289L systems taken from most populated cluster. Further, we analyzed the ligand binding sites from best cluster of each system ( Figure S5) and those from the least populated clusters ( Figure S6). We found out that the GS-461203 maintained its interactions with key residues in WT, T179A, and M289L system. The residues D318, D319, D220 was in close distance with both Manganese ions which further interacted with phosphate group of GS-461203 as also found in experimental [24]. The residues R48 and R158 was also in close distance to the phosphate groups of GS-461203. However, the interactions of fluorine atom of GS-461203 with N291 was more intense in M289L system as indicated by the distance of 2.73 Å, which was much lower than those in WT (5.36 Å) and T179A (5.95 Å). In addition, the phosphate group of GS-461203 in M289L system was involved in additional interactions with K155 and F224 with distances of 2.67 Å and 3.76 Å, respectively, which was absent in WT and T179A system.

MM-GBSA Binding Energy
The MM-GBSA binding energy was calculated to assess the effectiveness of the mutation on the binding energy of GS-461203. The total binding energy for the WT system was −45.5 kcal/mol, which was lower than that in the T179A (−35.0 kcal/mol) system but was higher than that in the M289L system (−54.0 kcal/mol). The binding energy changes between the WT system and those in the T179A and M289L systems were 10.5 kcal/mol and 8.5 kcal/mol, respectively. Clearly, among the two mutations, T179A mutation resulted in the lower affinity of GS-461203, but a mutation in the M289L system resulted in a slightly stronger affinity of GS-461203. Table 2 shows the binding energy calculated for last 50 ns of each system. It was observed that the electrostatic energy (∆E ele ) is the dominant factor when considering favorable binding contributions, each with −35.3 kcal/mol, −28.7 kcal/mol, and −55.1 kcal/mol in WT, T179A, and M289L, respectively. The differences of electrostatic interactions (∆∆E ele ) were 6.6 kcal/mol and 19.8 kcal/mol in T179A and M289L systems, each compared to the WT system. The electrostatic energies were more positive in the T179A system, but it became more favorable in M289L compared to WT system. The possible explanations are the disappearance of Hbonds with Arg222 and Cys223, and water-mediated Hbond interactions with Gly283, Thr28, Arg158, and Asp225 in the T179A system ( Figure 3B). The increased percentage of the electrostatic interactions with Arg48, Cys223, Gly282, Thr287 and the newly formed Hbond interactions with Phe224 and Lys155 in the M289L system ( Figure 3C) seem to induce more favorable electrostatic interactions compared to the WT. We also calculated the binding energy for S282T system, and we found out the total binding energy for the S282T system was −41.3±5.4 kcal/mol (Table S1). The binding free energy difference between S282T and WT in our mmgbsa calculation is 4.2 kcal/mol, which is closer to the experimental value (3-fold difference in EC 50 , ∆∆G = 0.6 kcal/mol) than that from the previous study (10.8 kcal/mol) [23].
Meanwhile, the van der Waals interactions were more favorable in the WT system (−6.0 kcal/mol) than in T179A (−2.3 kcal/mol) and M289L (4.5 kcal/mol). Clearly, the mutations have significantly reduced the van der Waals interactions, particularly in the M289L system with 10.5 kcal/mol change compared to the WT system, respectively. This may be due to the reduced van der Waals interactions with Ile160 in M289L system (Figure 3), which existed in the WT system. Meanwhile, the lipophilic interaction (∆E lipo ) did not change significantly after introducing mutations.

Protein RMSF Analyses
The RMSF of protein Cα atoms were compared between protein WT and the mutants. Each prominent RMSF value was assigned with black, green, and blue arrows in WT, T179A, and M289L systems, respectively (Figures 7 and S7). The location of point mutation in each system was assigned with orange asterisks. The highest peaks are residues K151 of WT in both T179A and M289L systems excluding the carboxyl end of the protein. In M289L, the highest peak was at P22, while the highest peak of T179A was at A97, without considering the fluctuation of the carboxyl end. Concisely, mutation in T179A and M289L has the largest impact on residues A97, K151, and P22, respectively. In M289L, the highest peak was at P22, while the highest peak of T179A was at A97, without considering the fluctuation of the carboxyl end. Concisely, mutation in T179A and M289L has the largest impact on residues A97, K151, and P22, respectively.

Ligand RMSF Values
The RMSF values of the ligand atoms were recorded as depicted in Figures 8 and S8. The RMSF of GS-461203 in WT system was lower with mean RMSF values (1.093 Å ) compared to those in T179A (1.555 Å ) and M289L (1.505 Å ) systems, which indicated that GS-461203 was more stable in the WT system compared to in T179A and M289L systems. It is shown that the most fluctuant atoms were those in the the phosphate group (atom numbers 28-30).

The GS-461203 Dihedral Angle Profiles
The dihedral angle of GS-461203 was depicted in Figure 9. It was shown that the rotatable bonds, particularly those colored in light blue, light green, yellow, light purple, red, and orange show a wider distribution in T179A and M289L systems, each as compared to the WT system. The wider dihedral angle distributions were also observed in pink for the T179A and M289L systems. Likewise, the green in T179A system, while the purple one shows similar distribution in both WT and mutant systems. In brief, the mutant systems induced a wider distribution of dihedral angles within GS-461203.

Ligand RMSF Values
The RMSF values of the ligand atoms were recorded as depicted in Figures 8 and S8. The RMSF of GS-461203 in WT system was lower with mean RMSF values (1.093 Å) compared to those in T179A (1.555 Å) and M289L (1.505 Å) systems, which indicated that GS-461203 was more stable in the WT system compared to in T179A and M289L systems. It is shown that the most fluctuant atoms were those in the the phosphate group (atom numbers 28-30).
In M289L, the highest peak was at P22, while the highest peak of T179A was at out considering the fluctuation of the carboxyl end. Concisely, mutation in M289L has the largest impact on residues A97, K151, and P22, respectively.

Ligand RMSF Values
The RMSF values of the ligand atoms were recorded as depicted in Figur The RMSF of GS-461203 in WT system was lower with mean RMSF values (1.0 pared to those in T179A (1.555 Å ) and M289L (1.505 Å ) systems, which indicat 461203 was more stable in the WT system compared to in T179A and M289L sy shown that the most fluctuant atoms were those in the the phosphate group ( bers 28-30).

The GS-461203 Dihedral Angle Profiles
The dihedral angle of GS-461203 was depicted in Figure 9. It was shown tatable bonds, particularly those colored in light blue, light green, yellow, li red, and orange show a wider distribution in T179A and M289L systems, ea pared to the WT system. The wider dihedral angle distributions were also o pink for the T179A and M289L systems. Likewise, the green in T179A system purple one shows similar distribution in both WT and mutant systems. In bri tant systems induced a wider distribution of dihedral angles within GS-46120

The GS-461203 Dihedral Angle Profiles
The dihedral angle of GS-461203 was depicted in Figure 9. It was shown that the rotatable bonds, particularly those colored in light blue, light green, yellow, light purple, red, and orange show a wider distribution in T179A and M289L systems, each as compared to the WT system. The wider dihedral angle distributions were also observed in pink for the T179A and M289L systems. Likewise, the green in T179A system, while the purple one shows similar distribution in both WT and mutant systems. In brief, the mutant systems induced a wider distribution of dihedral angles within GS-461203.

The Secondary Structure of Protein
The SSE of protein monitored during MDS were plotted in Figure 10, which displays the distribution of SSE including Alpha-helices and Beta-strands by residue. The total percentage of SSE for WT, T179A, and M289L were 47.40%, 46.09%, and 46.12%, respectively.

The Secondary Structure of Protein
The SSE of protein monitored during MDS were plotted in Figure 10, which displays the distribution of SSE including Alpha-helices and Beta-strands by residue. The total percentage of SSE for WT, T179A, and M289L were 47.40%, 46.09%, and 46.12%, respectively.
The Alpha-helices composed 37.68% in WT system, which was comparable with 36.25% and 36.41% in T179A and M289L system, respectively, which indicated that the mutation had minor effects on SSE of protein. The Alpha-helices composed 37.68% in WT system, which was comparable with 36.25% and 36.41% in T179A and M289L system, respectively, which indicated that the mutation had minor effects on SSE of protein.
WT T179A M289L Figure 10. Protein SSE for the WT, T179A, and M289L systems during MDS. The alpha helices, beta sheets, and random coil were represented by red, blue, and white spaces.

Designing New Analogues
To improve GS-461203′s binding to the T179A mutation, we designed new GS-461203 analogues by substituting the fluorine atom on the 2′ position for either chloride or iodide atoms. The new structures of GS-461203 analogues are depicted in Figure 11.

Designing New Analogues
To improve GS-461203 s binding to the T179A mutation, we designed new GS-461203 analogues by substituting the fluorine atom on the 2 position for either chloride or iodide atoms. The new structures of GS-461203 analogues are depicted in Figure 11. The Alpha-helices composed 37.68% in WT system, which was comparable with 36.25% and 36.41% in T179A and M289L system, respectively, which indicated that the mutation had minor effects on SSE of protein.
WT T179A Figure 10. Protein SSE for the WT, T179A, and M289L systems during MDS. The alpha helices, beta sheets, and random coil were represented by red, blue, and white spaces.

Designing New Analogues
To improve GS-461203′s binding to the T179A mutation, we designed new GS-461203 analogues by substituting the fluorine atom on the 2′ position for either chloride or iodide atoms. The new structures of GS-461203 analogues are depicted in Figure 11. Figure 11. The designed GS−461203 analogues. Figure 11. The designed GS−461203 analogues.

CompA CompB
The two new analogues were submitted for MDS each for a duration of 500 ns. The RMSD values of Cα protein are depicted in Figure S9. It was seen that the protein Cα RMSD in T179A-CompA (2.324 Å) and T179A-CompB (2.199 Å) were lower than that of the T179A system (2.631 Å). Meanwhile, the ligand RMSD of GS-461203 in T179A was 2.682 Å, which was higher than those in T179A-CompA (1.869 Å) and T179A-CompB (2.173 Å).
In the T179A-CompA and T179A-CompB systems ( Figure S10), Hbond interactions between GS-461203 and Arg158, and GS-461203 and Arg48 were also observed with high percentages of simulation time. The interaction between Manganese ions and the phosphate group of GS-461203 was increased to 100% simulation time, compared to the T179A system. In addition, Mn ions interactions with Asp318, Asp319, and Asp220 were kept at 100% simulation time. Moreover, the water-mediated Hbond interactions with Gly283 and Thr287 were observed with each 61% simulation in T179A-CompA system, which was absent in T179A system; while in T179A-CompB system, interactions with Thr287 were observed with a 58% simulation time. The newly formed Hbond with Thr221, Cys223, Asp225, and Phe224 were observed with high percentage of simulation time in T179A-CompA system while in the T179A-CompB system, interactions with Thr221, Asn291, Arg280, and Phe224 were recorded high percentages of simulation time.
The MM-GBSA binding energies are shown in Table 3. The total binding energy for the T179A system (−35.0 kcal/mol) was much higher than T179A-CompA (−73.2 kcal/mol) and T179A-CompB (−74.0 kcal/mol). The binding energy differences between T179A system and those in T179A-CompA and T179A-CompB systems were 38.2 kcal/mol and 39 kcal/mol, respectively. Clearly, the newly designed analogues have stronger affinities than that of GS-461203 in the T179A system. It is clear that the electrostatic energy (∆E ele ) is the dominant factor in favorable binding contributions, each with −28.7 kcal/mol, −53.0 kcal/mol, and −53.6 kcal/mol in WT, T179A-CompA, and T179A-CompB, respectively. The differences of electrostatic interactions (∆∆E ele ) were 24.3 kcal/mol and 24.9 kcal/mol in the T179A-CompA and T179A-CompB systems, each compared to the T179A system. The electrostatic energies have largely increased in T179A-CompA and T179A-CompB systems. The possible explanation of this is that the additional Hbond and Hbond water mediated interactions with Gly283 and Thr287, Thr221, Cys223, Asp225, and Phe224 in the T179A-CompA system, and interaction with Thr287, Thr221, Asn291, Arg280, and Phe224 in the T179A-CompB system.
Meanwhile, the van der Waals interactions were more favorable in the T179A-CompA and T179A-CompB systems (−14.4 kcal/mol and −13.3 kcal/mol, respectively) than in the T179A system (−2.3 kcal/mol). The newly designed analogues have significantly increased van der Waals interaction with the protein, with 12.1 kcal/mol and 11.0 kcal/mol differences, respectively, as compared to the T179A system ( Figure S11). This was also seen for lipophilic interactions (∆E lipo ) which were seen to be improved in the new analogues.
The RMSF of protein Cα atoms was compared between the T179A system and the T179A-CompA and T179A-CompB systems ( Figure S12), which showed similar modes of fluctuations. The RMSF values of ligand atoms were recorded as depicted in Figure S13. The RMSF of GS-461203 in T179A-CompA (1.022 Å) and T179A-CompB (1.036 Å) systems were lower than that in T179A system (1.555 Å), which indicate that the newly designed analogues induce more stable conformation of T179A.

Prediction ADME Properties
The ADME properties of compounds as predicted by SwissADME web server are displayed in Table 4 and Figures S15-S20. As the GS-461203 is a prodrug of sofosbuvir, the ADME properties were predicted for Sofosbuvir and its derivatives in which fluorine atom was substituted by chlorine or iodine atoms. The fluorine substitution by chlorine or iodine resulted in Log S (ESOL) values of −3.53 and −4.44 which grouped as soluble and moderately soluble, respectively, which were comparable with Sofosbuvir Log S (ESOL) value of −3.27 (soluble). The other ADME properties of Sofosbuvir derivatives, including GI absorption, BBB permeant, and cytochrome inhibition, are all the same as Sofosbuvir. Further, the fluorine substitution to chlorine or iodine atoms in CompA and CompB, respectively, affected their solubilities. The Log S (ESOL) values for GS-461203, CompA, and CompB were 0.57, 0.31, and −0.61, respectively. Both GS-461203 and CompA were grouped as highly soluble, while CompB was categorized as very soluble, as predicted by the ESOL model [27]. It is clear that the ADME properties of designed compounds did not change significantly as compared to Sofosbuvir. reduced electrostatic interactions. In the M289L system, the mutation induced much more positive van der Waals interactions, however, the effect was compensated by much more negative electrostatic interactions, which resulted in a slight increase in GS-461203 affinity. In the case of M289L, which is located far enough from the active site of RdRp, the mutation effect slightly increased the GS-461203 affinity which was mostly due to increased electrostatic interaction.
In this study, the two newly designed halogen analogues of GS-461203 showed much stronger affinities than GS-461203 alone, which was due to more intense protein-ligand interactions as compared to the GS-461203. The previous systematic studies [34,35] with a detailed database survey and quantum chemistry calculation have suggested that halogen bonds could play roles not only in drug-improving target binding affinity but also in tuning ADME/T property. Our binding and ADME prediction data are consistent with these studies. In particular, the binding free energy predictions of each analogue in complex with T179A mutant showed a significant increase in to the ∆G bind due to major contribution from the electrstatic interactions (∆E ele ). This can be explained by the newly formed Hbond and Hbond water mediated interactions with Gly283 and Thr287, Thr221, Cys223, Asp225, and Phe224 in the T179A-CompA system, and interaction with Thr287, Thr221, Asn291, Arg280, and Phe224 in the T179A-CompB system. The van der Waals interactions (∆E vdw ) were also more favorable in the analogues. This can possibly be explained by the increase in atomic/orbital size around the halogenated site seen with chlorine and iodine. This helps with shifting electron density towards the sugar ring in addition to the surrounding protein residues which can afford a higher opportunity for instantaneously induced dipole interactions. Additionally, the inclusion of more electron-rich atoms at the 2 position can have a large effect on the overall ligand geometry and surface area for opportunity of residue contact. This is seen with the substitution of a fluorine atom for chlorine or iodine in CompA or CompB, respectively, which further stabilized the sugar ring and CH 3 (atom 7) seen in the ligand RMSF plots in Figure S13. However, the improvement in stability and affinity of these analogues towards T179A comes with a predicted decrease in water solubility particularly with CompB.

Materials and Methods
The PDB structure of Sofosbuvir diphosphate in complex with RNA-dependent RNA polymerase of the HCV GT2a was downloaded from RCSB Protein Data Bank with PDB ID 4WTG. As the structure contained a series of mutated residues (S15G, E86Q, E87Q, C223H and V312I), all the mutations were converted back to the WT sequence by homology modeling. The structure was then prepared using the Maestro's Protein Preparation Wizard [36] with default parameters, while structures of T179A or M289L were each then prepared by introducing the point mutation to the WT system. We also prepared a system for S282T mutants as this substitution is common in all GTs.
Structure of Sofosbuvir diphosphate was downloaded from the RCSB Protein Data Bank via PDB ID 4WTG [24]. GS-461203 was prepared using Maestro including the addition of one phosphate group and generating ionization states at pH = 7 by using Epik's pKa calculations [36]. Ligands were then relaxed through optimization and minimization. GS-461203 was then docked with XP precision into the WT, T179A, S282T, and M289L receptor protein structures.

Molecular Dynamics Simulation
Four separate systems of the protein structure in complex with GS-461203 including WT, T179A, M289L, or S282T were prepared for two independent MDS runs (500 ns of each). Systems were solvated in a simple point-charge (SPC) [37] orthorhombic water box with a 10 Å water buffer and the addition of Na+ and Cl− ions in order to reach 0.15 M NaCl. The OPLS_2005 force field [38] was used to model the protein using Desmond System Builder with Maestro's 2019-2 update on Linux operating system.
The MDS were performed with Schrodinger Maestro's Desmond simulation package [39,40]. Relaxation and energy minimization were performed using the default protocol to reduce possible steric stress as explained in our previous paper [41].

Analysis of MD Simulation
Analysis of MDS trajectory was performed using the Desmond simulation interaction diagram (SID) analysis tool which reports protein-ligand root mean squared deviation (RMSD), protein-ligand contacts, protein root-mean squared fluctuation (RMSF), changes in secondary structure elements (SSE) during the simulation, and ligand torsion profiles. The protein and ligand RMSD plots were analyzed to ensure the convergence of the MDS. In addition, cluster analysis was performed using the Desmond trajectory clustering analysis tool [42]. The backbone RMSD matrix is used as the basis of structural similarity and the clustering with average linkage was cut off at 2.5 Å. The centroid structure of the protein-ligand complex was used to represent each structural family. Structural families with frames > 1% of the total frames were considered separate structural families with separate centroid structures. To evaluate the effect of T179A and M289L mutations on whole protein structure, TM-align web-server was used, in which TM-score 0.0 to 0.3 and RMSD ≥ 5 Å were employed to assess the structural similarity between wild type and mutant systems [43].

Binding Energy Calculations
Binding energy calculation was performed using Molecular Mechanism-Generalized Born Surface Area (MM-GBSA) method employing the MM-GBSA Prime module of Schrodinger Maestro's Desmond simulation package. The validity of MM-GBSA method has been previously reported [41]. The snapshots of the last 50 ns of each three trajectories of the simulation were used for the calculation. The OPLS3e force field, a VSGB 2.0 solvation model, and the default Prime protocol was used [44]. The MM-GBSA_Prime protocol includes receptor, ligand, and receptor-ligand complex minimizations. The final binding energy was calculated using equation: ∆E bind = E complex − (E ligand + E receptor ), in which the binding energy was broken down into three components: E electrostatic , E vdW , and E lipophilic . E electrostatic was calculated by summing E H-bond and E coulumbic . E vdW summated E vdW , E pi-pi stacking , and E self-contact .

ADME Prediction
Prediction of ADME properties for compounds was performed using SwissADME web server [45].

Conclusions
This study was performed with the goal of understanding the molecular details of GS-461203 binding to the HCV RdRp GT2a in WT and mutant systems using homology modeling, molecular docking and MDS. In this study, the key residues interacting with GS-461203 include Arg48, Arg158, Asp318, Asp319, and Asp220, which were shown to be in line with the experimental results. It is shown that the binding modes of GS-461203 have changed in mutant systems and resulted in reduced affinities in the T179A system. However, it induced a stronger affinity in the M289L system, which needs further detailed studies to gain a deeper understanding of M289L system. In the case of T179A, the low affinity was due to weaker electrostatic interaction. Further, we designed two new analogues of GS-461203 and found out that the two compounds induced more stable interactions towards T179A than GS-461203. This was also demonstrated by the MM-GBSA prediction as much more favorable binding energies were obtained, which was due to the improved electrostatic, van der Waals, and lipophilic contributions, however further biological assays are needed to validate the potentials of the GS-461203 analogues.