Computer-Based De Novo Designs of Tripeptides as Novel Neuraminidase Inhibitors

The latest influenza A (H1N1) pandemic attracted worldwide attention and called for the urgent development of novel antiviral drugs. Here, seven tripeptides are designed and explored as neuraminidase (NA) inhibitors on the structural basis of known inhibitors. Their interactions with NA are studied and compared with each other, using flexible docking and molecular dynamics simulations. The various composed tripeptides have respective binding specificities and their interaction energies with NA decrease in the order of FRI > FRV > FRT > FHV > FRS > FRG > YRV (letters corresponding to amino acid code). The Arg and Phe portions of the tripeptides play important roles during the binding process: Arg has strong electrostatic interactions with the key residues Asp151, Glu119, Glu227 and Glu277, whereas Phe fits well in the hydrophobic cave within the NA active site. Owing to the introduction of hydrophobic property, the interaction energies of FRV and FRI are larger; in particular, FRI demonstrates the best binding quality and shows potential as a lead compound. In addition, the influence of the chemical states of the terminal amino acids are clarified: it is revealed that the charged states of the N-terminus (NH3+) and C-terminus (COO−) are crucial for the tripeptide inhibitory activities and longer peptides may not be appropriate. In addition, the medium inhibiting activity by acetylation of the N-terminus indicates the possible chemical modifications of FRI. Experimental efforts are expected in order to actualize the tripeptides as potent NA inhibitors in the near future.

FRG, FRV, FHV, YRV, FRT, FRS and FRI were designed as NA inhibitors ( Figure S1). The seven tripeptides, optimized with density functional methods, were docked into the NA active site, respectively, and their interaction mechanisms were then studied by explicitly solvated molecular dynamics (MD) simulations. It was found that FRI has the largest interaction energy and matches satisfactorily in the NA active site, which throws new light on the de novo designs of NA inhibitors. Around physiological pH values, the N-terminus and C-terminus of FRI are charged, in the forms of NH 3 + and COO − , respectively. How the chemical states of the termini influence the inhibitor binding at the NA active site is also an interesting topic that requires attention. In addition, the deprotonation and acetylation at the N-terminus (-NH 2 and -NHCOCH 3 ), as well as the amidation at the C-terminus (-CON(CH 3 ) 2 ) were considered, with their structures optimized at the same level of theory. The interaction mechanisms of the three structures with NA were also studied by explicitly solvated flexible docking and MD simulations. The present results can guide synthetic and medicinal chemists to discover potent peptide-based antiviral drugs.

Results and Discussion
As the backbone-atom root-mean-square deviations (RMSD) in Figure 1 indicate, all the tripeptide-NA complexes that have been energy-minimized remain stable throughout the 1.0 ns MD simulations, consistent with the previous MD results of other NA inhibitors [37][38][39][40][41]. Accordingly, the geometric and energetic analyses are made on the average structures of 500~1000 ps MD trajectories, where the docked complexes are already at equilibrium. The superposed structures in Figure 2 show that the seven tripeptides are in close space at the NA active site, in terms of favorable interaction energies and geometrical matching qualities. It means that these tripeptides occupy the identical binding pocket of the NA protein. However, their binding poses differ somewhat from each other, which will be discussed in the following sections.  The tripeptides superposed at the NA active site. The Connolly surfaces of the NA active-site (in grey) are created using the InsightII 2005 scripts. The tripeptides are represented by stick models. Ribbon colors: Helices (including α-, 3 10 -and π-helix), hydrogen-bonded turns, extended strands and random coils are in red, blue, yellow and green, respectively.

The Tripeptides FRG and FRV as the NA Inhibitors
On basis of the structures of known anti-influenza virus drugs such as oseltamivir carboxylate and BA [2,14,37,38] and the electrostatic, steric and lipophilic characteristics of NA active sites [1,2,13,23,37,38,42,43], tripeptide FRG was first designed as NA inhibitor. That is, the simplest amino acid Gly is at its C-terminus, which is expected to orient towards the Arg triad (Arg118, Arg292 and Arg371). The guanidino group of FRG is assumed to direct to the acidic sub-site consisting of residues Glu119, Asp151 and Glu227. In order to fit the hydrophobic cave of the NA active site, the hydrophobic interactions are introduced at the N-terminus using Phe.
The interaction energy (E inter ) of FRG with NA is calculated at −249.83 kcal mol −1 , where the electrostatic rather than vdW interactions are found to play a dominant role ( Figure 3). As Figure 4a shows, the carboxyl group of FRG has three H-bonds with the positively charged guanidino group of residue Arg152. The guanidino group of FRG forms ionic interactions with the negatively charged carboxyl groups of residues Glu119, Asp151 and Glu227, with one H-bond formed with each residue. The electrostatic contributions (E ele ) of residues Glu119, Asp151, Arg152 and Glu227 amount to −92.93, −132.39, −37.21 and −63.39 kcal mol −1 , respectively (Table S1). Nonetheless, the benzene group of FRG somewhat flips out from the NA active site. The lack of sidechains in Gly (H atoms) does not match with the hydrophobic portion of the NA active site. Accordingly, the C-terminus of FRG (Gly) is improved by the G→V mutation.  Compared with Gly, the sidechain of Val is more suited to the NA active site ( Figure 4). The interaction energy (E inter ) between FRV and NA is equal to −289.88 kcal mol −1 . The FRV has a value 40 kcal mol −1 larger than FRG ( Figure 3). Similar to FRG, the electrostatic rather than vdW interactions dominate during the binding process. The FRV carboxyl group orients towards the Arg triad (Arg118, Arg292 and Arg371) of the NA active site (Figure 4b). The guanidino group of FRV shows strong electrostatic effects with residues Glu119, Asp151, Glu227, Glu276 and Glu277, with the intergrowth of one H-bond to each of residues Glu276 and Glu277. The corresponding electrostatic contributions (E ele ) are calculated at −73.28, −125.36, −68.08, −103.29 and −101.73 kcal mol −1 , respectively (Table S1). In addition, the amino group at the N-terminus of FRV forms one H-bond with residue Asp151. The benzene group of FRV conduces to the hydrophobic contacts with the NA active site. Accordingly, the FRV poses in the NA active site, with a similar manner as the cases of current NA drugs (oseltamivir, zanamivir and peramivir) [2,5,14,38,42,44]. Compared with FRG, more preference of FRV has been observed. The Arg and/or Phe portions of the tripeptides play crucial roles during the binding process, which will be clarified in the following discussions.

The Roles of the Arg and Phe Portions in the Tripeptides
In order to clarify the roles of the Arg and Phe portions in the tripeptide FRV, another two tripeptides FHV and YRV were designed as NA inhibitors. The interaction energy (E inter ) of FHV with NA is calculated to be −254.00 kcal mol −1 (Figure 3). As a result of the disappearance of the Arg portion (R→H mutation), the interaction energy (E inter ) of FHV is about 36 kcal mol −1 lower than that of FRV. The His portion of FHV does not form any H-bond with the active-site residues and rather moves out of the active-site pocket (Figure 5a). In the meantime, the charge transfers of FHV with residues Glu119, Asp151, Glu227, Glu276 and Glu277 decrease, in contrast to the case of FRV, especially residues Glu276 and Glu277 (Tables S1 and S2). Accordingly, the Arg portion is crucial to FRV and responsible for the lower interaction energy of FHV.
For YRV, the interaction energy (E inter ) with NA equals −224.51 kcal mol −1 and is less than either of the tripeptides FRV and FHV ( Figure 3). As shown in Figure 5b, the carboxyl group of YRV deviates from the Arg triad (Arg118, Arg292 and Arg371), which is quite different from the situations of FRV and FHV. Compared with FRV, the electrostatic interactions (E ele ) of the YRV guanidino group with residues Glu276 and Glu277 remarkably reduce to −52.17 and −69.65 kcal mol −1 , respectively (Table S2). Accordingly, the increase of polarity due to the F→Y mutation (FRV to YRV) is unfavorable for the binding process. To summarize, the Arg and Phe portions of the tripeptides, especially the latter, are crucial to the binding process and should be reserved in the design of tripeptide inhibitors.

The Improvement of FRV-Based NA Inhibitors
Among the four tripeptides discussed in Sections 2.1 and 2.2, FRV has the largest interaction energy with the NA protein and its binding pose is similar to those of current NA drugs [2,5,14,38,42,44]. In addition, the Arg and Phe portions of FRV play important roles during the binding process. Accordingly, the improvement of tripeptide-based NA inhibitors was next centered on the mutations of the C-terminus amino acid Val. In this way, tripeptides FRT, FRS and FRI were designed as NA inhibitors. Their binding at the NA active site is shown in Figure 6.
Owing to the addition of hydrophilic property by the V→T mutation, the interaction energy (E inter ) of FRT with NA drops to −255.45 kcal mol −1 ( Figure 3). As Figures 4b and 6a show, the maximal binding differences between FRT-NA and FRV-NA are in that the FRT guanidino group orients towards Asn346 with two H-bonds formed. In addition, the FRT benzene group is somewhat out of the NA active site. Similar to FRV, the charge transfer interactions are observed between FRT and residues Glu119, Asp151, Glu227, Glu276 and Glu277 (Table S3). However, their electrostatic contributions (E ele ) decrease remarkably. Accordingly, FRV instead of FRT suits the NA active site better.
The tripeptide FRS is more hydrophilic than FRV (V→S mutation) and meanwhile its spatial size is less than that of FRT. The interaction energy (E inter ) of FRS with NA is calculated at −250.04 kcal mol −1 , somewhat less than those of FRV and FRT ( Figure 3). The binding pose of FRS at the NA active site is shown in Figure 6b. The FRS carboxyl group is stabilized by residue Arg371, with the formation of three H-bonds. The FRS guanidino group forms two H-bonds with each of residues Asp151 and Glu227. Compared with FRV, the tripeptide FRS has less electrostatic interactions with residues Glu119, Asp151, Glu227, Glu276 and Glu277. Especially the electrostatic contribution (E ele ) of residue Glu276 is merely equal to −45.66 kcal mol −1 (Table S3). More importantly, the orientation of the FRS benzene group deviates somewhat from the NA active site. It indicates that FRS does not better suit the NA active site than FRV. Owing to the increase in spatial size and hydrophobic property introduced by the V→I mutation, FRI moves closer to the NA active-site pocket (Figure 6c). The interaction energy (E inter ) of FRI equals −291.56 kcal mol −1 and is larger than any of the above six tripeptides (Figure 3). Similarly, the electrostatic rather vdW interactions play a dominant role during the binding process. The FRI carboxyl group forms ionic interactions with residues Arg292, Arg371 and Lys432, with two, three and one vigorous H-bonds, respectively (Figure 6c). The electrostatic energies (E ele ) from residues Arg371 and Lys432 amount to −43.88 and −29.62 kcal mol −1 , respectively (Table S3), while these two values are very slight in the case of FRV. The FRI guanidino group shows strong interactions with residues Glu119 and Glu277, with one and two H-bonds formed, respectively. The FRI benzene group fits perfectly with the hydrophobic cave of the NA active site. FRI has complementary properties against the geometrical and biophysical environment of the NA active site, which can also be observed in the cases of current potent NA drugs; e.g., oseltamivir, zanamivir and peramivir [2,5,14,38,42,44].

The Chemical States of the Termini in Tripeptide FRI
For all the seven tripeptides, the electrostatic interactions play a dominant role during their binding processes (Figure 3), which is consistent with the current NA drugs [2,5,14,38,42,44]. The carboxyl groups of the seven tripeptides, except for FRG and YRV, have strong electrostatic interactions with the Arg triad (Arg118, Arg292 and Arg371) and should be fully considered in rational drug designs. The residues Asp151, Glu119, Glu227 and Glu277 of the NA protein contribute greatly in all the seven cases, see the data in Tables S1-S4. In fact, these four residues of the NA protein have already received enough attention from rational drug designs [2,14,42]. The catalytic residue Asp151 is crucial to NA function and the three glutamic acid residues (Glu119, Glu227 and Glu277) are important to stabilize the NA active sites [1,45]. With the low toxicity and viral resistance, the tripeptides throw new light on the rational designs of novel anti-influenza drugs [23]. In particular, the interaction energy (E inter ) of FRI with NA is equal to −291.56 kcal mol −1 and the largest among the seven tripeptides, which is also much larger than that of 4-(N-acetylamino)-5-guanidino-3-(3-pentyloxy) benzoic acid (BA, −160.64 kcal mol −1 ) [37]. Accordingly, the tripeptide FRI shows great potential as an ideal lead compound.
As mentioned in Section 1, the N-terminus of FRI is in the NH 3 + form around physiological pH values (Figure 4a). Besides, the deprotonated (-NH 2 ) and acetylated (-NHCOCH 3 ) forms are also considered and designated as FRI dep and FRI Ac , respectively. The deprotonation causes the FRI dep benzene group to move out of the NA active-site pocket (Figure 7a), with the interaction energy being obviously reduced (Figure 3). The interaction energy (E inter ) between FRI dep and NA is summed to −138.22 kcal mol −1 ; less than half of the normal state (NH 3 + ). The deprotonation also decreases the electrostatic interactions with the NA active-site residues; e.g., residue Glu277, whose electrostatic contribution (E ele ) drops sharply from −108.43 to −19.89 kcal mol −1 (Table S4). In addition, the H-bonds are merely four: two between the FRI dep carboxyl group and residue Arg371 and two between the FRI dep guanidino group and residue Glu119 (Figure 7a). Accordingly, the depronation of the N-terminus NH 3 + group is not favored for the inhibiting activities of the tripeptides.
When the N-terminus of FRI is protected by acetylization (-NHCOCH 3 ), the interaction energy (E inter ) with the NA protein is calculated to be −267.32 kcal mol −1 . It indicates that the acetylization also decreases the inhibiting activity but more slightly than the deprotonation (Figure 3). Compared with the normal state (NH 3 + ), the acetylization causes the carboxyl group to move towards residue Arg371, with the simultaneous formation of one strong H-bond (Figure 7b). The electrostatic energy (E ele ) of FRI Ac with residue Arg371 increases to −62.66 kcal mol −1 (Table S4). However, the ionic interactions with residue Lys432 do not exist anymore. At the same time, the FRI Ac guanidino group has less polar contacts with residues Glu227 and Glu277, with one H-bond formed with each of them. Accordingly, FRI rather than FRI Ac and FRI dep matches satisfactorily with the NA active site. The decrease of interaction energies by the acetylization is in good agreement with experiment data that long peptide chains may not form compact binding complexes with the NA receptors [22,23,25]. Nonetheless, FRI Ac still displays reasonable space orientation at the NA active site and medium inhibiting activity, probably due to that the lost electrostatic binding has been compensated by the hydrophobic effects between its acetyl-CH 3   Similar to the N-terminus, the C-capped FRI is designed as FRI DMA , in order to explore the roles of the C-terminus chemical states during the binding processes. As shown in Figure 3, the C-terminus capping (-CON(CH 3 ) 2 ) causes the interaction energy (E inter ) of FRI DMA with NA to reduce to −153.29 kcal mol −1 , about half of the normal state (COO − ). Compared with FRI, the Phe and Ile portions of FRI DMA nearly move out of the active-site pocket. Only the guanidino group of FRI DMA forms two H-bonds with Asp151 and one H-bond with Glu277 ( Figure 7c). Moreover, the ionic interactions with residues Arg371 and Lys432 are dissolved as well (Table S4). Accordingly, the capping of C-terminus (-CON(CH 3 ) 2 ) disrupts the tripeptide interactions with the NA active sites, in good agreement with previous reports that the charged carboxyl groups (COO − ) are important for anchoring inhibitors in the NA active sites by strong electrostatic interactions [37,38,43,46]. It further indicates that the long peptides may not be suitable to be designed as NA inhibitors.

Computational Methods
The docking and molecular dynamics (MD) simulations were performed with the different modules implemented under InsightII 2005 software package [47] on Linux workstations.

System Preparations
The N9 sub-type neuraminidase (NA) crystal structure (PDB code: 1F8B) was recovered from the RCSB Protein Data Bank [18]. For convenience, it is named NA throughout this work. The calcium ion (Ca 2+ ) and crystal water molecules near the active site were retained in the protein structure. The hydrogen atoms were then added on basis of the expected charge distribution of amino acids at physiological pH values [37,38,46,47]. The particular protonation states of residues with titratable groups were taken with the aid of the Biopolymer module and manual verification [18,37,38,40,47]. Note that the sidechain of residue Asn294 in NA was rotated so that its Oδ1 and Nδ2 atoms of the amide group form H-bonds with the nearby Ala246 O and Arg292 Nε2 atoms, which will improve the agreement with the overall crystal structure [40]. The NA structure was then neutralized with chloride anions [37,38,43,46,48]. The conjugated gradient algorithm was used to optimize the NA structure (Discover 3.0 module), with the consistent-valence force-field (CVFF). The convergence criterion was set to 0.01 kcal mol −1 Å −1 .
All the tripeptides ( Figure S1) were optimized with density functional methods [49,50], and the details can be found in Supplementary Material.

Flexible Docking
The docking simulations were performed by the protocol used in our previous works [37,38,43,46]. The Binding-site module was used to identify the NA active site. Then, the advanced docking program Affinity, combining Monte Carlo (MC) and simulated annealing (SA) methods, was used to determine the optimal orientations of the tripeptides at the NA active sites [51]. A feature for the semi-flexible method is that the ligand and the defined active-site residues were allowed to move freely whereas the rest of proteins were held rigid during the docking process. The potential function was assigned using the CVFF force-field and the non-bonded interactions were described by the Cell-Multipole approach. The solvent effects were considered by solvating the complexes in a large sphere of TIP3P water molecules [52] with the radius of 35.0 Å. Chloride anions were added to neutralize the docked systems [37,38,43,46,48]. The docked complexes were selected on basis of interaction energies and geometrical matching qualities. The selected complexes were further energy-minimized using the conjugated gradient method until converged to 0.01 kcal mol −1 Å −1 .

Molecular Dynamics (MD)
The MD simulations were performed on the energy-minimized docked complexes, using the CVFF force-field in Discover 3.0 module. The canonical ensemble (NVT) was employed. The simulation temperature is 300.0 K (normal temperature), which was controlled by the Langevin thermostat [53]. The integration of the classical equations of motion was achieved using the Verlet algorithm. During the MD simulations, the inhibitors plus a surrounding sphere of 10.0 Å were allowed to move freely whereas the rest were held rigid, consistent with previous works [37,38,46]. The MD trajectories were generated using a 1.0-fs time step for a total of 1000 ps, saved at 1.0-ps intervals. The interaction energies of tripeptides with NA and the respective residues at the NA active site were calculated by the Docking module, over the average structures of 500~1000 ps MD trajectories [51].

Conclusions
In this work, a series of tripeptides were explored as potential neuraminidase (NA) inhibitors. Their interactions with the NA protein were then studied by flexible docking and molecular dynamics (MD) simulations. In addition, the influences were clarified for the chemical states of terminus amino acids. This study will guide synthetic and medicinal chemists to discover potent tripeptides as novel anti-influenza virus drugs.
Based The former has strong electrostatic interactions with residues Asp151, Glu119, Glu227 and Glu277, in good agreement with the data of commercial NA inhibitors; while the latter perfectly fits the hydrophobic cave of the NA active site. Moreover, the addition of proper hydrophobicity facilitates the interactions. Among the seven tripeptides, FRI best matches the NA active site and has the largest interaction energy, obviously superior to the potential drug 4-(N-acetylamino)-5-guanidino-3-(3-pentyloxy)benzoic acid (−160.64 kcal mol −1 ). Accordingly, it is an ideal lead compound for the designs of tripeptide-based NA inhibitors.
The deprotonation or acetylization of the N-terminus NH 3 + group, as well as the amidation of the C-terminus COO − group causes reduction of the binding qualities. Accordingly, the charged forms of the N-and C-termini (i.e., NH 3 + and COO − ) are crucial for the tripeptide inhibitory activities. The longer peptide chains may not form compact binding complexes with the NA protein.
Nonetheless, FRI Ac still shows reasonable spatial orientation at the NA active site and medium inhibiting activity, indicating the possible chemical modifications. We believe that this work will arouse the interest of experimental aspects and result in potent tripeptide-based NA inhibitors in the near future.

Supplemental Material
The details of density functional methods and interaction energies of tripeptides with the NA active-site residues can be found as supplemental material.

Density Functional Calculations
As shown in Figure S1, seven tripeptides FRG, FRV, FHV, YRV, FRT, FRS and FRI were designed, which were then optimized with B3LYP density functional methods within Gaussian03 software [49,[54][55][56][57][58][59]. The standard 6-31G(d,p) basis set was used [50,60]. The N-and C-termini of these tripeptides are protonated and depronated around the physiological pH values; i.e., in the NH 3 + and COOforms, respectively. In addition, three other states were considered for the tripeptide FRI: the N-terminus is neutral (-NH 2 , see FRI dep in Figure S1h), the N-terminus is acetylated (-NHCOCH 3 , see FRI Ac in Figure S1i) and the C-terminus is amidated (-CON(CH 3 ) 2 , see FRI DMA in Figure S1j), respectively. The three structures were also optimized with B3LYP/6-31G (d,p) methods. Frequency calculations at the same level of theory were performed for all the above structures, confirming that they are stable minima on their respective potential energy surfaces (PES) [50].