In Silico Discovery of Antimicrobial Peptides as an Alternative to Control SARS-CoV-2

A serious pandemic has been caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The interaction between the viral spike surface glycoprotein (Sgp) and the cellular receptor angiotensin-converting enzyme 2 (ACE2) is essential to understanding SARS-CoV-2 infectivity and pathogenicity. Currently, no drugs are available to treat the infection caused by this coronavirus, and the use of antimicrobial peptides (AMPs) may be a promising alternative therapeutic strategy to control SARS-CoV-2. In this study, we investigated the in silico interaction of AMPs with viral structural proteins and host cell receptors. We screened the antimicrobial peptide database (APD3) and selected 15 peptides based on their physicochemical and antiviral properties. The interactions of the AMPs with Sgp and ACE2 were evaluated by docking analysis. The results revealed that two amphibian AMPs, caerin 1.6 and caerin 1.10, had the highest affinity for Sgp, whereas their affinity for the ACE2 receptor was lower. The effective AMPs interacted particularly with Arg995, located in the S2 subunit of Sgp, which is a key subunit that plays an essential role in viral fusion and entry into the host cell through ACE2. Given these computational findings, new potentially effective AMPs with antiviral properties against SARS-CoV-2 were identified, but they need experimental validation of their therapeutic effectiveness.


Introduction
Coronaviridae is a family of enveloped, positive-sense single-stranded RNA viruses that includes the human coronaviruses (HCoV), identified as causative agents of a wide array of illnesses, including respiratory, enteric, hepatic, and neurological diseases [1][2][3]. Recently, Middle East respiratory syndrome (MERS) and severe acute respiratory syndrome (SARS), caused by the Betacoronaviruses (βCoV) MERS-CoV and SARS-CoV, respectively, emerged as outbreaks of severe human respiratory disease [1][2][3]. In December 2019, a novel HCoV, designated SARS-CoV-2, was first reported as the cause of an atypical pneumonia in Wuhan, China, called COVID-19 [4,5]. Because of its person-to-person transmission and the rapidly increasing number of infected patients worldwide, the World Health Organization (WHO) characterized COVID-19 as a pandemic to promote the implementation of comprehensive strategies for treating and protecting patients [6].
In this study, we investigated the in silico interaction between AMPs and structural glycoproteins of SARS-CoV-2. In particular, we explored virus-associated protein-peptide docking by focusing on the S glycoprotein of SARS-CoV-2 and a subset of AMPs with particular physicochemical properties. These computational findings are intended to identify new, potentially effective molecules with antiviral properties against SARS-CoV-2. Additionally, protein-peptide docking between the host cell receptor ACE2 and the AMPs was performed to evaluate the selectivity of the peptides for viral proteins.

AMPs Clusters, Structural Prediction and Validation
A total of five AMP clusters were initially obtained with the K-Means algorithm (K = 5) according to physicochemical characteristics, including net charge, sequence length, percentage of hydrophobicity, and secondary structure (Table 1). All AMPs included in these clusters showed specific physicochemical characteristics and experimental antiviral activity according to the APD3 database [18]. Given their physicochemical properties, the AMPs in cluster four were selected for peptide-protein docking to assess their binding to the Sgp of SARS-CoV-2 and to the host cell receptor ACE2 (Table 1). This cluster included a total of 36 antiviral peptides with three types of secondary structure (random coil, α-helix, and β-sheet), net charges ranging from −3 to +6, and hydrophobicity between 45% and 100% (Table 1).
Additionally, no experimental hemolytic effect and no anti-SARS-CoV-2 activity have been previously reported for these AMPs [18]. Most of the peptides in this cluster are naturally occurring AMPs from amphibians, followed by bacteria and mammals (Figure 1A,B). The phylogenetic analysis showed that the 36 peptides clustered into two main clades (Figure 1C), one of which included 15 peptides from the amphibian family Hylidae (Table 2). In frogs, AMPs are produced as a first line of defense against pathogenic infections [41][42][43]. Aureins, alyteserins, caerins, citropins, and frenatins are the most abundant AMP families in frogs of the Hylidae family and show high diversity in length and antimicrobial spectrum [41,42]. In particular, the caerin, aurein, uperin, and maculatin families have shown in vitro activity against bacteria, viruses, fungi, and parasites, in addition to anticancer effects [44][45][46][47][48][49]. However, their interaction with SARS-CoV-2 has not been previously evaluated. Of these 15 peptides (Table 2), ten belong to the caerin family, which consists of α-helical cationic peptides with net charges between +1 and +3, hydrophobicity of 53%-56%, and lengths of 24-25 residues [44-47].
The structural models of these 15 AMPs (Table 2), obtained using the I-TASSER platform, were initially validated using RAMPAGE. A total of four AMPs could be validated, including aurein 1.2, caerin 1.3, caerin 1.5, and uperin 7.1, which showed >98% residues outside the favorable region. The remaining 11 AMPs were then optimized using MODELLER to improve their structure (Table 3).
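The descriptors quoted above (length, net charge, percentage of hydrophobicity) can be approximated directly from a sequence. The following is a minimal sketch, not the APD3 implementation: the charge convention (Lys/Arg = +1, Asp/Glu = −1, His neutral, termini ignored), the hydrophobic residue set, and the example sequence are all assumptions made for illustration.

```python
# Assumed conventions (not taken from the paper or from APD3):
#   Lys/Arg count +1, Asp/Glu count -1, His is neutral, termini are ignored.
HYDROPHOBIC = set("AVILMFWC")  # assumed hydrophobic residue set

def net_charge(seq):
    """Integer net charge under the simple convention stated above."""
    seq = seq.upper()
    positive = sum(seq.count(r) for r in "KR")
    negative = sum(seq.count(r) for r in "DE")
    return positive - negative

def hydrophobic_percent(seq):
    """Percentage of residues that belong to the assumed hydrophobic set."""
    seq = seq.upper()
    return 100.0 * sum(r in HYDROPHOBIC for r in seq) / len(seq)

def describe(seq):
    """Collect the three descriptors used as clustering features."""
    return {
        "length": len(seq),
        "net_charge": net_charge(seq),
        "hydrophobic_percent": round(hydrophobic_percent(seq), 1),
    }

toy_peptide = "GLKSVAGEVLKH"  # hypothetical sequence, not a real AMP
print(describe(toy_peptide))
```

Different charge conventions (for example, counting histidine as positive or adding +1 for C-terminal amidation) shift the net charge, so values computed this way may not match the database entries exactly.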

Coordinates for Gridbox of Target Proteins
The grid-box coordinates of the target proteins, Sgp (6VYB) and the host cell receptor ACE2 (1R4L), were obtained with CB-DOCK, as shown in Figure 2 [50]. We investigated the interactions between the AMPs and the target proteins SARS-CoV-2 Sgp and ACE2. Specifically, we studied the inhibitory mechanism of a set of AMPs with particular physicochemical characteristics through peptide-target protein interactions, looking for strong binding to the Sgp of SARS-CoV-2 and low affinity for the host cell protein ACE2. The binding energies (∆G) for the interactions between each peptide and the target proteins Sgp and ACE2 are summarized in Table 4. All peptides evaluated here interacted with Sgp (Table 4).
For Sgp in particular, the best interactions were observed for caerin 1.6 and caerin 1.10, with ∆G values of −7.5 kcal/mol and −7.7 kcal/mol, respectively (Table 4). For caerin 1.6, residues VAL17, VAL18, and LYS24 interacted mainly with residues TYR756, ARG995, and THR998 of the viral Sgp. Meanwhile, residues VAL5, PRO19, GLU23, and LEU25 of caerin 1.10 interacted with residues HIS49, THR51, ASN969, and ARG995 of Sgp. ARG995 was the Sgp residue common to the binding of both caerins. Figure 3 shows that the control peptides SARS-CoV-HR2P (Figure 3A) and EK1 (Figure 3B) fold back on themselves in the binding site, which is not observed for caerin 1.6 (Figure 3C) or caerin 1.10 (Figure 3D). This is because the control peptides form more intramolecular interactions than the caerins.
The main type of interaction between the peptides presented in Table 5 and Sgp was hydrogen bonding, followed by hydrophobic and, finally, electrostatic interactions. Table 5 shows a low binding affinity between the control peptides and Sgp. In contrast, caerin 1.6 and caerin 1.10 show better affinity for Sgp; of these, caerin 1.10 stands out with a binding energy of −7.7 kcal/mol. In particular, VAL17 and VAL18 of caerin 1.6, and VAL5 and GLY7 of caerin 1.10, showed significant binding to ARG995 in chains A, B, and C of Sgp through hydrogen bonds (Figure 4). These peptides blocked in particular the S2 subunit, which together with the S1 subunit plays an essential role in viral binding, fusion, and entry into the host cell following cleavage by furin proteases [32,51,52]. In fact, the S1/S2 cleavage site contains several arginine residues, which indicates high cleavability [53]. These caerins had a low affinity for ACE2, with ∆G values of −5.4 kcal/mol and −5.2 kcal/mol, respectively. Regarding the host cell receptor ACE2, maculatin 1.3 and uperin 7.1 showed the best interactions with this target protein, with ∆G values of −6.4 and −7.1 kcal/mol, respectively (Table 4).
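The selectivity argument rests on the difference between the Sgp and ACE2 docking scores of each caerin. The short sketch below makes that arithmetic explicit using only the four values quoted in the text; the name `selectivity_gap` is an illustrative label, not a quantity defined by the authors.

```python
# Docking scores (kcal/mol) quoted in the text for the two caerins.
SCORES = {
    "caerin 1.6": {"Sgp": -7.5, "ACE2": -5.4},
    "caerin 1.10": {"Sgp": -7.7, "ACE2": -5.2},
}

def selectivity_gap(scores):
    """Score(Sgp) minus score(ACE2); more negative means a stronger preference for Sgp."""
    return round(scores["Sgp"] - scores["ACE2"], 1)

gaps = {name: selectivity_gap(s) for name, s in SCORES.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap} kcal/mol")
```

Under this reading, caerin 1.10 shows the larger gap (−2.5 kcal/mol against −2.1 kcal/mol for caerin 1.6). Because docking scores are empirical estimates, differences of this size are suggestive rather than conclusive.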
Molecules 2020, 25, x
From the interaction of arginine residues of Sgp with residues of caerin 1.6 and caerin 1.10, a probable relationship can be inferred between the AMP-mediated blocking of the interaction between Sgp and the host cell receptor ACE2 and the control of viral infection through interruption of viral fusion, cell entry, and viral replication in human cells [3,32,34,52].
Regarding the interaction of the control peptides and caerins with ACE2, Figure 5 shows the control peptides folding over themselves (Figure 5A,B), which may be attributed to the formation of more intramolecular interactions than in the caerins. Also, because the control peptides carry a net negative charge, the presence of negatively charged amino acids in the binding site could promote the formation of intramolecular interactions. Table 6 compares the binding energies and interactions of the peptides with ACE2. Hydrogen bonds are the most frequent, followed by hydrophobic and electrostatic interactions. The exception is caerin 1.10, which formed more hydrophobic interactions than hydrogen bonds. The main ACE2 residues that interact with the peptides are ARG482 (forming salt bridges with the glutamic acid or glutamine residues of the peptides), ASP494, TRP163, LYS174, and TYR613.
Table 6. Comparison of binding energy and interactions between control peptides and caerins to the ACE2 protein.
Given the COVID-19 pandemic, previous studies have shown the in silico and in vitro effectiveness of existing antiviral drugs against SARS-CoV-2, including chloroquine, remdesivir, ivermectin, and even antiretrovirals used in HIV therapy such as saquinavir [7,8,12,38,54-56]. However, no previous studies have reported the interaction of AMPs with SARS-CoV-2 target proteins. Some studies have previously evaluated the activity of natural and synthetic peptides, including defensins, plectasins, temporins, and cathelicidins, against multiple respiratory viruses, such as influenza A virus H5N1 and H1N1, MERS-CoV, and SARS-CoV [27-29,39,40,57]. Similar to this study, in silico analyses showed potent antiviral effects of AMPs against Betacoronavirus [27,39]. According to our results, AMPs are attractive candidates as an alternative to conventional antiviral drugs to control SARS-CoV-2 infection, because they offer several potential advantages, including specific anti-CoV effects, high selectivity, and no association with severe adverse effects according to in vitro and in vivo assays [58][59][60].

Interaction between EK1 and SARS-CoV-HR2P and Target Viral Protein
Two peptides with experimentally proven activity against SARS-CoV-2, EK1 and the SARS-CoV-HR2P fusion peptide [61][62][63], were used as controls to evaluate and compare the interaction of the AMPs (caerin 1.6 and caerin 1.10) with Sgp. Both EK1 and SARS-CoV-HR2P bind to the HR1 domain in the S2 subunit of Sgp [61][62][63]. Table 6 summarizes the comparison of binding energies between both control peptides and Sgp from SARS-CoV-2. The SARS-CoV-HR2P peptide had a binding energy of −5.5 kcal/mol and the EK1 peptide −5.3 kcal/mol. The net negative charge of the control peptides (Table 7) arises from their glutamic and aspartic acid residues, and these are the residues that interact most with the pocket located in the S2 subunit, for example GLU21 and GLU28 in SARS-CoV-HR2P and GLU15 and GLU35 in EK1. Nevertheless, the abundance of these residues did not yield the best binding energies when compared with the docking of caerin 1.6 and caerin 1.10 (−7.5 kcal/mol and −7.7 kcal/mol, respectively). This could be attributed to the pocket's high content of negatively charged residues, such as ASP428 and ASP994, which allows residues such as lysine and histidine in the caerins to achieve better results.
Hydrophobic interactions were more common in the caerins than in the control peptides. Electrostatic interactions distinguished caerin 1.6 from caerin 1.10: the latter had three more such interactions, which could account for the difference in their binding energies to Sgp. The caerins and the control peptide EK1 also share interactions; for example, ARG995, ASP994, and ARG44 are Sgp residues that form electrostatic interactions with these peptides. ARG995 plays an important role not only in the electrostatic interactions of Sgp with these peptides but also in the formation of hydrophobic interactions and hydrogen bonds. Threonines of Sgp are frequently involved in hydrogen bonds, most often THR51 and THR998, whereas VAL991 is frequently found in hydrophobic interactions with residues of the caerins and EK1. Table 5 shows that almost all residues of caerin 1.10 interact with the S2 subunit of Sgp, except S4 and S8. To understand the role of these residues in the peptide, and because we observed negatively charged residues in the docking pocket, we replaced these serines with positively charged polar residues (arginine, lysine, and histidine). In this way we obtained synthetic peptide A with R4 and R8, synthetic peptide B with H4 and H8, and synthetic peptide C with K4 and K8 (Table 7). Because these substitutions increased the net charge of the original caerin, we also replaced the original serines of caerin 1.10 with glycines, which leaves the net charge unchanged, to give a new synthetic peptide D.
The docking results for these synthetic peptides with the S2 subunit of Sgp are shown in Table 8. The most notable change was for the S4R and S8R substitutions: synthetic peptide A had a binding energy of −5.0 kcal/mol and, compared with the other synthetic peptides, tended to form fewer hydrogen bonds with Sgp but more intramolecular interactions, with 32 interactions of this type against the two presented by caerin 1.10. These interactions occur mainly between ARG4 and GLY1 and between ARG8 and VAL20, GLU23, GLY7, and ALA22. This is seen more clearly in Figure 6, where caerin 1.10 is extended in the pocket (Figure 6A) while synthetic peptide A is compacted by intramolecular interactions (Figure 6B). Similarly, synthetic peptides B, C, and D present more intramolecular interactions than caerin 1.10, with 9, 23, and 16 interactions, respectively. Figure 6 shows how these peptides fold back on themselves, diminishing the interaction with the pocket residues. This indicates that the serine residues S4 and S8 of caerin 1.10 contribute fewer intramolecular interactions, which favors a lower (more favorable) binding energy.
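To give a feel for the size of the gap between the best caerin (−7.7 kcal/mol) and the best control (−5.5 kcal/mol), the scores can be read as if they were binding free energies. This is a strong assumption, since docking scores are empirical estimates rather than measured free energies, and the temperature T = 298.15 K is likewise assumed here rather than taken from the paper. With R = 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹, RT ≈ 0.592 kcal/mol:

```latex
\Delta G = RT \ln K_d
\quad\Longrightarrow\quad
\frac{K_d^{\text{caerin 1.10}}}{K_d^{\text{HR2P}}}
  = \exp\!\left(\frac{\Delta G_{\text{caerin 1.10}} - \Delta G_{\text{HR2P}}}{RT}\right)
  = \exp\!\left(\frac{-7.7 - (-5.5)}{0.592}\right)
  \approx \exp(-3.71)
  \approx 0.024
```

Under these assumptions, a difference of 2.2 kcal/mol would correspond to a dissociation constant roughly 40 times lower for caerin 1.10 than for SARS-CoV-HR2P. The figure is an order-of-magnitude illustration only.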
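The design of synthetic peptides A to D amounts to substituting the same two positions with a different residue each time. A minimal sketch of that bookkeeping follows; the parent sequence is a hypothetical 12-residue stand-in with serines at positions 4 and 8, not the real caerin 1.10 sequence, and `substitute` is an illustrative helper.

```python
def substitute(seq, changes):
    """Return seq with 1-based positions replaced.

    changes maps position -> (expected_original, replacement); the expected
    residue is checked so that an off-by-one position fails loudly.
    """
    residues = list(seq)
    for pos, (old, new) in changes.items():
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)

# Hypothetical stand-in sequence (NOT caerin 1.10) with Ser at positions 4 and 8.
PARENT = "GLLSVLGSVAKH"

# Replacement residue for each synthetic peptide, as described in the text.
REPLACEMENTS = {"A": "R", "B": "H", "C": "K", "D": "G"}

VARIANTS = {
    label: substitute(PARENT, {4: ("S", aa), 8: ("S", aa)})
    for label, aa in REPLACEMENTS.items()
}
for label, seq in VARIANTS.items():
    print(label, seq)
```

Checking the expected original residue before replacing it is a small safeguard: numbering errors are easy to make when positions are quoted from a structure file rather than from the sequence.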

Public Datasets
The computational approach used in this study involved screening the APD3 antimicrobial peptide database to retrieve the amino acid sequences of AMPs [18]. Additionally, the atomic coordinates of the SARS-CoV-2 Sgp in the prefusion conformation and of the host cell receptor ACE2 were retrieved from the RCSB Protein Data Bank, with PDB IDs 6VYB [52] and 1R4L [64], respectively.

Database Screening and Selection of Antimicrobial Peptides
The set of AMPs investigated here was retrieved from the APD3 database. This database contains a total of 3178 AMPs from six kingdoms, including bacteria, archaea, protists, fungi, plants, and animals [18]. According to their in vitro antibacterial, antiparasitic, antiviral, and antifungal activity, a total of 800 AMPs were selected from this database. Predicting the molecular binding between a ligand and its target allows a very efficient virtual examination of the key points of the interaction [43,44]. Deep learning techniques are being used more frequently and have established a new era of large-scale virtual screening with high efficiency and reliability in in silico drug design [65]. In in silico molecular binding prediction studies using deep neural networks, the multitask approach is more reliable and the use of target sets with high similarity is preferred [43,44]. Moreover, highly similar sequences also allow the identification of key residues in the ligand-receptor interaction, which makes it possible to apply mutagenesis and, in this case, improve the affinity of the ligand [66,67].
From this set, we selected a list of AMPs according to their physicochemical properties, such as net charge, percentage of hydrophobicity, length, and secondary structure [27,68,69], using a clustering strategy that combined the K-Means method with the elbow test in R-Project software Version 1.1.463 [70,71]. When secondary structure information was not available for an AMP, it was predicted using the MLRC method of NPS@: network protein sequence analysis (https://npsa-prabi.ibcp.fr/cgi-bin/secpred_mlr.pl) [72,73]. We selected a cluster according to these criteria: experimental antiviral activity but unknown anti-SARS-CoV-2 activity, no toxicity to mammalian cells, and no hemolytic effects. Finally, for a subset of 15 peptides from the cluster, a phylogenetic tree was constructed from the AMP sequences by maximum likelihood using the MEGA X software, and its reliability was evaluated by bootstrap with 1000 replicates.
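The clustering step can be illustrated with a small, self-contained sketch of K-Means together with the within-cluster sum of squares (WCSS) that the elbow test inspects. This is written in Python rather than the R environment named above, it is not the authors' script, and the deterministic farthest-point initialization and the two-feature toy data are assumptions chosen so that the example is reproducible.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def init_centers(points, k):
    """Assumed deterministic start: first point, then repeatedly the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (centers, within-cluster sum of squares)."""
    centers = init_centers(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        updated = [centroid(cl) if cl else centers[i] for i, cl in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    wcss = sum(min(dist2(p, c) for c in centers) for p in points)
    return centers, wcss

# Hypothetical two-feature toy data with two obvious groups.
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
elbow = {k: kmeans(toy, k)[1] for k in range(1, 5)}
print(elbow)
```

On this toy data the WCSS falls sharply from k = 1 to k = 2 and only slightly afterwards, which is the bend the elbow test looks for. With real descriptors on different scales (charge, length, percentage of hydrophobicity), features are usually standardized before clustering; whether that was done in the study is not stated.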

In Silico Structural Modeling of AMPs and Validation
First, structural models of the AMPs were obtained using the I-TASSER platform [74]. Here, the 3D atomic models of the peptides were obtained using multiple threading alignments against the protein structure database RCSB PDB [74]. Models with the highest confidence according to their C-score were selected [74]. From these structural predictions, a total of 100 molecular models were built for each peptide with MODELLER 9.14 using default parameters. The best probable structures were selected based on the discrete optimized protein energy (DOPE) score [75]. Additionally, the stereochemical quality of the best models was verified using Ramachandran plots in the PROSA web server [76] and RAMPAGE [77]. All selected models had more than 90% of their amino acid residues in the favored and additionally allowed regions. All the structures analyzed in this study were visualized with PyMOL (https://pymol.org/2/).
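The model selection logic described here (lowest DOPE score, then a stereochemical check) can be sketched as a simple filter. The sketch below does not call MODELLER or RAMPAGE; it assumes their outputs have already been collected into records, combines the two criteria in a single step as a simplification, and uses entirely hypothetical model names and values.

```python
def pick_model(models, min_allowed_pct=90.0):
    """Keep models above the Ramachandran threshold, then return the lowest DOPE score.

    Lower (more negative) DOPE scores indicate more favorable models.
    """
    passing = [m for m in models if m["allowed_pct"] > min_allowed_pct]
    if not passing:
        raise ValueError("no model passes the Ramachandran threshold")
    return min(passing, key=lambda m: m["dope"])

# Hypothetical records: name, DOPE score, % residues in favored + allowed regions.
candidates = [
    {"name": "model_017", "dope": -1611.9, "allowed_pct": 86.0},
    {"name": "model_042", "dope": -1580.3, "allowed_pct": 95.7},
    {"name": "model_088", "dope": -1498.7, "allowed_pct": 100.0},
]

best = pick_model(candidates)
print(best["name"])
```

In this example the model with the lowest DOPE score overall is rejected because it fails the 90% threshold, so the second-lowest is returned.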

AMPs-Target Proteins Docking
The binding modes of the AMPs with Sgp and the host cell receptor ACE2 were determined. To this end, protein preparation, peptide preparation, grid generation, and peptide-protein docking were performed using AutoDock Vina [78]. For protein preparation, the target proteins were pre-processed by removal of water molecules, addition of Kollman charges, optimization of hydrogen bonds (H-bonds), and addition of Gasteiger charges. The grid coordinates were obtained with the CB-DOCK online tool using the prepared ligand and protein. CB-DOCK is a protein-ligand docking method that identifies the binding sites, calculates their center and size, and customizes the docking box size according to the query ligands [50]. The results were analyzed manually with Discovery Studio Visualizer version 2020 [79]. Two peptides with reported activity against SARS-CoV-2, SARS-CoV-HR2P and EK1, were used as positive controls; both are of synthetic origin and target Sgp [61][62][63]. Their 3D structures were generated with I-TASSER.
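As an illustration of how a grid box feeds into a docking run, the sketch below writes an AutoDock Vina configuration from a box center and size, and reads the scores back from the `REMARK VINA RESULT:` lines of an output PDBQT file. The file names, box coordinates, and sample output are hypothetical placeholders, not values from this study, and the helper names are illustrative.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8):
    """Build the text of a Vina configuration file for one receptor-ligand pair."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center[0]}",
        f"center_y = {center[1]}",
        f"center_z = {center[2]}",
        f"size_x = {size[0]}",
        f"size_y = {size[1]}",
        f"size_z = {size[2]}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"

def vina_scores(pdbqt_text):
    """Extract the affinity (kcal/mol) of each pose from Vina output text."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            scores.append(float(line.split(":", 1)[1].split()[0]))
    return scores

# Hypothetical file names and box; real values would come from the CB-DOCK step.
config_text = vina_config("receptor.pdbqt", "peptide.pdbqt", (1.0, 2.0, 3.0), (20, 20, 20))

# Hypothetical two-pose output fragment in Vina's PDBQT format.
sample_output = """MODEL 1
REMARK VINA RESULT:      -7.7      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -7.1      1.532      2.210
ENDMDL
"""
best_score = min(vina_scores(sample_output))
print(config_text)
print(best_score)
```

The most negative value across poses is the one usually reported as the binding energy of a peptide-target pair.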

Conclusions
In conclusion, the results of this study suggest that two AMPs (caerin 1.6 and caerin 1.10) have a high potential to interact with Sgp but low affinity for the ACE2 protein, which points to the selectivity of these peptides for viral proteins. These AMPs may block the interaction between SARS-CoV-2 Sgp and the host cell receptor ACE2 during viral binding, fusion, and entry into host cells, but they need experimental validation of their therapeutic effectiveness.