Theoretical evaluation of novel Thermolysin inhibitors from Bacillus thermoproteolyticus as possible antibacterial agents

The search for new antibacterial agents that could decrease bacterial resistance is a subject that is continuously developing. The Gram-negative and Gram-positive bacteria have a metalloproteins group belonging to the M4 family. That is the main virulence factor of these bacteria. In this work, we have used a computational protocol based on the comprehensive analysis of the results of docking, molecular dynamics simulation, MM-PBSA, ligand efficiency, and ADMETox properties of ligand designed in silico in the previous manuscript using the Thermolysin from Bacillus thermoproteolyticus, a metalloprotein of the M4 family as a target. The principal results obtained were the designed ligands were adequately oriented in the thermolysin active center. The Lig783, Lig2177, and Lig3444 compounds were those with better dynamic behavior, however, when analyzing the results extracted from the ADME-Tox properties, only Lig783 was the best antibacterial agent candidate.


Introduction
The increase in bacterial diseases around the world has been relevant in recent years [1] . It is proposed that for this year 2020, more than 117 conditions of bacterial origin will be described worldwide. Because of the increase in bacterial infections, the World Health Organization (WHO) assigned the highest priority to the research and development of antibacterial drugs to combat Grampositive and Gram-negative bacteria [2] .
Bacteria of the Vibrio, Legionella, Clostridium, Listeria, Staphylococcus, and Pseudomonas genus are critical factors of various diseases such as cholera [3,4] , ulcerative gastritis [5] , and gastric carcinoma [6] , affecting millions of people around the world. These bacteria have a significant group Figure 1. Best poses alignment of our ligands designed in silico (yellow) compared to 5H9 compound reference (red) into Thermolysin pocket.
As shown in Figure 1, the docking poses obtained were adjusted acceptably with available inhibitory X-ray crystal structures. All inhibitors were adequately targeted at the Thermolysin pocket. In the docking experiments, the best poses were computed, and, as shown in Table 1, 86% of the poses obtained had a ΔGbinding greater than 7 kcal/mol, and 43% had an RMSD below 2 Å. The most stable complex of all was Lig5H9-5DPF, which presented the most negative free binding energy of all (-8.2 kcal/mol) and the second-lowest RMSD (Table 1) of all the complexes analyzed (0.90 Å). It is necessary to clarify that the RMSD values were obtained by comparing the complexes best poses with the crystallographic structure downloaded from the Protein Data Bank (PDB id:5DPF), so the reference ligand 5H9 was re-docked to have a validation of our docking experiments. Lig6199-5DPF -7.0 6.76 -6.9 6.81 -6.6 6.85 A docking experiment detailed analysis reveals that the Lig5H9 presents a POOH group in its structure, which interacts by h-bond with the His231. This result reproduces what was found in the crystallographic structure obtained from the PDB and is further evidence of our result's validation ( Figure 2A). Besides, the C-OH group interacts by h-bond with Arg203, and the COOH group also exhibits hydrogen bond interaction with Asn112, hence the excellent stability that this ligand presents in the active center of Thermolysin ( Figure 2A). Of all the molecules designed in silico, the one that was best oriented in the Thermolysin pocket was Lig3444. The Lig3444-5DPF complex had the second most negative binding energy (counting the reference ligand 5H9) with a value of -8.1 kcal/mol, followed by Lig783 and Lig2177 with binding energies of -8.0 kcal/mol. Another stable complex that was obtained in the docking experiments was Lig3444-5DPF. This ligand has a terminal N-OH group, which presents interactions by hydrogen bonding with Asn112. In addition to the fact that the pyrrolidine ring also presented h-bond interactions with Glu143, which gives it high stability to this complex (Figure 2 F).
The third and fourth place in the most negative binding free energy corresponded to ligands Lig783 and Lig2177, both with ΔGbinding of -8.0 kcal/mol. The Lig783-5DPF complex stability is due to this ligand present a terminal C-OH group in its structure, interacting by a hydrogen bond with Asp150 and Asn165. The molecule hydrocarbon skeleton was oriented into a hydrophobic pocket formed by Phe114, Trp115, and Trp157 (Figure 2 B). Lig2177 has a hydroxyl-amino group in its structure, which had a hydrogen bond interaction with Asn112. The terminal phenyl ring was oriented in a hydrophobic pocket formed by the Phe130. Leu133, and Val139, which gives it high stability to this complex (Figure 2 E).
These are the most stable ligands in docking experiments. To analyze if the interactions last over time, we will look at the molecular dynamics simulation results that will help us delve into this system's behavior at the molecular level over time.
Molecular dynamics simulations give trajectories that contain structural information of thermolysin-ligand complexes at the molecular level [15][16][17] . Principally, our principal goal was focused on analyzing if the interactions found in the docking experiments are maintained over time.
To obtain these results, we have performed an integral analysis of the data obtained from molecular dynamics simulations, which we show below.

Root Means Squared Deviation (RMSD).
As a stability criterion of the studied complexes, we will begin by analyzing the RMSD (Root Means Squared Deviation) parameter [18][19][20] . This variable will give us an idea of how the systems evolve during the 50 ns of simulation time. As shown in Figure 3, starting from the first 3 ns, all the complexes studied remained stable over time with a minimal appreciable variation. All the systems had RMSD values lower than 1.4 Å, even below the Lig5H9-5DPF complex (our reference system), indicating that the compounds designed in silico as possible antibacterials behaved stably over time into thermolysin pocket.
We can observe the RMSD parameter as the stability criterion of the studied systems in Table 2. There were no significant differences in the average value of the RMSD over time. The most stable complex was Lig2177-5DPF (0.90±0.07 Å). This system had the lowest RMSD value of all the complexes studied, even lower than our reference system (Lig5H9-5DPF), which had an average RMSD value of 1.11±0.13 Å. Table 2. Average values and standard deviation of some parameters taken from 50 ns of trajectories.

Hydrogen bond interactions (H-bond).
We quantified the amount and occupancies of h-bond interactions during the 50 ns of simulation time to analyze the studied complex's stability. As shown in Figure 4, the number of h-bond interactions was low in all the complexes studied. None of the systems had an h-bond average higher than 1.5 (Table 2), indicating that these interactions do not significantly influence the complexes stability studied.
In addition to the low amount of h-bond, the stability of these interactions was also low. As shown in Table 2, the standard deviation of h-bond interactions for all the systems was higher than the population means, indicating the significant instability of these interactions. Another interesting fact that must be emphasized is the h-bond interaction occupancies over time. The occupancy term (%) refers to when the h-bond interaction is maintained less than 3 Å of distance during the 50 ns of simulation. Occupancy of more than 50% was taken as a stability criterion [21,22] . The highest occupancy was obtained from the hydrogen bond interaction formed by Lig5H9-OH--O-Glu166, maintained at an average distance of 1.75±2.38 Å the 24.63% during 50 ns of molecular dynamics simulation. The second-highest occupancy was found in the h-bond interaction formed by Lig783-OH--O-Asp150, which had 20.36% of occupancy 2.83±1.60 Å during the trajectory.
From this analysis, we can conclude that given the low amount of hydrogens and the instability of these interactions, we can not explain the complex stability shown in the molecular dynamics simulations using this parameter. For that reason, it is necessary to analyze other parameters such as radius of gyration, whose results we will present below.

Radius of Gyration (Rg).
The stability of the complexes analyzed using the RMSD parameter and h-bond interaction could not be explained. We will now analyze the compaction degree of 5DPF-ligand complexes during 50 ns of molecular dynamics trajectories. The compaction degree of the complexes studied will be addressed using the Radius of gyration (Rg) parameter [23][24][25] . This variable is defined as the root mean square distance of the mass of a collection of atoms from a common center of mass. The Rg graph leads to two analysis levels: the parameter's value over time and its fluctuation [23][24][25] . As shown in Figure 5, the complexes that remained more compact over time were Lig783-5DPF, Lig3444-5DPF, and Lig1392-5DPF. These systems had the lowest values and standard deviation of all the analyzed complexes (Table 2). These complexes remained stable over time without appreciable conformational changes during the 50 ns of simulation time. Another complex that remained compact overtime was the Lig6199-5DPF, although the Rg value was slightly higher than Lig783-5DPF, Lig3444-5DPF, and Lig1392-5DPF.
A very different behavior was obtained from the trajectory of the Lig5H9-5DPF, Lig1022-5DPF, and Lig2177-5DPF complexes. As shown in Figure 5, the complex formed by Lig5H9-5DPF had small conformational fluctuations during the first 22 ns of trajectory. From this simulation time, the system stabilizes until the end of the trajectory.
The Lig1022-5DPF complex behavior was different, which was the least compact of all systems studied with many fluctuations during 50 ns of simulation. As shown in Figure 5, this system had few fluctuations in the first 18 ns of simulation time. However, there was a conformational change ( Figure 6) in which the radius value decreases abruptly. Then its value increases again from the nanosecond 33, demonstrating the low stability of this complex.

Root Mean Squared Fluctuation (RMSF).
The root mean squared fluctuation (RMSF) parameter gives us a measure of the flexibility of the different residues composed of the Thermolysin structure during the trajectory [26] . At high RMSF values, there is greater flexibility of movement, and at low RMSF values indicates that there are movement restrictions during the simulation.
As shown in Figure 7, for all the complexes studied, the lowest RMSF values were found between amino acids 129-155 and 161-165, indicating that this area in contact with the designed ligands in silico remained more rigid than the reference. It is necessary to emphasize that this zone is the Thermolysin active center and the conserved sequence (HExxH) in the M4 family proteins [9,27] , which in the case of this protein, the sequence is His142-Glu166-Tyr157-His146-His231. The most flexible amino acid residues were Asn183 and Glu199, which have a structural rather than a catalytic role.

MM-PBSA.
The Molecular Mechanic Poisson-Boltzmann Surface Area (MM-PBSA) method will help us explore the molecular level at which energy components contribute favorably or unfavorably to the studied complexe's stability a free energy decomposition analysis[28] . Of all the complexes studied the Lig5H9-5DPF had the most negative ∆Gbinding value, indicating that it was the most stable complex of all, which agrees with that obtained in the docking experiments. This ligand in our reference was obtained from the Protein Data Bank.
The complex formed by Lig1022 with Thermolysin (Lig1022-5DPF) had the most negative binding energy of all the ligands designed in silico. This complex had the third most negative binding energy in docking experiments. According to the MMPBSA calculations results, Lig1022-5DPF is stabilized by Van der Waals interactions, which is the third most negative of all the complexes studied.
Lig6199 and Lig1392 were the complexes that had the second and third most negative ∆Gbinding, in MM-PBSA calculations. These systems had fewer negative energies in the docking experiments, which shows that they had losses of interactions during the molecular dynamics simulations. Like the Lig1022-5DPF complex, these two systems are mainly stabilized by Van der Waals interactions. From the detailed analysis by residue (Figure 8), we can observe that the Phe126, Asp138, Glu166, Glu187, and Asp170 residues contributed favorably to ∆Gbinding,. Note that Glu166 residue was found in the conserved sequence of the M4 family proteins active center [9,27] and has an essential catalytic role [9,27] . Molecular dynamics simulations, together with free energy calculations, corroborate this result.
The amino acids Asp82, Arg101, Asn112, Phe114, and His231 contributed favorably to the electrostatic origin and Van der Waals interactions. His231 is another essential amino acid present in the thermolysin active center and is within the consensus sequence of the M4 family proteins active center[9,27] .

Ligand Efficiency Calculation and ADME-Tox Properties.
Until now, we have analyzed the molecular stability bases of the Thermolysin-ligand complexes. In the next section, we will analyze the affinity of the compounds designed in silico and Thermolysin and the ADME-Tox properties. These results will help us to discriminate which of the designed compounds could be the best candidates for antibacterial agents, minimizing the risk as much as possible.
We have used in this analysis four parameters, the dissociation constant (Kd), ligand efficiency index (LE), binding efficiency index (BEI), and lipophilic ligand efficiency (LLE), which will help us compare and select the best candidates for antibacterial agents. The dissociation constant (Kd) parameter gives us a measure of how strong or weak is the ligand-Protein interaction. Low values indicate a strong interaction between the ligand and the protein. The dissociation constant was calculated using Equation 4. As shown in Table 4, the most robust interaction was obtained with Lig5H9, which presented significant differences with the rest of the ligands studied. Of the compounds designed in silico, Lig3444, Lig783, and Lig2177 had the lowest Kd values, indicating that they could be good candidates for antibacterial agents. The ligand efficiency index (LE) is a parameter that correlates the ligand-Protein binding energy and the number of atoms in the compound without taking into account the hydrogens in its structure [29] . In our work, we take LE>0,3 kcal/mol HAC as a reference for an adequate drug candidate [29][30][31] . Consistent with the reference value, compounds Lig783 and Lig3444 are good candidates as thermolysin inhibitors since they had LE values of 0,40 and 0,36, respectively (Table  4). Studies carried out on common drugs have given LE values between 0.5 and 0.52 for drugs for oral administration[29] , so we can suggest that the compounds Lig783 and Lig3444 could be good candidates for orally administered drugs.
The binding efficiency index (BEI) is a parameter that relates the dissociation constant (Kd) of the ligand and its molecular weight [32] . As reference values for this parameter, we have taken 20<BEI<27. These values were obtained from drugs known as Bortezomid (BEI=21) Another essential variable to analyze is lipophilic ligand efficiency (LLE). This parameter measures the ligand-binding capacity with the protein and its lipophilic power [29] . In our case, LLE's reference values are taken between 5-7 units, values calculated in known drugs for oral administration [34] . As shown in Table 4, the ligands designed in silico obtained LLE values lower than the reference values, mainly due to the relatively high values of the cLogP variable, which means that these compounds have high lipophilic power (Table 4). Suppose we stop to analyze the chemical structure of our molecules ( Figure 10). In that case, we can observe that they have apolar hydrocarbon skeletons, which could influence our lipophilic ligand capacity.
For a promising drug candidate molecule, the pharmacokinetics prediction (absorption, distribution, metabolism, and elimination (ADME)) is necessary. Also, some toxicological parameters (Tox) [35][36][37][38] will be critical to known. In this work, we calculated the molecular weight (MW), the octanol/water partition coefficient (cLogP), hydrogen bond acceptor count (HBA), hydrogen bond donor count (HDA), and rotatable bond count (RB) for each one of the molecules studied using SwissADME web server [39] . As a toxicological criterion, these results were compared with the Lipinski[40] , Veber[41] , and Pfizer[42] rules. Thus, if any compounds designed in silico comply with only one Lipinski rule, this compound would not be the right candidate. On the contrary, according to the Veber and Pfizer rules, a right ligand candidate must comply with both rules in both cases. Table 4, our reference ligand 5H9 meets all the parameters of the Lipinski Rule. However, it does not meet any of the parameters of the Veber rule. It only meets one of the parameters of the Pfizer Rule, indicating that this compound could have high oral availability. However, its high lipophilic power could have some toxicological potential. The compounds Lig783, Lig1392, and Lig3444 comply with all the Lipinski, Veber, and Pfizer rulers, so we consider these ligands the best antibacterial candidate agents. The Lig1022, Lig2177, and Lig6199 molecules result in not good candidates because of the high lipophilic power. This parameter provokes some toxicological consequences.

Computational Protocol.
In this work, we planned a computational biochemistry protocol to evaluate Thermolysin inhibitors as possible antibacterial agents, as shown in Figure 9. The molecules that we will study in this work were designed by our group using the QSARIN method reported in previous works [11] . The molecular structures of these possible Thermolysin inhibitors were sketched using Avogadro software version 1.2.0 [43] . The optimized geometry was obtained using DFT calculation at the b3lyp/ma-def2-SVP basis set implemented in Orca 4.2.1 software package [44,45] . The full optimized geometry was checked by counting their imaginary frequencies. The molecular structure of the Thermolysin inhibitors studied is represented in Figure  10. The optimized geometries of these molecules were obtained for docking experiments to examine the interactions of these compounds in the Thermolysin pocket. Figure 10. Molecular structure of Thermolysin inhibitors as possible antibacterial agents.

Docking Procedure.
The full optimized geometry obtained from the quantum calculation of the ligands were used for docking experiments. The molecules were prepared at pH=7.4 using Autodock Tools [46] . The X-ray crystallography structure of Thermolysin was obtained from Protein Data Bank (PDB) [47] from Bacillus thermoproteolyticus, whose PDBid is 5DPF, resolved at 1. 47 Å[48] . This protein was prepared by the addition of all hydrogen atoms at pH=7.4. The water molecules around the protein were eliminated. The size of the grid box was 16.875×14.14×16.875 Å around the mass centers of the 5H9 our reference ligands discharged from x-ray crystallography structure from PDB whose coordinates were x = 11,844 y=-40,829 and z = 6,420. The Thermolysin is an M4 family of proteinases with Zn 2+ into the pocket, that´s why we kept this element in all docking experiments.
In all docking procedures, we used a grid spacing of 0.375 Å, the number of modes was 10, and the energy rank was set up to 1 kcal/mol. To analyze if our docking results were correct, the 5H9 reference ligand was re-docked using the same docking protocol of the other compounds. All docking experiments were realized with Autodock Vina software version 1.1.2 [49,50] . The best docking poses were selected using binding energy (kcal/mol) and the positional root-mean-square deviation (RMSD) [51] .
The best energetically good poses and lowest root-mean-square deviation of each complex were selected for molecular dynamics simulations and ligand efficiency calculations. To verify the docking results reproducibility, we calculate the root-mean-square deviation (RMSD), between the possible antibacterial compounds, taking 5H9 as ligand reference from the crystallographic structure of Protein Data Bank. These calculations were realized using the LigRMSD server 1.0 program [52] . All docking figures were made using Pymol software version 1.8[53] .

Molecular Dynamics Simulation.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 December 2020 We obtained the best conformational poses for each ligand-Thermolysin complex from docking experiments as input for molecular dynamics simulations. Each complex was placed into a water box of 15×15×15 Å, using the TIP3P water model [54,55] . Topologies and parameters of the ligands were obtained by the SwissParam web Server [56] . All molecular dynamics simulations were described using CHARMM36 and CGenFF force field for the Thermolysin and the possible antibacterial compounds [57][58][59][60][61] .
The ligand-Thermolysin complexes were submitted to 50000 steps for the energy minimization using the conjugated gradient methodology for reducing any close contact. The working temperature was 298.15 K employing the weak coupling algorithm [62] . The Van der Waals cutoff was fixed to 12 Å, and we applied a backbone constraint to all complexes using the NPT ensemble. The Particle Mesh Ewald (PME) approach[63] was used for considering the long ranges of electrostatic forces. We employed the velocity Verlet algorithm with a time step of 1.0 fs to solve the motion equations. All the complexes were submitted to 2.0 ns of equilibration and 50 ns of molecular dynamics simulation using the NAMD 2.13 software package [64] . The trajectories analysis and scripts were realized using the VMD software version 1.9.3[65] .

Free energy Calculation.
This work's computational protocol combines molecular dynamics simulation and MM-PBSA to study the ligands and Thermolysin interactions. The binding free energy was calculated using g_mmpbsa package version 5.1.2[66] , a Gromacs tool to compute the ligand-free binding energy [66] . This is an implicit solvent method that computes the different energy between the ligands-Thermolysin complexes.
From 50 ns of molecular dynamics simulation, we extract the last 300 frames to compute each binding free energy of complexes. So the MM-PBSA method calculates the free energy decomposition into contributions. The free energy for the Thermollysin-ligand complexes were calculated according to the following equation: ΔG binding =G complex − Thermolysin +G ligand (1) In Equation 1 Gcomplex corresponds to the Thermolysin-ligand complex energy, GThermolysin, and Gligand is the energy of protein and ligand respectively. The following equation was used to calculate the free energy of the protein, ligand, and complex separately.

=E bond +E vdw +E elect +G polar +G Apolar
(2) In the Equation, 2 Gx can be Gcomplex or GThermolysin, or Gligand. The Ebond represents the interactions that include bond, angle, and dihedral angle, Eelect is the electrostatic energy contribution, Evdw is a Van der Waals energy contribution. The Gpolar represents the polar free energy contribution, which was calculated using the continuum solvent Poisson-Boltzmann (PB) model included in the APBS (Adaptive Poisson-Boltzmann Solver) software version 1. 4.1[67] . The non-polar free energy contribution was calculated according to the following equation: where γ represent the coefficient related to the surface tension of the solvent, which in our case has a value of 0.0072 kcal/mol/Å 2 , SASA is the solvent-accessible surface area, with an amount of 1.4 Å and β is a fitting parameter.
We also decompose the overall binding energy per residue because we need to know every amino acid contribution. These contributions were calculated using pythons script MmPbSaStat.py, and the

Ligand Efficiency Calculation.
Ligand efficiency metrics consist of a series of parameters that seek to measure the relationship between the binding energy and the molecule size [29] . These parameters have currently taken on important relevance in predicting how efficient a given compound will be as a possible drug [29,31,68,69] . In our case, the ligand efficiency calculations were performed through several parameters, such as the dissociation constant (Kd), the ligand efficiency index (LE), the binding efficiency index (BEI), and the lipophilic ligand efficiency (LLE).
The Kd corresponds to the dissociation constant between a ligand and the protein. Binding efficiency index BEI is a measure that involves a binding property of the ligand with the protein against molecular weight [71,72] . In this work, the binding efficiency index was calculated through the following equation [32] : where pKd is -logKd and Kd is the dissociation constant calculated from Equation 4. MW represents the molecular weight in kDa.
Lipophilic ligand efficiency (LLE) has been defined as the difference between the ligand activity and lipophilicity (clogP), as shown in Equation 7[73] .
LLE=pK − clog( ) The absorption, distribution, metabolism, and excretion (ADME) properties of all ligands studied in this work were calculated from the molecules under study's optimized geometry, using the SwissADME web server [39] . We also calculated other physicochemical variables like molecular weight (MW), octanol/water partition coefficient (cLogP), hydrogen bond acceptor (HBA), hydrogen bond donor (HBD), topological polar surface area (TPSA), and rotatable bond count (RB) respectively. His fundamental goal is to obtain ties of a compound to become a drug. From these physicochemical parameters, we can predict the drug's toxicological properties (Tox) considering the Lipinski [40,74] , Veber [41] , and Pfizer toxicity empirical rules[42] (Table 5).

Conclusions
Bacterial diseases have increased around the world. This type of disease is caused by a large group of Gram-Positive, and Gram-Negative bacteria, which have M4 family metalloproteins containing Zn 2+ in their active center. This family of proteins is the cause of the bacteria's virulence. In previous works, we designed a series of Thermolysin inhibitor ligands (M4 family metalloproteinase). How can we determine if these designed compounds would be good antibacterial agents? In this work, we have successfully designed a computational protocol with a comprehensive analysis of the results obtained from docking experiments, molecular dynamics simulations, MM-PBSA, ligand efficiency calculation, and ADME-Tox prediction. From the results, we can conclude that the compounds Lig783, Lig2177, and Lig3444 were the ones that had the best behavior in the docking experiments. These same compounds had a very similar behavior during the 50 ns of molecular dynamics and MM-PBSA. These results coincided with the ligand efficiency analysis, demonstrating that these molecules can be good antibacterial agents. However, if we analyze the results extracted from the ADME-Tox properties, we can conclude that the best candidate is Lig783.