Improved Estimation of Protein-Ligand Binding Free Energy by Using the Ligand-Entropy and Mobility of Water Molecules

We previously developed the direct interaction approximation (DIA) method to estimate the protein-ligand binding free energy (ΔG). The DIA method estimates the ΔG value based on the direct van der Waals and electrostatic interaction energies between the protein and the ligand. In the current study, the effect of the entropy of the ligand was introduced with protein dynamic properties by molecular dynamics simulations, and the interaction between each residue of the protein and the ligand was also weighted considering the hydration of each residue. The molecular dynamics simulation of the apo target protein gave the hydration effect of each residue, under the assumption that the residues, which strongly bind the water molecules, are important in the protein-ligand binding. These two effects improved the reliability of the DIA method. In fact, the parameters used in the DIA became independent of the target protein. The averaged error of ΔG estimation was 1.3 kcal/mol and the correlation coefficient between the experimental ΔG value and the calculated ΔG value was 0.75.


Introduction
In the pharmaceutical sciences, the protein-ligand binding free energy (G) is one of the most important properties of a drug compound. Despite the development of numerous docking programs and scoring functions to estimate the G [1][2][3][4][5][6][7], the typical accuracy of G estimation remains about 2−3 kcal/mol [6][7][8][9][10]. Usually, docking scores are proportional to G values. This low accuracy of the G or docking score has contributed to a low success rate of computer-aided drug design. The limitations of the docking score are obvious. In statistical physics, the free energy is calculated from the partition function, which is based on a structural ensemble of numerous structures at a particular temperature. On the other hand, the conventional docking score is calculated from a single protein-compound complex structure.
Many reports have used molecular dynamics (MD) simulations to analyze protein-compound docking and to calculate the G. Even if the protein-ligand complex structure is unknown, ab-initio MD docking simulations can be used to reveal the protein-ligand complex structures and the free energy landscapes [11][12][13][14]. In an explicit water model, if the protein-ligand complex structure is known, the binding free energy and the potential of mean force (PMF) along the dissociation path can be obtained using the filling potential (FP) method [15], the meta dynamics method [16,17], the MP-CAFEE method [18], the smooth-reaction path generation method [19] and Jarzynski's method [20].
There have been several reports on the calculation of protein-ligand binding free energy by semi-empirical methods, since the ab-initio free energy calculation is still very time-consuming. The molecular-mechanical Poisson-Boltzman surface-area (MM-PBSA) method [21], the linear interaction energy (LIE) method [22] and the COMBINE method [23−29] have succeeded in reproducing the trends of Gs for single target proteins. These methods have been successful in practical use, but the parameters used in these methods must be optimized for each target protein.
We previously proposed a direct interaction approximation (DIA) method for the G estimation [30]. This method estimates the G value based on the direct interaction between the protein and the ligand just as in the COMBINE method, but the weighted parameters for residues are set to fixed values as in the LIE method. In the current study, we modified the DIA method in order to use target-independent parameters. Since previous authors have introduced a ligand-entropy term in their docking studies [5,6], we also examined the ligand-entropy term. In addition, because the mobility of solvent water molecules has been analyzed in previous reports [31,32], and we also examined the effect of the solvent water mobility herein, but used a different method of analysis for this purpose.

Results and Discussion
The brief explanation of the previously developed direct interaction approximation (DIA) [30] is introduced in Section 2.1. The ligand-entropy term is the first additional term to the original DIA and it is introduced in Section 2.2. The stability of hydration shell of each residue is the second additional term to the original DIA and it is introduced in Section 2.3. The ligand-entropy term and the stability of hydration shell were examined by using the protein-ligand complex structures in Section 2.4. These additional two terms improved the accuracy and the physical consistency of the DIA model. These results showed that the trajectory average of the protein-ligand interaction improved the estimation of the protein-ligand binding free energy. In Section 2.5, we showed the trajectory average of the docking score can improve the binding free energy estimation and the consensus score of the DIA result and the docking score improved the correlation between the experimental and the calculated binding free energies.

Original Direct Interaction Approximation (DIA) Method
In our previous study, we developed the DIA method to estimate G [30]. The fluctuation of the accessible surface area (ASA) or the dihedral angles of the system was introduced as the entropy term of the G value, and the estimation accuracy reached 1.5 kcal/mol for several tens protein-compound complex structures. Here, we will explain the DIA method briefly. In the original DIA method without a direct solvent effect (DIAV), the G value is estimated as follows [30]: where E vdW (i) and E ele (i) are the vdW and electrostatic interactions between the i-th residue of the protein and the ligand, respectively. Svdw(i) and Sele(i) are the fluctuation of the E vdW (i) and E ele (i), respectively. The *S x term represents the entropy of the system. S x is the fluctuation of a property x.
In the current study, S x is the fluctuation of the accessible surface area (x = ASA) of the protein-ligand complex structure or the all dihedral angles (x = DIH) of the protein over the trajectory. There are five parameters: α, α2, β, β2, and .
To represent the van der Waals interaction and the hydrophobic interaction, a Lennard-Jones (LJ) 8-4-type function has been used instead of the LJ12-6 type function: where R e is the equilibrium distance. The R e and the well depth values are set to the same values obtained from AMBER param99 [33] and the general AMBER force field (GAFF) [34]. The data sampling MD simulation is performed with the conventional AMBER force field (LJ 12-6 potential), and the DIA analysis is performed with Equation (2). In the ligand-binding pocket, the effective dielectric constant ( eff ) should be different at each point, since the  eff values of proteins are 2−4 and the  eff of water is 78.5. The E ele (i) should be scaled by the  eff . Therefore, the electrostatic interactions in the DIAV method were modified and we named the modified method as the direct interaction approximation with solvent (DIAS) method [30]: where E mod ele (i) is the E ele (i) value scaled by  eff . The  eff value could be calculated from the ratio between the electrostatic force calculated in the explicit water model and that in a vacuum as follows: (4) where E j ele (i) is the electrostatic interaction between the i-th residue and the j-th atom of the ligand in a vacuum. Here, the effective dielectric constant is given by: where F j real and F j vac are the electrostatic force acting on the j-th atom in the explicit water model and in a vacuum, respectively. The F real and F vac were calculated by the molecular dynamics simulation in the explicit water model and in a vacuum, respectively. The scale factor 1/ε 'j eff could be an unrealistically large value when the denominator of Equation (5) is nearly zero. Thus, we introduce a parameter x and the scale function as follows: This parameter, 1/ε j eff , was used as the scale factor and the previous study showed that the optimal value was 0.6 [30]. Note that the actual effective dielectric constant corresponds to ε j eff /β.

Intra-Molecular Ligand-Entropy Term
In the current study, the entropy change of the ligand was taken into account in the G estimation. The rotatable bonds of the ligand can freely rotate in its unbound form, and these bonds can be fixed into a single conformation in its bound form. Thus, the entropy of the ligand decreases during the protein−ligand binding process. We added the ligand-entropy term (TS ligand ) as follows [5,6]: where N rot is the number of rotatable bonds (single bonds between heavy atoms) in the non-ring part. The number of possible conformers is 3 Nrot without the ring part. Considering the intra-molecule atomic collision, the number of conformers can be less than 3 Nrot , and so an additional parameter w is introduced. First, the ligand entropy of the ring parts was examined. The number of conformers of a ring part was approximated by 2 (Nrot-ring -3) or 3  , where N rot-ring is the number of rotatable bonds (single bonds between heavy atoms) in the ring parts, since the three-membered ring has only one conformer and the torsion angles of ring parts are restricted compared to the free rotation. We examined the importance of the ring-entropy term by multiple linear regression analysis of the data of 34 protein-ligand complexes. Consequently, the ring-entropy term did not improve the estimation accuracy of G. Thus, in the current work, Equation (7) was used as the ligand-entropy term.

Hydration Effect of Each Residue of the Target Protein
In computer-aided drug design, crystal water molecules are often replaced by ligand atoms to design a high-activity compound [35]. This is an empirical procedure known among medicinal chemists. The amino-acid residues around the crystal water molecules should be important to the protein-ligand binding interaction (hot spot). To detect the hot spot, the mobility of water molecules is observed by MD simulations. In the current study, the MD simulation of apo protein in water was performed at room temperature, and the mobility of the water molecules was observed around the i-th residue.
The mobility of water is measured by the ligand exchange rate. In the current study, a residue-based ligand exchange rate for the i-th residue (<H i >) was introduced: Here n m j is the number of water molecules exchanged at the m-th step of the sampled MD trajectory around the j-th atom. The j-th atom belongs to the i-th residue and N i is the total number of atoms (including H atoms) of the i-th residue. Nstep is the total number of the sampled MD steps. The water molecule, whose distance to the j-th residue is less than 6 Å, is taken into account as the ligand of the j-th atom in Equation (8). Since the weight for each energy term of Equations (1) and (3) [exp(−α2xS vdw (i)) and exp(−β2xS ele (i))] corresponds to the probability, the weight for the i-th residue is a dimensionless parameter. We assume that the weight of the amino-acid residues is a function of <H i >, since it could be a measure of the stability of hydration shell around the i-th residue. The higher the value of <H i > is, the more important is the i-th residue is in the protein-ligand interaction. Thus, the weight of residues should be a monotonically increasing function of <H i >. We apply exp(<H i >) as a simple function for approximating the weight, where  is a positive parameter. In the current study, the trajectory was sampled every 2 psec and the minimum, maximum and average values of <H i > were 0.0, 0.16, and 0.042, respectively. These values correspond to 1, 0.26, and 0.96 of the exp(<H i >) values. The average <H i > values in the ligand binding pocket and on the protein surface were 0.0478 and 0.0488, respectively. The ratios of the residue with <H i > less than the average <H i > value were 54% and 48% in the ligand binding pocket and on the protein surface, respectively.
Water molecules in the bottom of the pocket hardly move and the contact number of the bottom of the pocket should be large. Thus there is a correlation between the water mobility and the contact number. The contact number is the number of atoms (protein atoms only, excluding the solvent and ligand atoms), whose distances from the atom in question are less than 6 Å. In this assumption, the <H i > value for the i-th residue is estimated instead of Equation (8) as follows: Here C j m is the contact number of the j-th atom at the m-th step of the sampled MD trajectory and C avg is the average value of C j m . The j-th atom belongs to the i-th residue and N i is the total number of atoms of the i-th residue. In the current study, the correlation coefficient between the <H w i > and <H c i > was 0.32. The average (C avg ), minimum and maximum <H c i > values were 73.51, 0, and 106.2, respectively.

G Estimation by the DIA Method
In the current study, we used the following 6 models to examine the ligand-entropy term and the hydration effect of residues.
Model 4: The DIAV_W model, where the weight of each residue is calculated from the hydration solvent water. Here <H i > = <H i w > as in Equation (8): Model 5: The DIAV_LW model. Here, the ligand-entropy term and the weight for each residue are added to the original DIAV model. The weight for each residue is calculated from the hydration solvent water. Here <H i > = <H i w > as in Equation (8): Model 6: The DIAV_LC model. The weight for each residue is estimated based on the contact number. The model equation is Equation (12) with the relation <H i > = < H c i > as in Equation (9). Table 1 shows the computational average error and the correlation coefficient between the experimental values and the values calculated by these six models. The results were obtained by leave-one-out cross-validation tests. In the leave-one-out cross-validation test, one data is selected as the test data that is to be predicted and the other data are used as the teaching data to generate the prediction model equation. The test data is selected one after another in the given data set until all data are selected as the test data. The property x of Sx (entropy term) was fixed to x = DIH (the fluctuation of the dihedral angles), since the G accuracy obtained by x = DIH was better than that obtained by x = ASA (the fluctuation of the accessible surface area). The  values were optimized to minimize the G estimation error, and these values were set to −6.115 and 0.00613 for the DIAV_LW and DIAV_LC methods, respectively.
Comparing the average error obtained by the DIAV and DIAV_L models, the consideration of the ligand-entropy term improved the accuracy. In addition, comparing the average error obtained by the DIAV and DIAV_W models, the consideration of the weight of the residues improved the accuracy. The combination of both the ligand-entropy term and the weight of the residues improved the accuracy (DIAV_LW). Among the six models examined, the DIAV_LW model showed the best accuracy and the DIAV_C model showed the second best accuracy. The accuracy of the DIAV_L model was almost the same as that of the DIAS model. Since the number of parameters of the DIAV_L model was smaller than that of the DIAS model, the second best model should be the DIAV_L model. Figure 1 shows the correlation between the experimental and calculated G values obtained by the DIAV, DIAV_LW and DIAV_LC methods. These values were obtained by the molecular dynamics simulation starting from the experimentally determined protein-ligand complex structures. The computational detail was described in the data preparation section. It is clear that the DIAV_LW/DIAV_LC methods gave a better correlation than the DIAV method.  The experimental data (G exptl ) and the calculated values (G calc ). Open circles, filled circles and filled triangles represent the results obtained by the DIAV, DIAV_LW and DIAV_LC methods, respectively. Table 2 shows the average, deviation, and minimum and maximum values of the optimized parameters (α, β,  and w) of Models 1−6 of the 34 cross-validation tests. The % of the negative values is also summarized. The other parameters, i.e., α2, β2, and , are omitted. The smaller the deviation of the parameter is, the less dependent on the target protein the model is. In particular, the sign of parameter β is important. Negative values of β are physically unreasonable. Namely, negative β implies that repulsive ligand-protein interactions stabilize the free energy of binding. In the DIAS and the DIAV models, the β value was negative in 2.86% of the cases (one model among total 34 cross-validation test models). In contrast, the average α,  and w values were almost identical among the DIAV_L, DIAV_LW, DIAV_LC and DIAS models. The consideration of the ligand-entropy term (DIAV_L model) slightly improved the problem of the negative β parameter. In addition, the weight of residues (DIAV_W model) also slightly improved the problem of the negative β parameter. In the DIAV_LW and DIAV_LC models, all β parameters were positive in the 34 cross-validation tests.
Among these six models, the deviation of the DIAV_LW model was the smallest. Considering the average error and the deviation of the parameters, the DIAV_LW model was the best of the 6 models and the DIAV_LC model is the second best. In the drug design, the prediction accuracy of unknown compounds for a new target protein is more important than the regression of the activities of known active compounds for a known target. Thus, the parameters of the computational model must not depend on the target proteins. From this point of view, the DIAV_LW or DIAV_LC model is desirable.  The consideration of the ligand entropy and the weight of each residue did not improve the DIAS model. In the DIAS model, the weight of each residue is already considered by using the parameters α2 and β2. Thus, the newly introduced weight with the  parameter would count the weight of the residue twice. Considering the number of parameters (α, β, ,  and w in the DIAV_LV/DIAV_LC models, and α, β, , α2, β2 and  in the DIAS model), the DIAV_LW/DIAV_LC models have a smaller number of parameters than the DIAS model. Since a model with a small number of parameters should, in principle, be better than that with a large number of parameters, the DIAV_LW/DIAV_LC models are better than the DIAS model.
The trajectory dependence of the models was examined. The above results were obtained from the 2-nsec trajectories. Figure 2 shows the time dependence of the DIAV, DIAV_LW and DIAV_LC results. When the 1-nsec trajectories were used, the results were slightly worse than those shown above, but the difference was not statistically significant. In the DIAV model, the β value was negative in 5.71% of the cases (1ett and 1hsl). In the DIAS, DIAV_L and the DIAV_W models, the β value was negative in 2.86% of the cases (1ett in the all models) and no negative β value was observed in the DIAV_LW/DIAV_LC models, just as in the above results. The 1ett structure is thrombin, but the other thrombin structures (1etr, 1ets) did not show the problem. Currently, the reason of the problem is unclear. We also estimated the binding free energies of non-active compounds, since evaluation of the binding free energies of the both active and non-active compounds are essential in practical use. We docked three GPCR (G-protein coupled receptor: membrane protein) ligands to the proteins and calculated the binding free energies of them by the DIAV_L method. These compounds were alprenolol (a β-adrenergic inverse agonist), fenoterol (a β-adrenergic inverse agonist) and cetirizine (H 1 receptor inverse agonist). They are non-active compounds of the proteins, because there are no GPCR in the target proteins. The condition of the MD simulation was the same as those used in Table 1. The parameters of the DIAV_L method were determined for the MLR of the all 34 proteins used in Table 1. The results are summarized in Table 3. In some cases, the pocket sizes of the ligand-binding sites of the proteins were too small for the GPCR ligands to be docked, and so MD simulations were not possible for them. Obviously, such ligands should be non-active compounds. The average values of the estimated binding free energies for the non-active compounds were weaker than those of the original ligands. In all 19 cases, the Gs of alprenolol were weaker than those of the original ligand. The Gs of fenoterol and cetirizine were weaker than those of the original ligand in 69% and 56% cases, respectively. For some proteins, feneterol and cetirizine show stronger affinities estimated than those for the original ligands, but most of their absolute bindings are weak. The whole protein set included four thrombins, three HIV-1 proteases and five trypsins. We have examined the Spearman's rank correlations for these ligands of the same target proteins. The parameters of the DIAV_L, DIAV_LW and DIAV_LC methods were determined based on the 22 proteins excluding these 12 target proteins. The Gs, error of the Gs, and the correlation coefficients are summarized in Table 4. The total number of HIV-1 proteases was 11, since we added eight new data points. These results obtained by the DIAV_L, DIAV_LW and DIAV_LC methods are similar to each other. The trends of the Gs are almost correctly reproduced.

Consensus Score with the Trajectory Average of the Docking Score
Next, we examined the trajectory average of the docking score. The Sievgene protein-compound docking program was used to calculate the protein-ligand docking score [7]. The trajectory average improved the correlation between the experimental binding free energy and the averaged docking score. Namely, the correlation coefficients between the experimental binding free energy and the averaged docking score were 0.751 and 0.745, with and without the trajectory average, respectively. The actual docking scores calculated by three different programs were summarized in the supplementary data. On the contrary, the DIAV_L (R = 0.76) and DIAV_LW (R = 0.78) methods gave R = 0.76 and 0.78, respectively, slightly better than those for the averaged docking scores. However, the differences between their estimated binding free energies and the experimental ones were 1.39 kcal/mol and 1.33 kcal/mol, respectively. They are much smaller than those by the averaged docking scores, 1.63 kcal/mol and 1.89 kcal/mol with and without trajectory average, respectively. We must note that the results should strongly depend on the ensemble generated by the MD simulation [36]. Our method should be applied the correct protein-ligand complex structures or the protein-ligand complex structures must be the equilibrium states, otherwise the calculated G values should drift depending on the simulation time.
The consensus score of the DIA model and the docking score was also examined. The simple sum of the G value obtained by the DIAV_LW and Sievgene docking score gave a correlation coefficient between the consensus score and the experimental G of 0.796. The simple sum of the G value obtained by the DIAV_LC and Sievgene docking score gave a correlation coefficient between the consensus score and the experimental G of 0.782. Thus, the consensus score worked well and it slightly improved the G estimation.
Recently, a machine-learning approach was applied to improve the docking score. Such new method showed the G standard deviation (SD) error of 1.5 kcal/mol and the correlation coefficients between the experimental binding free energy and the docking score reached 0.76 based on a single structure [37]. This result is better than our current result (R = 0.75−0.76, SD = 1.7-1.8 kcal/mol; see Table 1), however the used protein-ligand datasets were different to each other. The accuracy of the other docking score could be improved considering the suitable ensemble of the protein-ligand complex structures.

Method: The Docking Score Calculation
A protein-compound docking simulation was performed by the program Sievgene, which is a protein-ligand flexible docking program for in silico drug screening [7]. This program generates many conformers (100 conformers by default) for each compound and keeps the target protein structure rigid, but with soft interaction forces adapting its slight structural change to some extent. The Sievgene scoring function was designed to consider the structural change of the target protein. In the inner region of the target protein, the protein is approximated as an elastic body, while the atomic pair-wise scoring function is applied in the outer region of the target protein. This docking program was developed with a performance yielding about 50% of the reconstructed complexes at a distance of less than 2 Å RMSD for the 132 complexed receptors with the compounds in PDB. The results predicted by our program were almost the same as those predicted by other docking programs [7]. The docking score (H dock ) to estimate the protein-ligand binding free energy was determined as: (15) where N rot , E ASA , E vdW , E ele , E hyd , and E intra-vdW represent the number of rotatable bonds of the ligand molecule, the hydrophobic energy due to the accessible surface area, the vdW energy, the protein-ligand Coulombic potential, the hydrogen bond energy, and the intra-molecular vdW energy of the ligand for Sievgene [7]. Also, c rot , c AV , c ele , c hyd , and c intra-vdW are the optimized coefficients for each energy term. For each atom type, the sum of E ASA and E vdW gives a grid potential, and both energy terms are always simultaneously calculated. Thus, these two terms share the same coefficient, c AV . Sievgene utilizes the grid potential to calculate each energy term except for the intra-molecular interactions. In this study, a mesh size of 60 × 60 × 60 was adopted.

Data Preparation
To determine the coefficients for the G scores for several current models, we performed a protein-ligand docking simulation based on the known complex structures registered in the Protein Data Bank. The data and the procedure were almost the same as those used in the previous study [30]. Here, 34 complexes accompanied by the experimental binding free-energy values were selected from the database that was used to determine the G scores of the PRO_LEADS [6]. The PDB identifiers, the names (protein names and ligand names), the molecular weights (MW), the number of hydrogen bond acceptors (HA) and the number of hydrogen bond donors (HD) of ligands are summarized in Table 5. There was no peptide-like compound. The MWs were distributed from 114 to 606 Da. To assess the ligand diversity, we calculated the average Tanimoto index and the standard deviation of the all 34 ligands  all 34 ligands by using Maximum Common Substructure (MCS) method [38]. The average Tanimoto index and the standard deviation were 0.29 and 0.19, respectively. These values showed that the used ligand molecules were diverse. In the test dataset, the metalloproteins were removed from the present analysis. Metal atoms (Zn and Fe atoms) formed covalent bonds with O and S atoms of the ligands, and the classical force field that we applied could not represent the covalent bond. Thus, the present method cannot calculate G values for metalloproteins with high precision.
The structural ensembles generated from the PDB structure given by MD in explicit water were prepared as follows. All target proteins were prepared with ligands (forming a protein-ligand complex structure). In the previous study, all metal atoms in the systems were removed, since the target proteins were not metalloproteases. Some non-metalloproteins include metal atoms those bind to the proteins. In the current study, all metal atoms of the PDB files were included in the MD simulations. The force fields and the charges of the protein atoms originated from AMBER parm99 [36]. The atomic charge of each ligand was determined by the restricted electrostatic point charge (RESP) procedure using HF/6-31G*-level quantum chemical calculations [39]. We used Gaussian98 to perform the quantum chemical calculations [40]. The initial coordinates of protein and ligand molecule of each data were fixed to the experimentally determined coordinates. The whole structure of each protein was embedded in a sphere of TIP3P [41] water molecules (CAP water), including ion particles of 0.1% Na + and Cl − , in order to neutralize the total charge of the systems. The center of the sphere was set at the mass center of the protein. The shortest distance between the protein atom and the CAP sphere wall was set to 10 Å.  Before an MD calculation was performed for the entire system, an MD calculation for only the solvent parts (solvent water and counter ions) was performed with the protein, ligand, and metal ion coordinates fixed, so as to bring the solvent parts sufficiently close to an equilibrium state. The SHAKE method was used to constrain covalent bonds between heavy and hydrogen atoms in any molecule in the system [42]. MD simulations of the entire system were performed using 2.0 fs time steps with the temperature set at 310 K; the fast multipole method [43] was used to calculate the Coulombic interaction. The cutoff distance of the van der Waals interaction was 12.0 Å. The MD simulations were performed by using cosgene/myPresto [15]. After equilibration steps of 1,000 ps, the protein coordinates were sampled every 2 ps. Finally, we obtained 1,000 structures for each target protein in the 2,000 ps production run. The software program myPresto version 4 (http://presto.protein.osaka-u.ac.jp/myPresto4/index_e.html) was used for the simulation. The 2-nsec MD simulations cost average 79 h (max 229 h) using 4-core parallel computation on intel Xeon 5600. The trajectory analysis for the DIA method cost average 580 second using single core on intel Xeon 5600.

Conclusions
We have developed new computational models for protein-ligand binding free energy estimation. The DIAV_LC and DIAV_LW models were based on the trajectory average of the protein-ligand interaction with the ligand-entropy term. The mobility of the water molecules in the ligand-binding pocket was used to calculate the weight of the each residue-ligand interaction of the target protein. The interactions of residues around the low-mobility water were weighted comparing to the interactions of other residues. The consideration of the ligand entropy and the weight of the residues reduced the target-protein dependence of the parameters of the DIA models and consequently the accuracy was improved. The average error of G estimation was 1.3 kcal/mol and the correlation coefficient between the experimental values and the calculated values was 0.75, when the correct protein-ligand complex structures were provided. The trajectory average of the docking score improved the correlation between the docking score and the experimental G values. In addition, the simple sum of the G value obtained by the DIAV_LW/DIAV_LC and a docking score showed the correlation coefficient between the consensus score and the experimental G of 0.8. Thus, the consensus score worked well, and it slightly improved the G estimation.