Binding Affinity via Docking: Fact and Fiction

Pantsar, Tatu; Poso, Antti

doi:10.3390/molecules23081899


by Tatu Pantsar ¹ and Antti Poso ¹,²,*

¹ School of Pharmacy, University of Eastern Finland, P.O. BOX 1627, 70211 Kuopio, Finland
² Department of Internal Medicine VIII, University Hospital Tübingen, Otfried-Müller-Strasse 14, 72076 Tübingen, Germany
* Author to whom correspondence should be addressed.

Molecules 2018, 23(8), 1899; https://doi.org/10.3390/molecules23081899

Submission received: 4 July 2018 / Revised: 22 July 2018 / Accepted: 26 July 2018 / Published: 30 July 2018

(This article belongs to the Special Issue Molecular Modeling in Drug Design)


Abstract

In 1982, Kuntz et al. published an article with the title “A Geometric Approach to Macromolecule-Ligand Interactions”, where they described a method “to explore geometrically feasible alignment of ligands and receptors of known structure”. Since then, small molecule docking has been employed as a fast way to estimate the binding pose of a given compound within a specific target protein and also to predict binding affinity. Remarkably, the first docking method suggested by Kuntz and colleagues aimed to predict binding poses but very little was specified about binding affinity. This raises the question as to whether docking is the right tool to estimate binding affinity. The short answer is no, and this has been concluded in several comprehensive analyses. However, in this opinion paper we discuss several critical aspects that need to be reconsidered before a reliable binding affinity prediction through docking is realistic. These are not the only issues that need to be considered, but they are perhaps the most critical ones. We also consider that in spite of the huge efforts to enhance scoring functions, the accuracy of binding affinity predictions is perhaps only as good as it was 10–20 years ago. There are several underlying reasons for this poor performance and these are analyzed. In particular, we focus on the role of the solvent (water), the poor description of H-bonding and the lack of the systems’ true dynamics. We hope to provide readers with potential insights and tools to overcome the challenging issues related to binding affinity prediction via docking.

Keywords:

docking; solvent effect; binding affinity; scoring function; molecular dynamics

1. Introduction

Docking, originally introduced by Kuntz et al. [1], is a computational method that virtually tries to predict a complex of (usually) two binding partners. Typically, these binding partners are biological macromolecules (e.g., protein, DNA/RNA, peptide) or small molecules (e.g., endogenous ligands, drugs). Although nowadays specific docking methods are available for distinct binding partners, such as HADDOCK for protein-protein docking [2], here we focus on the more traditional small-molecule molecular docking methods, such as GOLD [3,4,5], Surflex-Dock [6], AutoDock [7] and Glide [8,9,10], that are regularly utilized in structure-based drug design to predict ligand interactions with the target protein. In structure-based small-molecule docking a small ligand molecule is aligned inside the binding cavity of the target protein and the resulting docking pose is evaluated by a specific scoring function. The scoring function generates a score for each pose, and the resulting values are used to rank the different poses and ligands. In a methodological sense, there are two independent stages in the docking process: the pose generation and the scoring. The first refers to the methods which are used to create different ligand and protein conformations and aligning different ligand conformations within the binding site of the protein. The latter, the scoring, is required in the docking process for a quantitative estimation of the pose quality. As docking is typically utilized to screen extensive small-molecule (up to millions) chemical libraries, the pose generation and the pose quality evaluation must be carried out by fast methods i.e., the computational cost should be low. To fulfill this, several simplifications are needed in the overall docking process.

The first simplification in docking is related to water, as this solvent is neglected by most docking programs. Only very recently have several docking methods been introduced where individual water molecules are included in the pose generation and evaluation phase [10,11]. The challenge in water description is related to the fact that these abundant molecules are fast moving and rotating and they participate in hydrogen bonding (H-bonding) as both donor and acceptor. This means that a change in the orientation of a single water molecule in the binding site not only has an effect on the neighboring waters, but also extends to the surrounding multiple hydration layers, thereby affecting the whole water network. In addition, the differentiation of strong and weak H-bonds in these interactions should be considered. Thus, the abundant possibilities in the water arrangement prohibit a feasible, explicit evaluation of all the potential water interactions. The current state of how water can be treated explicitly in docking is reviewed in [12].

The lack of motion is another simplification in docking. However, the dynamic nature of the whole system in terms of entropy and enthalpy should be acknowledged. Whereas the ligand flexibility is typically included in the docking process, the same does not hold true for protein. Usually, protein is considered as rigid, with the exception of the rotating hydroxyl groups of serine, threonine and tyrosine residues. Obviously, these simplifications affect the quality of the generated poses, which may be artificial [13]. As a result, different approaches that consider the protein flexibility have been developed, such as ensemble docking [14], where docking is conducted in an ensemble of different protein conformations.

The third simplification in docking is related to the analysis of the interactions between the protein and the ligand. The different types of protein–ligand interactions (for non-covalent binders) include ionic interactions, hydrogen bonds and van der Waals interactions (including dispersion, polar and induced interactions). The most accurate way to estimate these interactions is with a quantum mechanics (QM) based approach [15]. However, in most cases QM methods are computationally too expensive for docking purposes. To speed up the interaction analysis, calculations are typically conducted with simple potential energy functions, usually related to force-fields or statistical potentials. While the current force fields and scoring functions are well parametrized, polarization effects and a detailed proton affinity estimation are still lacking.

Docking programs produce one (or several) different poses for every ligand, and further rank different compounds based on their scoring functions. A comparison of different docking programs is difficult as the data sets to estimate the docking performance are often of low quality and there is no consensus on which metrics to apply in these comparisons. For instance, binding affinities predicted by the docking might be incorrect, despite the correctly predicted binding pose. Another example is the case in which a particular docking method performs reasonably with one protein but with another protein, docking poses are constantly mispredicted. These problems are well explained in the work of Cheng et al. [16], in which the frequently used CASF-2007 data set was employed to evaluate docking performance. In the same work, the evaluation problems were addressed by using three different metrics, namely “docking power”, “ranking power” and “scoring power”. Recently, Li et al. [17] described a fourth metric, called “screening power”. “Docking power” is the power to identify the native docking pose among the decoy poses, while “scoring power” is the ability to predict the binding affinity. In virtual screening campaigns, employment of “ranking power” is usually more appropriate, as it is the ability to correctly rank compounds according to their binding affinity. Also, the “screening power” is highly relevant for virtual screening, as it measures how well the method is able to identify the true binders from a random pool of ligands, including non-binders.
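The four metrics above can be made concrete with a minimal Python sketch. The specific statistics chosen here are assumptions for illustration and are not prescribed by this paper: Pearson correlation for scoring power, Spearman rank correlation for ranking power, the fraction of top poses within an RMSD cutoff of 2.0 Å for docking power, and an enrichment factor for screening power. All function names are hypothetical, and the code assumes that a higher score means stronger predicted binding.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (used here as a proxy for scoring power)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (used here as a proxy for ranking power)."""
    return pearson(ranks(x), ranks(y))


def docking_success_rate(top_pose_rmsds, cutoff=2.0):
    """Fraction of complexes whose top-ranked pose lies within `cutoff` of the native pose."""
    return sum(r <= cutoff for r in top_pose_rmsds) / len(top_pose_rmsds)


def enrichment_factor(scores, is_active, fraction=0.1):
    """Hit rate in the top-scored fraction divided by the hit rate in the whole library."""
    n = len(scores)
    n_top = max(1, round(n * fraction))
    ranked = sorted(zip(scores, is_active), key=lambda t: t[0], reverse=True)
    hits_top = sum(active for _, active in ranked[:n_top])
    return (hits_top / n_top) / (sum(is_active) / n)
```

A method can do well on one of these numbers and poorly on another, which is one reason why comparisons between docking programs depend so strongly on the metric chosen.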

Docking is utilized as a tool in both virtual screening and compound optimization. There are several very comprehensive reports indicating unreliable binding affinity predictions by docking; a good summary of those studies has been published by Pagadala et al. [18]. Additionally, in order to achieve reliable binding affinity predictions, the old docking methods are being updated and new methods have been published. New parametrizations and methods are based on ever increasing datasets and increased computational capacities, such as the implementation of QM in docking [19] and moving towards dynamic docking [20]. In our laboratory, we have carried out different virtual screening and docking experiments since the early 2000s. While we used docking as a stand-alone approach in our early studies, nowadays, to increase the quality of our results, we are increasingly employing diverse methods in parallel to docking. In the following, we briefly discuss those theoretical aspects which have directed us to use docking in combination with other methods. First, we focus on the pose generation in docking and then, we provide a short overview of scoring function caveats. Finally, we discuss the role of water and (the lack of) dynamics in the docking process.

2. Pose Generation and Scoring Functions in Docking

2.1. Pose Generation

Ligand and protein conformational freedom is a huge challenge in docking. Widely used docking programs handle ligands as fully flexible, thus typically generating a very large number of conformations during the docking process. In addition to conformation generation methods, the quality of the docking will depend on the force field. It is evident that the conformational aspect of the process is well optimized as all the widely used docking methods are able to identify the correct bioactive conformation of a ligand (i.e., to recreate the X-ray pose) in several instances. For example, this was shown by Li et al. [17], Warren et al. [21], and the same conclusions were reached in the CASP2 competition [22]. However, these results do not necessarily imply that the pose generation produces the correct ligand conformation and binding pose in all instances. Based on these observations, it is apparent that the focus should be placed on the scoring function to increase the quality of docking.

2.2. Scoring Functions

The strength of a protein-ligand complex is related to the intermolecular interactions between these binding partners, solvent effects and dynamics. The most conservative method to estimate all of these simultaneously is to apply all-atom molecular dynamics (MD) simulations. However, in order to avoid the significant computational costs related to these simulations, molecular docking utilizes scoring functions to provide a fast and crude estimation of the binding affinity. There are three main types of scoring functions: force-field based functions, knowledge-based statistical functions, and empirical scoring functions [23]. Force-field based methods utilize molecular mechanics functions for evaluating the direct interactions between a ligand and the protein, and solvent effects are typically evaluated by a generalized Born/surface area (GB/SA) type of approach [24], which is often based on the work of Wesson and Eisenberg [25]. Knowledge-based methods rely on statistical information derived from the existing ligand-receptor complex structures [26] in the form of distance-dependent atom-pair potentials. The third approach, empirical scoring functions [8], is based on the idea that all the relevant factors affecting the binding are expressed in the form of (preferably simple) equations, like those describing H-bonding, rotational/translational degrees of freedom and polar/lipophilic effects. In addition, these equations are balanced by using a regression-type approach; in the literature this approach is sometimes referred to as regression-based scoring.

2.2.1. Enthalpy and Entropy

Scoring functions attempt to estimate the binding affinity, which is directly related to the Gibbs energy of binding. There are several ways to partition the binding energy, and one of these is given in Equation (1). This partition by Ajay and Murcko [27] describes the binding energy as individual components: the solvation/desolvation energy (∆G_solvent); the change in energy of the receptor and ligand due to complex formation (∆G_conf); the change in energy due to specific interactions between the ligand and the receptor (∆G_int); and the contribution due to changes in movement (rotational, translational, vibrational) (∆G_motion).

$$\Delta G_{bind} = \Delta G_{solvent} + \Delta G_{conf} + \Delta G_{int} + \Delta G_{motion} \tag{1}$$

What can we conclude based on Equation (1)? First, one must note that it is inadequate to only study the protein-ligand interactions. Additionally, it is important to understand how both interact with water before the formation of the complex and how water mediates this process. Also, one must recognize the fact that binding energy includes the conformational aspects of ligand and protein, and also changes in motion (this is mainly an entropic effect). As entropy is directly related to the motion and the temperature, a single protein-ligand complex pose may not provide enough information to reliably predict binding affinity.

Finally, how reliable is the estimation of the strength of the direct interactions (∆G_int) between different binding partners (protein-ligand-solvent)? This depends greatly on the description of the ionic interactions and van der Waals interactions (including H-bonding). The entropic component is thought to be related mainly to the conformational and rotational/translational aspects, but we believe this is an optimistic view. More emphasis should be placed on how much the protein flexibility contributes to the stability of the protein-ligand complex and how the water affects the binding energy.
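To put the magnitudes discussed in this section into perspective, the following minimal sketch converts a dissociation constant into a standard Gibbs energy of binding using the textbook relation ∆G° = RT ln(K_d / 1 M). The temperature of 298.15 K, the 1 M standard state and the function name are assumptions made for illustration; they are not taken from this paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard Gibbs energy of binding (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)


# A 1 nM binder versus a 10 nM binder at 298.15 K
dg_1nm = binding_free_energy(1e-9)
dg_10nm = binding_free_energy(1e-8)
print(f"1 nM:  {dg_1nm:.2f} kcal/mol")
print(f"10 nM: {dg_10nm:.2f} kcal/mol")
print(f"difference per 10-fold change in Kd: {dg_10nm - dg_1nm:.2f} kcal/mol")
```

Under these assumptions a ten-fold change in affinity corresponds to only about 1.4 kcal/mol, so a modest error in any single term of Equation (1) is enough to shift the predicted affinity by an order of magnitude.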

2.2.2. Direct Interactions

Direct polar interactions between ligand, protein and water are enthalpic in nature. In scoring functions, these interactions are considered by specific terms such as H-bonding, Lennard-Jones type of functions and ionic interactions. Dispersion-type interactions (erroneously called van der Waals interaction) are usually reasonably described by classical Lennard-Jones potential. As indirect proof of how precise these equations are, even the latest parametrization of the OPLS (Optimized Potentials for Liquid Simulations) force field family, OPLS3 [28], utilizes the formerly developed Lennard-Jones parameters. Indeed, many of the scoring functions use this approach to model dispersion [5,24,29], although GOLD uses softer 8–4 potential while Dock and Glide prefer 12–6 potential [1,3,4,5,8,9,10]. Other approaches do exist, for example Surflex uses surface-based description (derived from van der Waals surface) [6] and FlexX has a scoring term based on separate attractive terms for H-bonds, ionic, aromatic and lipophilic interactions and atom-center distance-based repulsive function [30].
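The difference between a 12–6 and a softer 8–4 potential can be illustrated with a generic n–m (Mie-type) function written so that both variants share the same well depth and the same minimum-energy distance. This is a sketch under stated assumptions: the functional form, the parameter values and the function name are chosen for illustration only and do not reproduce the actual parametrization of GOLD, Dock or Glide.

```python
def mie_potential(r, r_min, epsilon, n, m):
    """Generic n-m potential with minimum -epsilon located at r = r_min.

    For n=12, m=6 this reduces to the classical Lennard-Jones form
    epsilon * ((r_min / r)**12 - 2 * (r_min / r)**6).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    if n <= m:
        raise ValueError("repulsive exponent n must exceed attractive exponent m")
    x = r_min / r
    return epsilon * ((m / (n - m)) * x**n - (n / (n - m)) * x**m)


# Hypothetical parameters: r_min = 4.0 Å, epsilon = 0.2 kcal/mol
for r in (3.2, 3.6, 4.0, 5.0):
    e_12_6 = mie_potential(r, 4.0, 0.2, 12, 6)
    e_8_4 = mie_potential(r, 4.0, 0.2, 8, 4)
    print(f"r = {r:.1f} Å   12-6: {e_12_6:7.3f}   8-4: {e_8_4:7.3f} kcal/mol")
```

At distances shorter than the minimum the 8–4 form rises much more slowly than the 12–6 form, so small steric clashes in a rigid binding site are penalized less severely.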

One could argue that a proper description of dispersion is needed for accurate ligand-binding prediction, however, a precise understanding of the H-bonding is even more important. Recently, Raschka et al. analyzed the type of interactions found in 136 non-homologous protein-ligand complexes [31], concluding that strong H-bonds are required for most high-affinity ligands. In addition, they disclosed that the protein prefers to act as a H-bonding donor. As an explanation of why the protein prefers to act as a H-bond donor for high-affinity ligands, the authors speculated that geometrically more constrained H-bonding donors were enriched during the evolution. Consequently, a proper H-bonding description in scoring functions is required.

In all scoring functions, both distance and angular parameters are included in the H-bond potentials, in a similar fashion as in several force fields. In addition, the type of H-bonding (e.g., charged, neutral) is considered. One of the most detailed forms of H-bonding potential is implemented within Glide XP [10], where three different types of H-bond are used: neutral-neutral, neutral-charged and charged-charged. The functional form of Glide XP includes several H-bond class-specific modifications and environment-based restrictions. As a result, enhanced recognition of the “false positive” H-bonds is achieved.

2.2.3. Hydrogen Bond Strength and Classification

The hydrogen bond (H-bond) is mainly an electrostatic interaction, which is typically modelled via Coulomb-type equations, and for this, the dielectric constant is a critical factor. Unfortunately, a reliable and fast method to calculate the dielectric constant does not exist. One approach is to use QM/MD methods but this approach is currently too slow for docking purposes [15]. Furthermore, the challenge in H-bond modeling is the high variability of different H-bond types and strengths. Even the environment has a huge effect, as demonstrated with water–water H-bonds [32]. In a neutral (pH 7) environment a single water–water H-bond is weak, but in basic or acidic medium it becomes a charge-assisted H-bond that is six-fold stronger and 15% shorter. In fact, H-bond strengths can usually be estimated based on the proton affinities or pK_a-values of both the donor and acceptor site. Based on this approach, a 6-class classification for H-bonding has been developed [33]. Normal, weak H-bonds (those without charge) are the most common type of H-bonds found in biological systems. These are unassisted by charge or resonance and thus are weak, asymmetric and driven by electrostatic force. On the other hand, all the H-bonds assisted by charge (either negative or positive or both) are strong and short, if the pK_a-values of donor and acceptor match. When pK_a-values mismatch (>2 pK_a units), these H-bonds are classified as regular H-bonds. In a biological system this means that the ionization states of the protein and ligand are needed for proper H-bonding evaluation, and also the pK_a-values are required. Most of the current H-bond potentials produce reliable predictions for uncharged H-bonds. The same does not hold true for the H-bonds that include ionizable donor and acceptor groups. There are several computational methods available for both protein and ligand pK_a-value calculations [34,35,36]. Nevertheless, proton affinities are hardly ever considered in scoring functions. Therefore, the Glide approach [10] of differently scoring H-bonds with charge is probably correct.
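The pK_a-matching idea described above can be written down as a deliberately simplified decision rule. The sketch below is an assumption-laden illustration and not the 6-class scheme of [33]: it only distinguishes the three situations named in the text (uncharged, charge-assisted with matched pK_a-values, charge-assisted with mismatched pK_a-values), and the function name, its arguments and the returned labels are hypothetical. The 2-unit threshold is the one quoted in the text.

```python
def classify_hbond(donor_pka, acceptor_pka, charge_assisted, max_mismatch=2.0):
    """Rough H-bond class from pKa matching (illustrative simplification only).

    donor_pka       -- pKa of the donor group
    acceptor_pka    -- pKa of the conjugate acid of the acceptor group
    charge_assisted -- True if donor, acceptor or both carry a formal charge
    """
    if not charge_assisted:
        # Unassisted by charge: the common, weak, electrostatic H-bond
        return "normal (weak)"
    if abs(donor_pka - acceptor_pka) <= max_mismatch:
        # Charge-assisted and pKa-matched: strong and short
        return "charge-assisted (strong, short)"
    # Charge-assisted but pKa-mismatched by more than the threshold
    return "regular"


# Hypothetical pKa values chosen only to exercise the three branches
print(classify_hbond(15.0, -2.0, charge_assisted=False))
print(classify_hbond(4.5, 5.0, charge_assisted=True))
print(classify_hbond(4.5, 10.5, charge_assisted=True))
```

Even this crude rule needs the ionization state and the pK_a-value of both partners as input, which is exactly the information that most scoring functions do not evaluate.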

3. Water, Dynamics and Docking

Water has an important role in the biological environment, especially in the protein matrix [37,38]. The crucial role of water in the ligand binding process has long been acknowledged [39]. Water has an important role in ligand binding thermodynamics [40,41], even in the environment of a lipophilic binding cavity [42], and displacing specific water molecules from the binding site may play an important role in the ligand optimization process [43]. Moreover, water-related H-bonding networks have a significant influence on the structure-activity relationship [44], and optimizing the ligand while taking into account the surrounding water network may result in enhanced binding affinity and prolonged residence time [45]. The problem is that detailed information on how water is located within and around the ligand binding site is mostly unavailable. The most common tool for determining 3D structure, X-ray crystallography, can only provide partial information because the resolution and low-quality electron density limit water detection. Those water molecules which are detected by X-ray are often enthalpically stabilized [46]. In addition, crystallization conditions are typically far from the biologically relevant ones, and also the co-crystallized ligand molecule(s) may influence the observed hydration network (differently when compared to a docked ligand).

Easy application of water placement in docking is restricted because the water in the binding site is heterogeneous. In different locations, an individual water molecule has restricted rotational freedom and H-bonding capabilities. The terminology “happy” and “unhappy” water has been introduced to describe the individual water energies compared to bulk water [47]. Happy and unhappy water refer to low-energy and high-energy water, respectively. The unhappy water molecules within the binding site have either lost their degrees of freedom (entropic penalty) or they are incapable of fulfilling all possible H-bonds (enthalpic penalty), which results in higher energies compared to the bulk water. Therefore, displacing unhappy water molecules from the binding site with the ligand results in a gain in binding affinity [48]. On the other hand, displacing a happy water molecule from the site is typically unfavorable. Furthermore, not all regions within the binding sites are hydrated and occupied by water molecules [49,50]. Areas exist that are energetically so unfavorable for water to occupy that there is no water present; instead, they appear as dry void regions (also referred to as vacuum or dewetted regions) [51,52]. Occupying these regions with a ligand molecule results in both more favorable enthalpy and entropy of binding. The reason for this gain in binding affinity is the fact that the increased protein-ligand interaction surface results in stronger van der Waals interaction. In addition, filling the dewetted region increases entropy. In accordance with this, we have noticed that these vacuum sites played a significant role in determining the compound activity in our series of Autotaxin inhibitors [53].
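The distinction between happy and unhappy water can be sketched as simple bookkeeping over hydration sites. The example below assumes that each site has already been assigned an enthalpic and an entropic contribution relative to bulk water by some external analysis; the class, its fields, the labels and all numerical values are hypothetical and serve only to illustrate the sign convention described above.

```python
from dataclasses import dataclass


@dataclass
class HydrationSite:
    label: str
    delta_h: float          # enthalpy relative to bulk water, kcal/mol
    minus_t_delta_s: float  # entropic term (-T*dS) relative to bulk water, kcal/mol

    @property
    def delta_g(self):
        """Free energy of the site water relative to bulk water."""
        return self.delta_h + self.minus_t_delta_s


def classify(site, tolerance=0.0):
    """Label a site water as 'unhappy' (high energy), 'happy' (low energy) or 'bulk-like'."""
    if site.delta_g > tolerance:
        return "unhappy"
    if site.delta_g < -tolerance:
        return "happy"
    return "bulk-like"


# Hypothetical sites: one with both penalties, one enthalpically stabilized
sites = [
    HydrationSite("w1", delta_h=1.5, minus_t_delta_s=2.0),
    HydrationSite("w2", delta_h=-3.0, minus_t_delta_s=1.0),
]
for site in sites:
    print(site.label, f"{site.delta_g:+.1f} kcal/mol", classify(site))
```

In this picture, a ligand that displaces "w1" would be expected to gain affinity, whereas displacing "w2" would typically cost affinity unless the ligand replaces the interactions that the water made.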

Even though water has been acknowledged to play an important role in binding, de novo placement of water has not been explicitly included in docking methods, with the exception of Glide XP [10]. Glide XP includes terms for the hydrophobic enclosure, which promote the insertion of the lipophilic parts of the ligand in the protein’s lipophilic cavities, thereby simulating the displacement of potential high-energy water. Moreover, in this method, “virtual waters” are placed into the binding site by utilizing a grid-based methodology, and penalties are given for improperly solvated hydrophilic (polar or charged) groups and for water that makes an unusual number of hydrophobic contacts.

As already stated, docking is usually unable to provide a good estimate of the role of the solvation penalties related to the binding. As a result, several complementary computational methods have been developed to identify and analyze water molecules around the protein-ligand complex to estimate their role in binding. Different approaches have been reported and the most popular methods are reviewed by Bodnarchuk [54]. For instance, Schrödinger’s WaterMap uses a short MD simulation and derives the energies of the hydration sites from that simulation [48,50], whereas in the Molecular Operating Environment (MOE), the binding desolvation penalties can be estimated by the 3D reference interaction site model (3D-RISM), which is based on the density functional theory of liquids [55,56]. The main limitation of these methods is that they are heavily dependent on the protein conformation used in the calculation. To exemplify this, a parallel calculation of the hydration site energy with the same protein may produce totally different results, even if only a minor protein conformational change occurs or only a single side-chain conformation is altered. This limitation should be kept in mind when utilizing these methods, as for example, a conformational “induced-fit” effect upon ligand binding (via docking) might hamper the results [57]. Although these methods are now becoming increasingly popular and have demonstrated usefulness in explaining lead molecule structure–activity relationships [58], it is still unclear if these methods are applicable in virtual screening campaigns. One of the first attempts to include these computational approaches directly into scoring functions is WScore [11]. In WScore, a default WaterMap calculation with the apo-protein is utilized to gain insight into the hydration site positions and their corresponding energies. The occupancy of these hydration sites by a ligand is included in the scoring. Moreover, an ensemble docking is carried out that aims to take into account the protein flexibility, which, as mentioned above, is the major issue with WaterMap. The usefulness of WScore and other related methods remains to be seen. Furthermore, conventional MD simulations can be applied to evaluate the hydration networks; however, some errors related to force field accuracy may arise [59]. In a way, we agree with the statement by Hummer [60] that the contribution of water to ligand binding may be substantial but its evaluation is challenging.

One of the shortcomings of docking is that it produces only a snapshot of the putative binding conformation. This is a notable limitation, as in real life the binding event is not static but dynamic. For instance, we observed a good example of this in a study of 1-/2-monoacylglycerol hydrolysis by monoacylglycerol lipase (MAGL) [61]. Whereas the wild-type MAGL hydrolyzes both substrates at an identical rate, a C242A mutation in the active site impairs the hydrolysis of the 1-acylglycerol but not the 2-acylglycerol. This mutation had no effect on the binding conformations obtained by the docking; thus, docking was unable to provide an explanation for the observed difference in the hydrolysis among the substrates. However, in this case even short MD simulations were capable of highlighting the differences in the substrate binding dynamics that arose due to the mutation.

Perhaps due to the fact that docking is currently unable to consider the impact of water and the dynamic nature of binding, applying MD simulations for the docking pose validation has attracted growing interest in the scientific community [62,63,64,65,66,67]. This is probably also due to increased accessibility to adequate computing resources (e.g., GPUs) that are required for simulations with a reasonable time-scale. Another factor that has made MD simulations more relevant is the improvement in the force fields that are now capable of handling both small molecules and proteins with reasonable accuracy. These improvements have led to more relevant observations from the simulations. Interestingly, MD simulations appear to provide the solution to the two issues that docking is incapable of handling—water and the dynamics.

4. Solution

In this opinion paper, we have exemplified the underlying issues in predicting binding affinity via docking. The main issues are related to the H-bonding and the water description, and how water and the protein-ligand complex should be considered as a dynamic system. While describing the H-bond is clearly an issue, we should also acknowledge that this has already been quite well described in modern force fields. For example, the new OPLS3 and recent AMBER (Assisted Model Building with Energy Refinement) and CHARMM (Chemistry at Harvard Macromolecular Mechanics) force fields include a better H-bonding description [28,68,69]. Additionally, MD is becoming an increasingly robust method to study individual protein-ligand complexes. Unfortunately, the computational costs of MD are still too high to allow virtual screening.

What can be done to increase the accuracy of the binding affinity prediction? With current methods, resolving this issue is extremely challenging. For H-bonding, it is feasible to include a more precise energy evaluation method that would allow recognition and differentiation of the strong and weak H-bonds. However, this requires a fast and reliable pK_a-value calculation that also considers conformational and environmental aspects of the binding cavity. Furthermore, due to the active role of the water in binding, it is obvious that water needs to be explicitly included in the docking process. All the current evidence contradicts docking in the gas phase. WaterMap and other related methods have partially resolved this issue but a more comprehensive solution is required. Finally, implementation of dynamics in scoring functions remains challenging. In future, scoring functions need to be reinvented so that they are able to describe the dynamics related to the binding. Overall, new approaches are required to address the issues discussed above.

Our current solution is based on two comprehensive approaches: one is to use docking tools in more efficient ways [53,70], and the other is to use MD simulations to validate the results of the classical docking [71,72]. Prior to any docking experiment, one should explore the flexibility of the target protein, based on both the existing protein structures and MD simulations. At the same time, it is of utmost importance to determine the solvation status of the binding cavity and the energy levels of the potentially happy and unhappy water. Subsequently, this information is further applied in docking by utilizing suitable constraints. This approach can help us to identify more reliable binding poses. Finally, the most promising poses are further analyzed by short (usually 200 ns) MD simulations, followed by WaterMap analysis. However, our approach has two major shortcomings: it is slow and difficult to implement. These shortcomings are tolerable, as long as we have sufficient computing resources and an adequate amount of time to work with the target. We resolve the H-bond issue by estimating the pK_a-values with different computational methods (e.g., QM-polarized docking). Lastly, even after implementing all of these user-based interventions, we always use the most sophisticated scoring function, the eye. If you trust your docking pose, you might be right.

Author Contributions

Writing-Original Draft Preparation, T.P. and A.P.; Writing-Review & Editing, T.P. and A.P.; Funding Acquisition, A.P.

Funding

This research was funded by the Academy of Finland (grant number 276509 to A.P.).

Acknowledgments

We would like to thank Thales Kronenberger for the critical reading and useful comments on the manuscript, and CSC-IT Center for Science Ltd. (Espoo, Finland) for computational resources.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

References

1. Kuntz, I.; Blaney, J.; Oatley, S.; Langridge, R.; Ferrin, T. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 1982, 161, 269–288.
2. Van Zundert, G.C.P.; Rodrigues, J.P.G.L.M.; Trellet, M.; Schmitz, C.; Kastritis, P.L.; Karaca, E.; Melquiond, A.S.J.; van Dijk, M.; de Vries, S.J.; Bonvin, A.M.J.J. The HADDOCK2.2 Web Server: User-Friendly Integrative Modeling of Biomolecular Complexes. J. Mol. Biol. 2016, 428, 720–725.
3. Jones, G.; Willett, P.; Glen, R.C. Molecular recognition of receptor sites using a genetic algorithm with a description of desolvation. J. Mol. Biol. 1995, 245, 43–53.
4. Jones, G.; Willett, P.; Glen, R.; Leach, A.; Taylor, R. Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 1997, 267, 727–748.
5. Verdonk, M.; Cole, J.; Hartshorn, M.; Murray, C.; Taylor, R. Improved protein-ligand docking using GOLD. Proteins 2003, 52, 609–623.
6. Jain, A.N. Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular Similarity-Based Search Engine. J. Med. Chem. 2003, 46, 499–511.
7. Trott, O.; Olson, A.J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010, 31, 455–461.
8. Friesner, R.A.; Banks, J.L.; Murphy, R.B.; Halgren, T.A.; Klicic, J.J.; Mainz, D.T.; Repasky, M.P.; Knoll, E.H.; Shaw, D.E.; Shelley, M.; et al. Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy. J. Med. Chem. 2004, 47, 1739–1749.
9. Halgren, T.A.; Murphy, R.B.; Friesner, R.A.; Beard, H.S.; Frye, L.L.; Pollard, W.T.; Banks, J.L. Glide: A New Approach for Rapid, Accurate Docking and Scoring. 2. Enrichment Factors in Database Screening. J. Med. Chem. 2004, 47, 1750–1759.
10. Friesner, R.A.; Murphy, R.B.; Repasky, M.P.; Frye, L.L.; Greenwood, J.R.; Halgren, T.A.; Sanschagrin, P.C.; Mainz, D.T. Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein-Ligand Complexes. J. Med. Chem. 2006, 49, 6177–6196.
11. Murphy, R.; Repasky, M.; Greenwood, J.; Tubert-Brohman, I.; Jerome, S.; Annabhimoju, R.; Boyles, N.; Schmitz, C.; Abel, R.; Farid, R.; et al. WScore: A Flexible and Accurate Treatment of Explicit Water Molecules in Ligand-Receptor Docking. J. Med. Chem. 2016, 59, 4364–4384.
12. Hu, X.; Maffucci, I.; Contini, A. Advances in the Treatment of Explicit Water Molecules in Docking and Binding Free Energy Calculations. Curr. Med. Chem. 2018, 25, 1–23.
13. Chen, Y.C. Beware of docking! Trends Pharmacol. Sci. 2015, 36, 78–95.
14. Amaro, R.E.; Baudry, J.; Chodera, J.; Demir, Ö.; McCammon, J.A.; Miao, Y.; Smith, J.C. Ensemble Docking in Drug Discovery. Biophys. J. 2018, 114, 2271–2278.
15. Raha, K.; Peters, M.B.; Wang, B.; Yu, N.; Wollacott, A.M.; Westerhoff, L.M.; Merz, K.M. The role of quantum mechanics in structure-based drug design. Drug Discov. Today 2007, 12, 725–731.
16. Cheng, T.; Li, X.; Li, Y.; Liu, Z.; Wang, R. Comparative assessment of scoring functions on a diverse test set. J. Chem. Inf. Model. 2009, 49, 1079–1093.
17. Li, Y.; Han, L.; Liu, Z.; Wang, R. Comparative Assessment of Scoring Functions on an Updated Benchmark: 2. Evaluation Methods and General Results. J. Chem. Inf. Model. 2014, 54, 1717–1736.
18. Pagadala, N.; Syed, K.; Tuszynski, J. Software for molecular docking: A review. Biophys. Rev. 2017, 9, 91–102.
19. Adeniyi, A.A.; Soliman, M.E.S. Implementing QM in docking calculations: Is it a waste of computational time? Drug Discov. Today 2017, 22, 1216–1223.
20. Gioia, D.; Bertazzo, M.; Recanatini, M.; Masetti, M.; Cavalli, A. Dynamic Docking: A Paradigm Shift in Computational Drug Discovery. Molecules 2017, 22.
21. Warren, G.; Andrews, C.; Capelli, A.-M.; Clarke, B.; LaLonde, J.; Lambert, M.; Lindvall, M.; Nevins, N.; Semus, S.; Senger, S.; et al. A Critical Assessment of Docking Programs and Scoring Functions. J. Med. Chem. 2006, 49, 5912–5931.
22. Dixon, J.S. Evaluation of the CASP2 docking section. Proteins 1997, 29 (Suppl. S1), 198–204.
23. Kitchen, D.; Decornez, H.; Furr, J.; Bajorath, J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nat. Rev. Drug Discov. 2004, 3, 935–949.
24. Morris, G.; Goodsell, D.; Halliday, R.; Huey, R.; Hart, W.; Belew, R.; Olson, A. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem. 1998, 19, 1639–1662.
25. Wesson, L.; Eisenberg, D. Atomic solvation parameters applied to molecular dynamics of proteins in solution. Protein Sci. 1992, 1, 227–235.
26. Gohlke, H.; Hendlich, M.; Klebe, G. Knowledge-based scoring function to predict protein-ligand interactions. J. Mol. Biol. 2000, 295, 337–356.
27. Ajay; Murcko, M.A. Computational Methods to Predict Binding Free Energy in Ligand-Receptor Complexes. J. Med. Chem. 1995, 38, 4953–4967.
28. Harder, E.; Damm, W.; Maple, J.; Wu, C.; Reboul, M.; Xiang, J.Y.; Wang, L.; Lupyan, D.; Dahlgren, M.K.; Knight, J.L.; et al. OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins. J. Chem. Theory Comput. 2016, 12, 281–296.
29. Shoichet, B.; Kuntz, I.; Bodian, D. Molecular docking using shape descriptors. J. Comput. Chem. 1992, 13, 380–397.
30. Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. A fast flexible docking method using an incremental construction algorithm. J. Mol. Biol. 1996, 261, 470–489.
31. Raschka, S.; Wolf, A.; Bemister-Buffington, J.; Kuhn, L. Protein-ligand interfaces are polarized: Discovery of a strong trend for intermolecular hydrogen bonds to favor donors on the protein side with implications for predicting and designing ligand complexes. J. Comput. Aided Mol. Des. 2018, 32, 511–528.
32. Gilli, P.; Pretto, L.; Bertolasi, V.; Gilli, G. Predicting Hydrogen-Bond Strengths from Acid-Base Molecular Properties. The pKa Slide Rule: Toward the Solution of a Long-Lasting Problem. Acc. Chem. Res. 2009, 42, 33–44.
33. Gilli, P.; Gilli, G. Hydrogen bond models and theories: The dual hydrogen bond model and its consequences. J. Mol. Struct. 2010, 972, 2–10.
34. Kilambi, K.; Gray, J. Rapid Calculation of Protein pKa Values Using Rosetta. Biophys. J. 2012, 103, 587–595.
35. Song, Y.; Mao, J.; Gunner, M. MCCE2: Improving protein pKa calculations with extensive side chain rotamer sampling. J. Comput. Chem. 2009, 30, 2231–2247.
36. Shelley, J.C.; Cholleti, A.; Frye, L.; Greenwood, J.R.; Timlin, M.R.; Uchimaya, M. Epik: A software program for pKa prediction and protonation state generation for drug-like molecules. J. Comput. Aided Mol. Des. 2007, 21, 681–691.
37. Ball, P. Water is an active matrix of life for cell and molecular biology. Proc. Natl. Acad. Sci. USA 2017, 114, 13327–13335.
38. Spyrakis, F.; Ahmed, M.H.; Bayden, A.S.; Cozzini, P.; Mozzarelli, A.; Kellogg, G.E. The Roles of Water in the Protein Matrix: A Largely Untapped Resource for Drug Discovery. J. Med. Chem. 2017, 60, 6781–6827.
39. Ladbury, J.E. Just add water! The effect of water on the specificity of protein-ligand binding sites and its potential application to drug design. Chem. Biol. 1996, 3, 973–980.
40. Snyder, P.W.; Mecinovic, J.; Moustakas, D.T.; Thomas, S.W., 3rd; Harder, M.; Mack, E.T.; Lockett, M.R.; Héroux, A.; Sherman, W.; Whitesides, G.M. Mechanism of the hydrophobic effect in the biomolecular recognition of arylsulfonamides by carbonic anhydrase. Proc. Natl. Acad. Sci. USA 2011, 108, 17889–17894.
41. Breiten, B.; Lockett, M.R.; Sherman, W.; Fujita, S.; Al-Sayah, M.; Lange, H.; Bowers, C.M.; Heroux, A.; Krilov, G.; Whitesides, G.M. Water networks contribute to enthalpy/entropy compensation in protein-ligand binding. J. Am. Chem. Soc. 2013, 135, 15579–15584.
42. Baron, R.; Setny, P.; McCammon, A. Water in Cavity-Ligand Recognition. J. Am. Chem. Soc. 2010, 132, 12091–12097.
43. Michel, J.; Tirado-Rives, J.; Jorgensen, W.L. Energetics of Displacing Water Molecules from Protein Binding Sites: Consequences for Ligand Optimization. J. Am. Chem. Soc. 2009, 131, 15403–15411.
44. Biela, A.; Nasief, N.N.; Betz, M.; Heine, A.; Hangauer, D.; Klebe, G. Dissecting the hydrophobic effect on the molecular level: The role of water, enthalpy, and entropy in ligand binding to thermolysin. Angew. Chem. Int. Ed. Engl. 2013, 52, 1822–1828.
45. Krimmer, S.G.; Cramer, J.; Betz, M.; Fridh, V.; Karlsson, R.; Heine, A.; Klebe, G. Rational Design of Thermodynamic and Kinetic Binding Profiles by Optimizing Surface Water Networks Coating Protein-Bound Ligands. J. Med. Chem. 2016, 59, 10530–10548.
46. Beuming, T.; Che, Y.; Abel, R.; Kim, B.; Shanmugasundaram, V.; Sherman, W. Thermodynamic analysis of water molecules at the surface of proteins and applications to binding site prediction and characterization. Proteins 2012, 80, 871–883.
47. Mason, J.S.; Bortolato, A.; Congreve, M.; Marshall, F.H. New insights from structural biology into the druggability of G protein-coupled receptors. Trends Pharmacol. Sci. 2012, 33, 249–260.
48. Abel, R.; Young, T.; Farid, R.; Berne, B.J.; Friesner, R.A. The role of the active site solvent in the thermodynamics of factor Xa-ligand binding. J. Am. Chem. Soc. 2008, 130, 2817–2831.
49. Homans, S.W. Water, water everywhere - except where it matters? Drug Discov. Today 2007, 12, 534–539.
50. Young, T.; Abel, R.; Kim, B.; Berne, B.J.; Friesner, R.A. Motifs for molecular recognition exploiting hydrophobic enclosure in protein-ligand binding. Proc. Natl. Acad. Sci. USA 2007, 104, 808–813.
51. Young, T.; Hua, L.; Huang, X.; Abel, R.; Friesner, R.; Berne, B.J. Dewetting Transitions in Protein Cavities. Proteins 2010, 78, 1856–1869.
52. Wang, L.; Berne, B.J.; Friesner, R.A. Ligand binding to protein-binding pockets with wet and dry regions. Proc. Natl. Acad. Sci. USA 2011, 108, 1326–1330.
53. Pantsar, T.; Singha, P.; Nevalainen, T.J.; Koshevoy, I.; Leppänen, J.; Poso, A.; Niskanen, J.M.A.; Pasonen-Seppänen, S.; Savinainen, J.R.; Laitinen, T.; et al. Design, synthesis, and biological evaluation of 2,4-dihydropyrano[2,3-c]pyrazole derivatives as autotaxin inhibitors. Eur. J. Pharm. Sci. 2017, 107, 97–111.
54. Bodnarchuk, M.S. Water, water, everywhere... It’s time to stop and think. Drug Discov. Today 2016, 21, 1139–1146.
55. Kovalenko, A.; Hirata, F. Self-consistent description of a metal-water interface by the Kohn-Sham density functional theory and the three-dimensional reference interaction site model. J. Chem. Phys. 1999, 110, 10095–10112.
56. Luchko, T.; Gusarov, S.; Roe, D.R.; Simmerling, C.; Case, D.A.; Tuszynski, J.; Kovalenko, A. Three-Dimensional Molecular Theory of Solvation Coupled with Molecular Dynamics in Amber. J. Chem. Theory Comput. 2010, 6, 607–624.
57. Pearlstein, R.; Sherman, W.; Abel, R. Contributions of water transfer energy to protein-ligand association and dissociation barriers: Watermap analysis of a series of p38α MAP kinase inhibitors. Proteins 2013, 81, 1509–1526.
58. Bucher, D.; Stouten, P.; Triballeau, N. Shedding Light on Important Waters for Drug Design: Simulations versus Grid-Based Methods. J. Chem. Inf. Model. 2018, 58, 692–699.
59. Betz, M.; Wulsdorf, T.; Krimmer, S.G.; Klebe, G. Impact of Surface Water Layers on Protein-Ligand Binding: How Well Are Experimental Data Reproduced by Molecular Dynamics Simulations in a Thermolysin Test Case? J. Chem. Inf. Model. 2016, 56, 223–233.
60. Hummer, G. Molecular binding: Under water’s influence. Nat. Chem. 2010, 2, 906–907.
61. Laitinen, T.; Navia-Paldanius, D.; Rytilahti, R.; Marjamaa, J.J.; Kařízková, J.; Parkkari, T.; Pantsar, T.; Poso, A.; Laitinen, J.T.; Savinainen, J.R. Mutation of Cys242 of human monoacylglycerol lipase disrupts balanced hydrolysis of 1- and 2-monoacylglycerols and selectively impairs inhibitor potency. Mol. Pharmacol. 2014, 85, 510–519.
62. Alonso, H.; Bliznyuk, A.A.; Gready, J.E. Combining docking and molecular dynamic simulations in drug design. Med. Res. Rev. 2006, 26, 531–568.
63. Bartuzi, D.; Kaczor, A.A.; Targowska-Duda, K.M.; Matosiuk, D. Recent Advances and Applications of Molecular Docking to G Protein-Coupled Receptors. Molecules 2017, 22.
64. Cavalli, A.; Bottegoni, G.; Raco, C.; De Vivo, M.; Recanatini, M. A computational study of the binding of propidium to the peripheral anionic site of human acetylcholinesterase. J. Med. Chem. 2004, 47, 3991–3999.
65. Colizzi, F.; Perozzo, R.; Scapozza, L.; Recanatini, M.; Cavalli, A. Single-Molecule Pulling Simulations Can Discern Active from Inactive Enzyme Inhibitors. J. Am. Chem. Soc. 2010, 132, 7361–7371.
66. Ruiz-Carmona, S.; Schmidtke, P.; Luque, F.J.; Baker, L.; Matassova, N.; Davis, B.; Roughley, S.; Murray, J.; Hubbard, R.; Barril, X. Dynamic undocking and the quasi-bound state as tools for drug discovery. Nat. Chem. 2017, 9, 201–206.
67. Sabbadin, D.; Ciancetta, A.; Moro, S. Bridging Molecular Docking to Membrane Molecular Dynamics to Investigate GPCR-Ligand Recognition: The Human A2A Adenosine Receptor as a Key Study. J. Chem. Inf. Model. 2014, 54, 169–183.
68. Cerutti, D.; Rice, J.; Swope, W.; Case, D. Derivation of Fixed Partial Charges for Amino Acids Accommodating a Specific Water Model and Implicit Polarization. J. Phys. Chem. B 2013, 117, 2328–2338.
69. Best, R.; Zhu, X.; Shim, J.; Lopes, P.; Mittal, J.; Feig, M.; MacKerell, A. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. J. Chem. Theory Comput. 2012, 8, 3257–3273.
70. Käsnänen, H.; Myllymäki, M.; Minkkilä, A.; Kataja, A.; Saario, S.; Nevalainen, T.; Koskinen, A.; Poso, A. 3-Heterocycle-Phenyl N-Alkylcarbamates as FAAH Inhibitors: Design, Synthesis and 3D-QSAR Studies. ChemMedChem 2010, 5, 213–231.
71. Jyrkkärinne, J.; Küblbeck, J.; Pulkkinen, J.; Honkakoski, P.; Laatikainen, R.; Poso, A.; Laitinen, T. Molecular dynamics simulations for human CAR inverse agonists. J. Chem. Inf. Model. 2012, 52, 457–464.
72. Küblbeck, J.; Jyrkkärinne, J.; Molnár, F.; Kuningas, T.; Patel, J.; Windshügel, B.; Nevalainen, T.; Laitinen, T.; Sippl, W.; Poso, A.; et al. New in Vitro Tools to Study Human Constitutive Androstane Receptor (CAR) Biology: Discovery and Comparison of Human CAR Inverse Agonists. Mol. Pharm. 2011, 8, 2424–2433.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
