Application of InterCriteria Analysis to Assess the Performance of Scoring Functions in Molecular Docking Software Packages
Abstract
1. Introduction
2. Materials and Methods
2.1. InterCriteria Analysis Approach
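$$
A = \begin{array}{c|ccccc}
 & C_1 & \cdots & C_k & \cdots & C_n \\ \hline
O_1 & a_{O_1,C_1} & \cdots & a_{O_1,C_k} & \cdots & a_{O_1,C_n} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
O_i & a_{O_i,C_1} & \cdots & a_{O_i,C_k} & \cdots & a_{O_i,C_n} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
O_m & a_{O_m,C_1} & \cdots & a_{O_m,C_k} & \cdots & a_{O_m,C_n}
\end{array}
$$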
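- $S^{\mu}_{k,l}$ counts the cases where the relations $R(a_{O_i,C_k}, a_{O_j,C_k})$ and $R(a_{O_i,C_l}, a_{O_j,C_l})$ are identical, where $a_{O_i,C_k}$ denotes the evaluation of object $O_i$ against criterion $C_k$;
- $S^{\nu}_{k,l}$ counts the cases where the relation $R(a_{O_i,C_k}, a_{O_j,C_k})$ is the one dual to the relation $R(a_{O_i,C_l}, a_{O_j,C_l})$;
- $\mu_{C_k,C_l} = \dfrac{2 S^{\mu}_{k,l}}{m(m-1)}$, called, in terms of ICrA, the degree of agreement;
- $\nu_{C_k,C_l} = \dfrac{2 S^{\nu}_{k,l}}{m(m-1)}$, called the degree of disagreement.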
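The counting described above can be illustrated with a short Python sketch. It is a minimal illustration, not the ICrAData implementation: the function names are hypothetical, the relation is taken to be the ordinary "greater than" comparison, the counts are normalized by the number of object pairs m(m−1)/2, and ties in either criterion are assumed to contribute to neither count (several alternative tie-handling algorithms exist).

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def icra_pair(col_k, col_l):
    """Degrees of agreement and disagreement between two criteria.

    col_k and col_l hold the evaluations of the same m objects against
    criteria C_k and C_l. Ties are assumed to count as uncertainty.
    """
    m = len(col_k)
    if m != len(col_l) or m < 2:
        raise ValueError("both criteria need the same m >= 2 objects")

    s_mu = 0  # pairs of objects ordered the same way by both criteria
    s_nu = 0  # pairs of objects ordered in opposite ways
    for i, j in combinations(range(m), 2):
        rel_k = sign(col_k[i] - col_k[j])
        rel_l = sign(col_l[i] - col_l[j])
        if rel_k == 0 or rel_l == 0:
            continue  # assumption: a tie adds to neither count
        if rel_k == rel_l:
            s_mu += 1
        else:
            s_nu += 1

    total_pairs = m * (m - 1) // 2
    return s_mu / total_pairs, s_nu / total_pairs
```

With this tie convention the two degrees sum to at most 1, and the remainder can be read as the degree of uncertainty.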
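$$
\begin{array}{c|ccccc}
 & C_1 & \cdots & C_k & \cdots & C_n \\ \hline
C_1 & \langle \mu_{C_1,C_1}, \nu_{C_1,C_1} \rangle & \cdots & \langle \mu_{C_1,C_k}, \nu_{C_1,C_k} \rangle & \cdots & \langle \mu_{C_1,C_n}, \nu_{C_1,C_n} \rangle \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
C_k & \langle \mu_{C_k,C_1}, \nu_{C_k,C_1} \rangle & \cdots & \langle \mu_{C_k,C_k}, \nu_{C_k,C_k} \rangle & \cdots & \langle \mu_{C_k,C_n}, \nu_{C_k,C_n} \rangle \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
C_n & \langle \mu_{C_n,C_1}, \nu_{C_n,C_1} \rangle & \cdots & \langle \mu_{C_n,C_k}, \nu_{C_n,C_k} \rangle & \cdots & \langle \mu_{C_n,C_n}, \nu_{C_n,C_n} \rangle
\end{array}
$$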
- Positive consonance, whenever $\mu_{C_k,C_l}$ > α and $\nu_{C_k,C_l}$ < β;
- Negative consonance, whenever $\mu_{C_k,C_l}$ < β and $\nu_{C_k,C_l}$ > α;
- Dissonance, otherwise.
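The three rules above can be expressed as a small Python function. This is only a sketch with a hypothetical function name; the default thresholds α = 0.75 and β = 0.25 are the ICrAData defaults quoted in Section 2.2, and the inequalities are kept strict, as in the rules.

```python
def classify_pair(mu, nu, alpha=0.75, beta=0.25):
    """Label a pair of criteria from its degrees of agreement (mu)
    and disagreement (nu), using strict threshold comparisons."""
    if mu > alpha and nu < beta:
        return "positive consonance"
    if mu < beta and nu > alpha:
        return "negative consonance"
    return "dissonance"
```

Because the comparisons are strict, a pair that sits exactly on a threshold (for example, a degree of agreement equal to α) falls into dissonance under this sketch.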
2.2. Software Implementation of ICrA
- The ability to load comma-separated value (CSV) files with headers by row and column, which are taken as object and criteria names, respectively. This allows loading data from any computer program able to output tables in CSV format.
- A functionality that allows user definition of the thresholds α and β. The default values set in ICrAData are α = 0.75 and β = 0.25, and they were used in the current investigation.
- To visualize the results better, cell colors were introduced in accordance with the rules below and the user-defined thresholds α and β:
  - In the case of positive consonance, the results are colored in green;
  - In the case of negative consonance, the results are colored in red;
  - Otherwise, when there is dissonance, the results are colored in magenta.
- ICrAData automatically records the results every 15 min and when exiting the program in order to prevent overwriting or accidental loss of data.
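The input layout and the color rules listed above can be mimicked in a few lines of Python. This is a hedged sketch, not code from ICrAData: the function name, the dictionary, and the assumption that the top-left CSV cell is an ignored corner label are all illustrative.

```python
import csv
import io

# Cell colors for each ICrA outcome, following the rules listed above.
CELL_COLORS = {
    "positive consonance": "green",
    "negative consonance": "red",
    "dissonance": "magenta",
}


def load_icra_table(csv_text):
    """Parse CSV text whose first row holds criteria names and whose
    first column holds object names (the corner cell is ignored).

    Returns (objects, criteria, values) with values[i][k] being the
    evaluation of object i against criterion k.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    criteria = rows[0][1:]
    objects = [row[0] for row in rows[1:]]
    values = [[float(cell) for cell in row[1:]] for row in rows[1:]]
    return objects, criteria, values
```

A table exported by a docking program would then be read with `load_icra_table(open(path).read())`, where `path` is whatever file the user exported.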
2.3. Dataset
2.4. Investigated Scoring Functions
- Empirical: Use training sets of X-ray protein–ligand complexes and multiple linear regression as a statistical method to derive the equation. The equation terms describe important binding interactions, such as hydrogen bonds (H-bonds), ionic and hydrophobic interactions, loss of ligand flexibility (entropy), etc.
- Force-field based: Rely on classical force fields for proteins and use Lennard-Jones and Coulomb potentials to describe the enthalpy terms. The entropy terms are often missing.
- Knowledge-based (also known as Potential of Mean Force): Involve distance-dependent interaction potentials derived through statistical analyses of a large number of crystal structures of protein–ligand complexes. Such functions describe the interactions between each pair of ligand–protein atoms based on the structural information of the X-ray protein–ligand coordinates. In contrast to the two types above, knowledge-based functions do not have a physical interpretation.
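To make the force-field category more concrete, the sketch below evaluates the textbook 12-6 Lennard-Jones and Coulomb terms for a single atom pair. It is a generic illustration under stated assumptions and does not reproduce the scoring function of any package discussed here: the function names, the parameter names, and the choice of a constant relative permittivity are hypothetical.

```python
# Coulomb constant in kcal·Å/(mol·e^2), a commonly used approximate value.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy for two atoms at distance r (Å).

    epsilon is the well depth (kcal/mol); sigma is the distance (Å)
    at which the energy crosses zero.
    """
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def coulomb(r, q1, q2, relative_permittivity=1.0):
    """Coulomb energy (kcal/mol) of two point charges (in units of e)."""
    return COULOMB_CONSTANT * q1 * q2 / (relative_permittivity * r)


def pair_energy(r, epsilon, sigma, q1, q2):
    """Sum of the two enthalpy-like terms for one ligand-protein atom pair."""
    return lennard_jones(r, epsilon, sigma) + coulomb(r, q1, q2)
```

A force-field score would sum such pair terms over all ligand–protein atom pairs; note that no entropy contribution appears, in line with the remark above.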
- MOE v. 2016.08 (Molecular Operating Environment, The Chemical Computing Group http://www.chemcomp.com (accessed on 3 June 2022)) operates with five scoring functions, four of which are empirical (London dG, ASE, Affinity dG, and Alpha HB) and one of which is a force-field scoring function (GBVI/WSA dG) [26].
- GOLD v. 5.6.3 (Genetic Optimization for Ligand Docking, The Cambridge Crystallographic Data Center https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/ (accessed on 3 June 2022)) operates with four scoring functions, two of which are empirical (ChemPLP and ChemScore), one is knowledge-based (ASP), and one is a force-field scoring function (GoldScore) [26].
- SeeSAR v. 4.3 (BioSolveIT, https://www.biosolveit.de/SeeSAR (accessed on 3 June 2022)) operates with two empirical scoring functions: HYDE [27,28] and FlexX [29].
- AutoDock Vina v. 1.2.0 (The Scripps Research Institute, https://vina.scripps.edu (accessed on 3 June 2022)) implements a scoring function of a mixed type (knowledge-based and empirical) [30,31].
3. Results and Discussion
3.1. Applied ICrA to Assess the Performance of the Scoring Functions in MOE
- (i) Binding energies calculated by different scoring functions as an indicator of protein–ligand binding affinity (for all scoring functions, lower scores indicate more favorable poses; the unit is kcal/mol):
  - The value of the binding energy for the best out of 30 saved docking poses, for each of the tested compounds;
  - The average value of the binding energies of the best 5 and 10 poses (bearing in mind that the pose with the lowest energy is not always the bioactive one), for each of the tested compounds;
  - The average value of the binding energies of the 30 saved docking poses, for each of the tested compounds.
- (ii) RMSD between the atoms in the benzamidine substructure (nine heavy atoms) of the docked ligands and the matching atoms in the co-crystallized ligands (NAPAP in 1ETS and 4-TAPAP in 1PPH) in the crystallographic structures of the PDB complexes, as an indicator of the geometric proximity in the pose prediction.
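The criteria listed above can be derived from raw docking output along the following lines. This is a minimal sketch with hypothetical function names: it assumes the pose energies of one compound are given as a plain list in kcal/mol, that "best" means lowest energy, and that the RMSD is computed directly on already matched atom coordinates without any superposition (docked and crystallographic ligands are assumed to share the receptor's coordinate frame).

```python
import math


def pose_energy_criteria(energies):
    """Summarize the saved pose energies of one compound.

    Lower energies are more favorable, so poses are ranked ascending.
    """
    ranked = sorted(energies)

    def mean(values):
        return sum(values) / len(values)

    return {
        "best": ranked[0],
        "mean_best_5": mean(ranked[:5]),
        "mean_best_10": mean(ranked[:10]),
        "mean_all": mean(ranked),
    }


def rmsd(coords_docked, coords_reference):
    """Root-mean-square deviation between matched atoms.

    Each argument is a sequence of (x, y, z) tuples in the same order,
    e.g. the nine benzamidine heavy atoms of the docked and the
    co-crystallized ligand.
    """
    if len(coords_docked) != len(coords_reference) or not coords_docked:
        raise ValueError("need two non-empty, equally long atom lists")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_docked, coords_reference)
    )
    return math.sqrt(squared / len(coords_docked))
```

Each returned value then becomes one column (criterion) of the ICrA input table, with the tested compounds as rows (objects).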
3.2. Applied ICrA to Assess the Performance of the Scoring Functions in GOLD
- (i) Binding energies calculated by different scoring functions as an indicator of protein–ligand binding affinity:
  - For the best docking poses for each of the tested compounds;
  - The average values of the binding energies of the best 5 and 10 poses;
  - The average values of the binding energies of the 30 saved docking poses.
- (ii) Three running modes concerning the GOLD algorithm’s speed:
  - Slow: the slowest, but the most accurate one;
  - Fast: the fastest, but the least accurate one;
  - Medium: the intermediate one.
3.3. Applied ICrA to Assess the Performance of the Scoring Functions in SeeSAR
3.4. Applied ICrA to Assess the Performance of the Scoring Function in AutoDock Vina
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Höltje, H.-D. (Ed.) Molecular Modeling: Basic Principles and Applications, 3rd ed.; Wiley-VCH: Weinheim, Germany, 2008.
2. Cheng, T.; Li, X.; Li, Y.; Liu, Z.; Wang, R. Comparative Assessment of Scoring Functions on a Diverse Test Set. J. Chem. Inf. Model. 2009, 49, 1079–1093.
3. Li, Y.; Han, L.; Liu, Z.; Wang, R. Comparative Assessment of Scoring Functions on an Updated Benchmark: 2. Evaluation Methods and General Results. J. Chem. Inf. Model. 2014, 54, 1717–1736.
4. Su, M.; Yang, Q.; Du, Y.; Feng, G.; Liu, Z.; Li, Y.; Wang, R. Comparative Assessment of Scoring Functions: The CASF-2016 Update. J. Chem. Inf. Model. 2019, 59, 895–913.
5. Khamis, M.A.; Gomaa, W. Comparative Assessment of Machine-learning Scoring Functions on PDBbind 2013. Eng. Appl. Artif. Intell. 2015, 45, 136–151.
6. Xu, W.; Lucke, A.J.; Fairlie, D.P. Comparing Sixteen Scoring Functions for Predicting Biological Activities of Ligands for Protein Targets. J. Mol. Graph. Model. 2015, 57, 76–88.
7. Wang, Z.; Sun, H.; Yao, X.; Li, D.; Xu, L.; Li, Y.; Tian, S.; Hou, T. Comprehensive Evaluation of Ten Docking Programs on a Diverse Set of Protein-ligand Complexes: The Prediction Accuracy of Sampling Power and Scoring Power. Phys. Chem. Chem. Phys. 2016, 18, 12964–12975.
8. Stanzione, F.; Giangreco, I.; Cole, J.C. Use of Molecular Docking Computational Tools in Drug Discovery. Prog. Med. Chem. 2021, 60, 273–343.
9. Atanassov, K.; Mavrov, D.; Atanassova, V. Intercriteria Decision Making: A New Approach for Multicriteria Decision Making, Based on Index Matrices and Intuitionistic Fuzzy Sets. In Issues in Intuitionistic Fuzzy Sets and Generalized Nets; Wydawnictwo WIT: Piwniczna-Zdrój, Poland, 2014; Volume 11, pp. 1–8.
10. Atanassov, K. Generalized index matrices. C. R. Acad. Bulg. Sci. 1987, 40, 15–18.
11. Atanassov, K. Index Matrices: Towards an Augmented Matrix Calculus; Studies in Computational Intelligence; Springer International Publishing: Cham, Switzerland, 2014; Volume 573.
12. Atanassov, K. Intuitionistic Fuzzy Logics; Springer: Cham, Switzerland, 2017.
13. Zadeh, L.A. Fuzzy Sets. Inf. Control 1965, 8, 338–353.
14. Jekova, I.; Vassilev, P.; Stoyanov, T.; Pencheva, T. InterCriteria Analysis: Application for ECG Data Analysis. Mathematics 2021, 9, 854.
15. Ilkova, T.; Petrov, M. InterCriteria analysis for evaluation of the pollution of the Struma River in the Bulgarian section. Notes Intuit. Fuzzy Sets 2016, 22, 120–130.
16. Roeva, O.; Fidanova, S. Comparison of Different Metaheuristic Algorithms Based on InterCriteria Analysis. J. Comput. Appl. Math. 2018, 340, 615–628.
17. Krawczak, M.; Bureva, V.; Sotirova, E.; Szmidt, E. Application of the intercriteria decision making method to universities ranking. Adv. Intell. Syst. Comput. 2016, 401, 365–372.
18. Tsakovska, I.; Alov, P.; Ikonomov, N.; Atanassova, V.; Vassilev, P.; Roeva, O.; Jereva, D.; Atanassov, K.; Pajeva, I.; Pencheva, T. Intercriteria analysis implementation for exploration of the performance of various docking scoring functions. Stud. Comput. Intell. 2021, 902, 88–98.
19. Atanassov, K.; Szmidt, E.; Kacprzyk, J. On intuitionistic fuzzy pairs. Notes Intuit. Fuzzy Sets 2013, 19, 1–13.
20. Atanassov, K.; Atanassova, V.; Gluhchev, G. Intercriteria analysis: Ideas and problems. Notes Intuit. Fuzzy Sets 2015, 21, 81–88.
21. Roeva, O.; Vassilev, P.; Ikonomov, N.; Angelova, M.; Su, J.; Pencheva, T. On Different Algorithms for InterCriteria Relations Calculation. In Intuitionistic Fuzziness and Other Intelligent Theories and Their Applications; Hadjiski, M., Atanassov, K., Eds.; Studies in Computational Intelligence; Springer: Berlin, Germany, 2019; Volume 757, pp. 143–160.
22. Ikonomov, N.; Vassilev, P.; Roeva, O. ICrAData—Software for InterCriteria Analysis. Int. J. Bioautom. 2018, 22, 1–10.
23. Boehm, M.; Stürzebecher, J.; Klebe, G. Three-Dimensional Quantitative Structure-Activity Relationship Analyses Using Comparative Molecular Field Analysis and Comparative Molecular Similarity Indices Analysis to Elucidate Selectivity Differences of Inhibitors Binding to Trypsin, Thrombin, and Factor Xa. J. Med. Chem. 1999, 42, 458–477.
24. Licari, L.G.; Kovacic, J.P. Thrombin Physiology and Pathophysiology. J. Vet. Emerg. Crit. Care 2009, 19, 11–22.
25. Maloy, S.R.; Hughes, K.T. (Eds.) Brenner’s Encyclopedia of Genetics, 2nd ed.; Academic Press: Amsterdam, The Netherlands, 2013.
26. Kalinowsky, L.; Weber, J.; Balasupramaniam, S.; Baumann, K.; Proschak, E. A Diverse Benchmark Based on 3D Matched Molecular Pairs for Validating Scoring Functions. ACS Omega 2018, 3, 5704–5714.
27. Reulecke, I.; Lange, G.; Albrecht, J.; Klein, R.; Rarey, M. Towards an Integrated Description of Hydrogen Bonding and Dehydration: Decreasing False Positives in Virtual Screening with the HYDE Scoring Function. ChemMedChem 2008, 3, 885–897.
28. Schneider, N.; Lange, G.; Hindle, S.; Klein, R.; Rarey, M. A Consistent Description of HYdrogen Bond and DEhydration Energies in Protein–Ligand Complexes: Methods behind the HYDE Scoring Function. J. Comput. Aided Mol. Des. 2013, 27, 15–29.
29. Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. A Fast Flexible Docking Method Using an Incremental Construction Algorithm. J. Mol. Biol. 1996, 261, 470–489.
30. Koes, D.R.; Baumgartner, M.P.; Camacho, C.J. Lessons Learned in Empirical Scoring with Smina from the CSAR 2011 Benchmarking Exercise. J. Chem. Inf. Model. 2013, 53, 1893–1904.
31. Trott, O.; Olson, A.J. AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization, and Multithreading. J. Comput. Chem. 2010, 31, 455–461.
32. Molecular Operating Environment (MOE); Version 2020.09; Chemical Computing Group ULC: Montreal, QC, Canada, 2022.
33. Jones, G.; Willett, P.; Glen, R.C. Molecular Recognition of Receptor Sites Using a Genetic Algorithm with a Description of Desolvation. J. Mol. Biol. 1995, 245, 43–53.
34. Jones, G.; Willett, P.; Glen, R.C.; Leach, A.R.; Taylor, R. Development and Validation of a Genetic Algorithm for Flexible Docking. J. Mol. Biol. 1997, 267, 727–748.
35. Eldridge, M.D.; Murray, C.W.; Auton, T.R.; Paolini, G.V.; Mee, R.P. Empirical Scoring Functions: I. The Development of a Fast Empirical Scoring Function to Estimate the Binding Affinity of Ligands in Receptor Complexes. J. Comput. Aided Mol. Des. 1997, 11, 425–445.
36. Baxter, C.A.; Murray, C.W.; Clark, D.E.; Westhead, D.R.; Eldridge, M.D. Flexible Docking Using Tabu Search and an Empirical Estimate of Binding Affinity. Proteins 1998, 33, 367–382.
37. Mooij, W.T.M.; Verdonk, M.L. General and Targeted Statistical Potentials for Protein-Ligand Interactions. Proteins 2005, 61, 272–287.
38. Verdonk, M.L.; Berdini, V.; Hartshorn, M.J.; Mooij, W.T.M.; Murray, C.W.; Taylor, R.D.; Watson, P. Virtual Screening Using Protein−Ligand Docking: Avoiding Artificial Enrichment. J. Chem. Inf. Comput. Sci. 2004, 44, 793–806.
39. Boehm, H.-J. The Development of a Simple Empirical Scoring Function to Estimate the Binding Constant for a Protein-Ligand Complex of Known Three-Dimensional Structure. J. Comput. Aided Mol. Des. 1994, 8, 243–256.
Software Package/Scoring Function | Source |
---|---|
MOE | |
ASE is based on the Gaussian approximation and depends on the radii of the atoms and the distance between the ligand atom–receptor atom pairs. ASE is proportional to the sum of the Gaussians over all ligand atom–receptor atom pairs. | [32] |
Affinity dG is a linear function that calculates the enthalpy contribution to the binding free energy, including terms based on interactions between H-bond donor–acceptor pairs, hydrophobic and ionic interactions, metal ligation, unfavorable interactions (between hydrophobic and polar atoms), and favorable interactions (between any two atoms). | [26] |
Alpha HB is a linear combination of two terms: (i) the geometric fit of the ligand to the binding site taking into account the attraction and repulsion depending on the distance between the atoms; and (ii) H-bonding effects. | [26] |
London dG estimates the free energy of binding of the ligand, accounting for the average gain or loss of rotational and translational entropy; the loss of flexibility of the ligand; the geometric imperfections of H-bonds and metal ligations compared to the ideal ones; and the desolvation energy of atoms. | [26] |
GBVI/WSA dG estimates the free energy of binding of the ligand taking into account the weighted terms for the Coulomb energy, solvation energy, and van der Waals contributions. | [26] |
GOLD | |
GoldScore comprises the following terms: van der Waals and H-bonds energies between the protein and the ligand, and the internal van der Waals and torsional strain energies of the ligand. | [2,33,34] |
ChemScore incorporates terms for: the total free energy change upon ligand binding; a protein–ligand atom clash; and an internal energy term. It takes account of hydrophobic–hydrophobic contact area, H-bonds, ligand flexibility, and metal interactions. | [2,35,36] |
ASP (Astex Statistical Potential) is a statistical atom–atom potential generated from the statistical analysis of protein–ligand interactions found in the PDB. It considers the different occurrences of different atom types on protein molecules and incorporates volume corrections for protein atoms and ligand atoms. | [2,26,37,38] |
ChemPLP combines parameters from the ChemScore (distance and angle dependences of hydrogen and metal bonds) and PLP (piecewise linear potential) scoring functions (heavy-atom-collision and torsion potentials, covalent bond contributions, protein sidechain flexibility, and optional constraints). | [26] |
SeeSAR | |
FlexX applies an incremental construction method to split the ligands into fragments, positioning the fragment (or combinations thereof) into multiple places in the pocket, and scoring based on a simple fast pre-scoring scheme. The ligand is further built up from the fragments and the interim solutions are comparatively scored considering the hydrogen bonds, the ionic interactions, the lipophilic protein–ligand contact surface, and the number of rotatable bonds in the ligand. | [29,39] |
HYDE calculates the realistic free energies of binding by approximating affinities based on two major physical driving forces: atoms’ desolvation and interactions. | [27,28] |
AutoDock Vina | |
AutoDock Vina combines empirical scoring functions and knowledge-based potentials by extracting empirical information from the preferred conformational states of the receptor–ligand complexes, as well as of the experimental affinity measurements. The function consists of weighted terms for steric interactions (attraction and repulsion), hydrophobic interactions, H-bonds, and rotation. | [26,30,31] |
Protein Target | Affinity_dG | Alpha_HB | ASE | GBVI_WSA_dG | London_dG | GoldScore | ChemScore | ASP | ChemPLP | FlexX | HYDE | Vina |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Thrombin | 0.488 | 0.479 | 0.450 | 0.560 | 0.479 | 0.516 | 0.565 | 0.562 | 0.556 | 0.512 | 0.418 | 0.433 |
Trypsin | 0.369 | 0.404 | 0.358 | 0.538 | 0.489 | 0.690 | 0.613 | 0.612 | 0.660 | 0.364 | 0.397 | 0.347 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jereva, D.; Alov, P.; Tsakovska, I.; Angelova, M.; Atanassova, V.; Vassilev, P.; Ikonomov, N.; Atanassov, K.; Pajeva, I.; Pencheva, T. Application of InterCriteria Analysis to Assess the Performance of Scoring Functions in Molecular Docking Software Packages. Mathematics 2022, 10, 2549. https://doi.org/10.3390/math10152549