A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability
School of Life Science and State Key Laboratory of Agrobiotechnology, G94, Science Center South Block, The Chinese University of Hong Kong, Shatin 999077, Hong Kong
* Author to whom correspondence should be addressed.
Molecules 2019, 24(13), 2414; https://doi.org/10.3390/molecules24132414
Received: 8 June 2019 / Revised: 28 June 2019 / Accepted: 29 June 2019 / Published: 30 June 2019
Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient on such problems, where the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. Here we describe LBS (local beta screening), a machine learning algorithm for ligand-based virtual screening. The distinguishing characteristic of LBS is that it quantifies the generalization ability of screening directly through a refined loss function. It can therefore assess the risk of over-fitting accurately and efficiently for imbalanced, high-dimensional data without resorting to resampling methods such as cross-validation. The robustness of LBS was demonstrated in a simulation study and in tests on real datasets, in which LBS outperformed conventional algorithms in both screening accuracy and model interpretation. LBS was then used to screen an independent compound library for potential activators of HIV-1 integrase multimerization, and the virtual screening result was validated experimentally. Of the 25 compounds tested, six proved to be active. The most potent compound in the experimental validation had an EC50 of 0.71 µM.
Keywords: local beta screening; ligand-based virtual screening; machine learning; generalization ability; HIV-1 integrase
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
- Supplementary File 1: ZIP-Document (ZIP, 407 KiB)
- Externally hosted supplementary file 1
  Link: https://zenodo.org/record/3241840#.XPvoLdszYdU
  Description: Figure S1: The detailed procedure of LBS; Figure S2: Comparison of LBS and molecular docking on the dataset for identifying activators of HIV-1 integrase multimerization; Table S1: Dose-response data and SMILES of the compounds used in the validation experiment.
MDPI and ACS Style
Dai, W.; Guo, D. A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability. Molecules 2019, 24, 2414.
AMA Style
Dai W, Guo D. A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability. Molecules. 2019; 24(13):2414.
Chicago/Turabian Style
Dai, Weixing; Guo, Dianjing. 2019. "A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability" Molecules 24, no. 13: 2414.