MSPEDTI: Prediction of Drug–Target Interactions via Molecular Structure with Protein Evolutionary Information
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Gold-Standard Datasets
2.2. Drug Structure Characterization
2.3. Target Protein Characterization
2.4. Feature Extraction
2.5. Classification Prediction
3. Results
3.1. Evaluation Indicators
3.2. Assessment of Performance
3.3. Comparison of Different Descriptor Models
3.4. Comparison with Different Classifier Models
3.5. Comparison with Previous Approaches
3.6. Case Studies
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Mamoshina, P.; Volosnikova, M.; Ozerov, I.V.; Putin, E.; Skibina, E.; Cortese, F.; Zhavoronkov, A. Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification. Front. Genet. 2018, 9, 242.
- Xuan, P.; Sun, C.; Zhang, T.; Ye, Y.; Shen, T.; Dong, Y. Gradient boosting decision tree-based method for predicting interactions between target genes and drugs. Front. Genet. 2019, 10, 459.
- Landry, Y.; Gies, J.-P. Drugs and their molecular targets: An updated overview. Fundam. Clin. Pharmacol. 2008, 22, 1–18.
- Yamanishi, Y.; Kotera, M.; Kanehisa, M.; Goto, S. Drug-target interaction prediction from chemical, genomic and pharmacological data in an integrated framework. Bioinformatics 2010, 26, i246–i254.
- Wang, L.; You, Z.H.; Chen, X.; Li, J.Q.; Yan, X.; Zhang, W.; Huang, Y.A. An ensemble approach for large-scale identification of protein-protein interactions using the alignments of multiple sequences. Oncotarget 2017, 8, 5149.
- Zhu, S.; Bing, J.; Min, X.; Lin, C.; Zeng, X. Prediction of drug–gene interaction by using Metapath2vec. Front. Genet. 2018, 9, 248.
- Wang, L.; You, Z.-H.; Zhou, X.; Yan, X.; Li, H.-Y.; Huang, Y.-A. NMFCDA: Combining randomization-based neural network with non-negative matrix factorization for predicting CircRNA-disease association. Appl. Soft Comput. 2021, 110, 107629.
- Wang, L.; Yan, X.; You, Z.-H.; Zhou, X.; Li, H.-Y.; Huang, Y.-A. SGANRDA: Semi-supervised generative adversarial networks for predicting circRNA–disease associations. Brief. Bioinform. 2021, 22, bbab028.
- Casañola-Martin, G.M.; Marrero-Ponce, Y.; Khan, M.T.H.; Khan, S.B.; Torrens, F.; Pérez-Jiménez, F.; Rescigno, A.; Abad, C. Bond-Based 2D Quadratic Fingerprints in QSAR Studies: Virtual and In vitro Tyrosinase Inhibitory Activity Elucidation. Chem. Biol. Drug Des. 2010, 76, 538–545.
- Kar, S.; Roy, K. Development and validation of a robust QSAR model for prediction of carcinogenicity of drugs. Indian J. Biochem. Biophys. 2011, 48, 111–122.
- Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. A fast flexible docking method using an incremental construction algorithm. J. Mol. Biol. 1996, 261, 470–489.
- Wallach, I.; Jaitly, N.; Nguyen, K.; Schapira, M.; Lilien, R. Normalizing molecular docking rankings using virtually generated decoys. J. Chem. Inf. Modeling 2011, 51, 1817.
- Wang, L.; You, Z.H.; Chen, X.; Yan, X.; Liu, G.; Zhang, W. RFDT: A Rotation Forest-based Predictor for Predicting Drug-Target Interactions Using Drug Structure and Protein Sequence Information. Curr. Protein Pept. Sci. 2018, 19, 445–454.
- Zhao, L.; Wang, J.; Pang, L.; Liu, Y.; Zhang, J. GANsDTA: Predicting Drug-Target Binding Affinity Using GANs. Front. Genet. 2019, 10, 1243.
- Yang, X.; Kui, L.; Tang, M.; Li, D.; Wei, K.; Chen, W.; Miao, J.; Dong, Y. High-throughput transcriptome profiling in drug and biomarker discovery. Front. Genet. 2020, 11, 19.
- Wang, L.; You, Z.-H.; Huang, D.-S.; Li, J.-Q. MGRCDA: Metagraph Recommendation Method for Predicting CircRNA-Disease Association. In IEEE Transactions on Cybernetics; IEEE: Piscataway, NJ, USA, 2021; pp. 1–9.
- Wang, L.; You, Z.-H.; Li, J.-Q.; Huang, Y.-A. IMS-CDA: Prediction of CircRNA-Disease Associations From the Integration of Multisource Similarity Information With Deep Stacked Autoencoder Model. In IEEE Transactions on Cybernetics; IEEE: Piscataway, NJ, USA, 2020.
- Li, H.-Y.; You, Z.-H.; Wang, L.; Yan, X.; Li, Z.-W. DF-MDA: An effective diffusion-based computational model for predicting miRNA-disease association. Mol. Ther. 2021, 29, 1501–1511.
- Lan, W.; Wang, J.; Li, M.; Wu, F.-X.; Pan, Y. Predicting drug-target interaction based on sequence and structure information. IFAC-PapersOnLine 2015, 48, 12–16.
- Cao, D.-S.; Liu, S.; Xu, Q.-S.; Lu, H.-M.; Huang, J.-H.; Hu, Q.-N.; Liang, Y.-Z. Large-scale prediction of drug–target interactions using protein sequences and drug topological structures. Anal. Chim. Acta 2012, 752, 1–10.
- Yamanishi, Y.; Araki, M.; Gutteridge, A.; Honda, W.; Kanehisa, M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 2008, 24, I232–I240.
- Schomburg, I.; Chang, A.; Ebeling, C.; Gremse, M.; Heldt, C.; Huhn, G.; Schomburg, D. BRENDA, the enzyme database: Updates and major new developments. Nucleic Acids Res. 2004, 32, D431–D433.
- Kanehisa, M.; Goto, S.; Hattori, M.; Aoki-Kinoshita, K.F.; Itoh, M.; Kawashima, S.; Katayama, T.; Araki, M.; Hirakawa, M. From genomics to chemical genomics: New developments in KEGG. Nucleic Acids Res. 2006, 34, D354–D357.
- Kanehisa, M.; Goto, S.; Furumichi, M.; Tanabe, M.; Hirakawa, M. KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res. 2009, 38 (Suppl. 1), D355–D360.
- Gunther, S.; Kuhn, M.; Dunkel, M.; Campillos, M.; Senger, C.; Petsalaki, E.; Ahmed, J.; Urdiales, E.G.; Gewiess, A.; Jensen, L.J.; et al. SuperTarget and Matador: Resources for exploring drug-target relationships. Nucleic Acids Res. 2008, 36, D919–D922.
- Wishart, D.S.; Knox, C.; Guo, A.C.; Cheng, D.; Shrivastava, S.; Tzur, D.; Gautam, B.; Hassanali, M. DrugBank: A knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008, 36, D901–D906.
- Jones, D.T. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 1999, 292, 195–202.
- Chen, X.-W.; Jeong, J.C. Sequence-based prediction of protein interaction sites with an integrative method. Bioinformatics 2009, 25, 585–591.
- Jones, D.T.; Ward, J.J. Prediction of disordered regions in proteins from position specific score matrices. Proteins Struct. Funct. Bioinform. 2003, 53, 573–578.
- Gao, Z.G.; Wang, L.; Xia, S.X.; You, Z.H.; Yan, X.; Zhou, Y. Ens-PPI: A Novel Ensemble Classifier for Predicting the Interactions of Proteins Using Autocovariance Transformation from PSSM. Biomed Res. Int. 2016, 2016, 8.
- Wang, L.; You, Z.-H.; Xia, S.-X.; Chen, X.; Yan, X.; Zhou, Y.; Liu, F. An improved efficient rotation forest algorithm to predict the interactions among proteins. Soft Comput. 2017, 22, 3373–3381.
- Altschul, S.F.; Madden, T.L.; Schaffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997, 25, 3389–3402.
- Huang, G.B.; Wang, D.H.; Lan, Y. Extreme learning machines: A survey. Int. J. Mach. Learn. Cybern. 2011, 2, 107–122.
- Wang, L.; You, Z.-H.; Yan, X.; Xia, S.-X.; Liu, F.; Li, L.-P.; Zhang, W.; Zhou, Y. Using Two-dimensional Principal Component Analysis and Rotation Forest for Prediction of Protein-Protein Interactions. Sci. Rep. 2018, 8, 12874.
- Ghadermarzi, S.; Li, X.; Li, M.; Kurgan, L. Sequence-Derived Markers of Drug Targets and Potentially Druggable Human Proteins. Front. Genet. 2019, 10, 1075.
- Yang, J.; Zhang, D.; Frangi, A.F.; Yang, J.Y. Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 131–137.
- Cao, D.-S.; Liang, Y.-Z.; Xu, Q.-S.; Hu, Q.-N.; Zhang, L.-X.; Fu, G.-H. Exploring nonlinear relationships in chemical data using kernel-based methods. Chemom. Intell. Lab. Syst. 2011, 107, 106–115.
- Cao, D.-S.; Xu, Q.-S.; Liang, Y.-Z.; Chen, X.; Li, H.-D. Prediction of aqueous solubility of druglike organic compounds using partial least squares, back-propagation network and support vector machine. J. Chemom. 2010, 24, 584–595.
- Cheng, F.; Liu, C.; Jiang, J.; Lu, W.; Li, W.; Liu, G.; Zhou, W.; Huang, J.; Tang, Y. Prediction of Drug-Target Interactions and Drug Repositioning via Network-Based Inference. PLoS Comput. Biol. 2012, 8, e1002503.
- Gonen, M. Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics 2012, 28, 2304–2310.
- Temerinac-Ott, M.; Naik, A.W.; Murphy, R.F. Deciding when to stop: Efficient experimentation to learn to predict drug-target interactions. BMC Bioinform. 2015, 16, 1–10.
- Öztürk, H.; Ozkirimli, E.; Özgür, A. A comparative study of SMILES-based compound similarity functions for drug-target interaction prediction. BMC Bioinform. 2016, 17, 1–11.
- Van, L.T.; Marchiori, E. Predicting Drug-Target Interactions for New Drug Compounds Using a Weighted Nearest Neighbor Profile. PLoS ONE 2013, 8, e66952.
- Chen, H.; Zhang, Z. A Semi-Supervised Method for Drug-Target Interaction Prediction with Consistency in Networks. PLoS ONE 2013, 8, e62975.

Statistics of the four gold-standard datasets.

Dataset | Target Proteins | Drugs | Interactions | Sparsity |
---|---|---|---|---|
Enzymes | 664 | 445 | 2926 | 0.0099 |
Ion Channels | 204 | 210 | 1476 | 0.0344 |
GPCRs | 95 | 223 | 635 | 0.0299 |
Nuclear Receptors | 26 | 54 | 90 | 0.0641 |
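
The Sparsity column is consistent with the fraction of all possible drug–target pairs that are known interactions. The following is a minimal sketch under that assumption; the function name `sparsity` is illustrative and does not come from the paper.

```python
def sparsity(n_targets: int, n_drugs: int, n_interactions: int) -> float:
    """Fraction of all possible drug-target pairs that are known interactions.

    Assumed definition: interactions / (targets * drugs).
    """
    return n_interactions / (n_targets * n_drugs)


# Counts taken from the dataset table above.
enzymes = round(sparsity(664, 445, 2926), 4)
nuclear_receptors = round(sparsity(26, 54, 90), 4)

print(enzymes)            # 0.0099
print(nuclear_receptors)  # 0.0641
```

Values this small mean that known interactions are a small minority of all possible pairs, which is why negative samples have to be chosen with care when building a balanced training set.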

Five-fold cross-validation results of MSPEDTI on the Enzymes dataset.

Test Set | Accu. (%) | Sen. (%) | Prec. (%) | MCC (%) | AUC (%) |
---|---|---|---|---|---|
1 | 94.87 | 91.23 | 98.75 | 90.04 | 95.12 |
2 | 94.27 | 93.14 | 95.26 | 88.57 | 94.77 |
3 | 93.85 | 89.80 | 97.78 | 87.99 | 94.32 |
4 | 94.02 | 93.07 | 94.71 | 88.04 | 93.98 |
5 | 93.94 | 92.33 | 95.15 | 87.91 | 93.68 |
Average | 94.19 ± 0.41 | 91.91 ± 1.41 | 96.33 ± 1.81 | 88.51 ± 0.89 | 94.37 ± 0.59 |
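
The Average row can be reproduced from the per-fold values. The sketch below assumes the ± figure is the sample standard deviation (denominator n − 1), which matches the accuracy column of the table above; the variable names are illustrative.

```python
from statistics import mean, stdev

# Per-fold accuracy (%) copied from the table above.
fold_accuracy = [94.87, 94.27, 93.85, 94.02, 93.94]

avg = mean(fold_accuracy)
sd = stdev(fold_accuracy)  # sample standard deviation (n - 1 denominator)

print(f"{avg:.2f} ± {sd:.2f}")  # 94.19 ± 0.41
```

Using the population standard deviation instead (`statistics.pstdev`) would give 0.37 for this column, so the reported 0.41 points to the sample form.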

Five-fold cross-validation results of MSPEDTI on the Ion Channels dataset.

Test Set | Accu. (%) | Sen. (%) | Prec. (%) | MCC (%) | AUC (%) |
---|---|---|---|---|---|
1 | 90.17 | 88.44 | 91.55 | 80.38 | 89.99 |
2 | 89.83 | 90.70 | 89.51 | 79.65 | 90.14 |
3 | 92.20 | 90.26 | 94.56 | 84.50 | 91.66 |
4 | 90.51 | 91.86 | 89.44 | 81.05 | 90.46 |
5 | 92.06 | 90.27 | 93.73 | 84.18 | 92.15 |
Average | 90.95 ± 1.10 | 90.31 ± 1.23 | 91.76 ± 2.36 | 81.95 ± 2.24 | 90.88 ± 0.97 |

Five-fold cross-validation results of MSPEDTI on the GPCRs dataset.

Test Set | Accu. (%) | Sen. (%) | Prec. (%) | MCC (%) | AUC (%) |
---|---|---|---|---|---|
1 | 86.61 | 92.68 | 82.01 | 73.89 | 85.37 |
2 | 89.76 | 95.74 | 87.10 | 79.53 | 91.90 |
3 | 88.98 | 95.58 | 82.44 | 78.82 | 88.46 |
4 | 88.19 | 92.86 | 84.78 | 76.74 | 89.39 |
5 | 86.22 | 93.94 | 82.12 | 73.07 | 85.00 |
Average | 87.95 ± 1.51 | 94.16 ± 1.45 | 83.69 ± 2.22 | 76.41 ± 2.88 | 88.02 ± 2.88 |

Five-fold cross-validation results of MSPEDTI on the Nuclear Receptors dataset.

Test Set | Accu. (%) | Sen. (%) | Prec. (%) | MCC (%) | AUC (%) |
---|---|---|---|---|---|
1 | 91.67 | 86.96 | 100.00 | 84.05 | 94.98 |
2 | 80.56 | 85.71 | 70.59 | 61.51 | 84.74 |
3 | 88.89 | 85.00 | 94.44 | 78.26 | 85.63 |
4 | 83.33 | 83.33 | 83.33 | 66.67 | 83.02 |
5 | 86.11 | 86.67 | 81.25 | 71.81 | 84.76 |
Average | 86.11 ± 4.39 | 85.53 ± 1.45 | 85.92 ± 11.56 | 72.46 ± 8.97 | 86.63 ± 4.77 |
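
The column headers Accu., Sen., Prec. and MCC correspond to standard binary-classification indicators. The sketch below uses their textbook definitions from confusion-matrix counts; it assumes the paper follows these definitions (the text of Section 3.1 is not part of this excerpt), and the function name and the example counts are hypothetical. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity, precision and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    precision = tp / (tp + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not taken from the paper).
example = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in example.items():
    print(f"{name}: {100 * value:.2f}%")
```

MCC uses all four cells of the confusion matrix, so it stays informative when one class dominates. That makes it a useful complement to accuracy on small datasets such as Nuclear Receptors, where fold-to-fold variation is large.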
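
Five-fold cross-validation results of the comparison descriptor model, with the MSPEDTI average on the Ion Channels dataset in the last row.

Test Set | Accu. (%) | Sen. (%) | Prec. (%) | MCC (%) | AUC (%) |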
---|---|---|---|---|---|
1 | 84.75 | 84.90 | 84.90 | 69.49 | 86.41 |
2 | 82.03 | 82.31 | 80.00 | 64.02 | 81.24 |
3 | 82.37 | 82.84 | 82.84 | 64.72 | 83.35 |
4 | 80.68 | 84.23 | 78.93 | 61.47 | 81.22 |
5 | 82.77 | 82.00 | 83.67 | 65.56 | 83.12 |
Average | 82.52 ± 1.47 | 83.26 ± 1.25 | 82.07 ± 2.52 | 65.05 ± 2.91 | 83.07 ± 2.12 |
MSPEDTI | 90.95 ± 1.10 | 90.31 ± 1.23 | 91.76 ± 2.36 | 81.95 ± 2.24 | 90.88 ± 0.97 |
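
Five-fold cross-validation results of the comparison classifier model, with the MSPEDTI average on the Ion Channels dataset in the last row.

Test Set | Accu. (%) | Sen. (%) | Prec. (%) | MCC (%) | AUC (%) |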
---|---|---|---|---|---|
1 | 85.76 | 90.14 | 81.42 | 71.81 | 85.08 |
2 | 85.93 | 89.04 | 82.70 | 71.94 | 87.90 |
3 | 85.76 | 87.34 | 84.04 | 71.46 | 84.80 |
4 | 86.61 | 89.49 | 83.73 | 73.34 | 87.10 |
5 | 88.34 | 89.26 | 87.41 | 76.70 | 88.33 |
Average | 86.48 ± 1.10 | 89.05 ± 1.04 | 83.86 ± 2.24 | 73.05 ± 2.16 | 86.64 ± 1.62 |
MSPEDTI | 90.95 ± 1.10 | 90.31 ± 1.23 | 91.76 ± 2.36 | 81.95 ± 2.24 | 90.88 ± 0.97 |

AUC (%) of MSPEDTI and previous approaches on the four datasets.

Method | Enzymes | Ion Channels | GPCRs | Nuclear Receptors |
---|---|---|---|---|
SIMCOMP | 86.30 | 77.60 | 86.70 | 85.60 |
NLCS | 83.70 | 75.30 | 85.30 | 81.50 |
Temerinac-Ott | 83.20 | 79.90 | 85.70 | 82.40 |
Yamanishi | 82.10 | 69.20 | 81.10 | 81.40 |
KBMF2K | 83.20 | 79.90 | 85.70 | 82.40 |
WNN-GIP | 86.10 | 77.50 | 87.20 | 83.90 |
DBSI | 80.75 | 80.29 | 80.22 | 75.78 |
NetCBP | 82.51 | 80.34 | 82.35 | 83.94 |
MSPEDTI | 94.37 | 90.88 | 88.02 | 86.63 |

Case-study drug–target pairs and their validation source.

Drug ID | Drug Name | Target Protein ID | Target Protein Name | Validation Source |
---|---|---|---|---|
D00951 | Medroxyprogesterone acetate | hsa2099 | ESR1_HUMAN | SuperTarget |
D00542 | Bromochlorotrifluoroethane | hsa1571 | CP2E1_HUMAN | SuperTarget |
D03365 | Transdermal Nicotine | hsa1137 | ACHA4_HUMAN | SuperTarget |
D00049 | Nikotinsaeure | hsa8843 | G109B_HUMAN | SuperTarget |
D00160 | Epsilcapramine | hsa7298 | TYSY_HUMAN | unconfirmed |
D00771 | Chlorzoxazone | hsa1374 | CPT1A_HUMAN | unconfirmed |
D00139 | Xanthotoxine | hsa1543 | CP1A1_HUMAN | SuperTarget |
D00964 | Letrozole | hsa1215 | CMA1_HUMAN | unconfirmed |
D00585 | Mifepristone | hsa2099 | ESR1_HUMAN | SuperTarget |
D00437 | Nifedipine Monohydrochloride | hsa1559 | CP2C9_HUMAN | SuperTarget |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, L.; Wong, L.; Chen, Z.-H.; Hu, J.; Sun, X.-F.; Li, Y.; You, Z.-H. MSPEDTI: Prediction of Drug–Target Interactions via Molecular Structure with Protein Evolutionary Information. Biology 2022, 11, 740. https://doi.org/10.3390/biology11050740