Prediction and Chemical Interpretation of Singlet-Oxygen-Scavenging Activity of Small Molecule Compounds by Using Machine Learning
Abstract
1. Introduction
2. Experiment
2.1. Preparing Dataset
2.2. Machine Learning Model
2.3. Feature Importance
3. Results
3.1. Prediction
3.2. Importance Analysis
4. Discussion
4.1. Prediction Accuracy
4.2. Importance Analysis
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Model | Dataset | Random_State | RMSE | R² | RMSE_LOO |
---|---|---|---|---|---|
XGBoost | descriptors | 0 | 1.2913 | 0.9147 | 1.7566 |
XGBoost | descriptors | 10 | 1.6867 | 0.8279 | |
XGBoost | descriptors | 100 | 2.0362 | 0.6968 | |
XGBoost | fingerprint | 0 | 1.0857 | 0.9397 | 1.7857 |
XGBoost | fingerprint | 10 | 2.7150 | 0.5542 | |
XGBoost | fingerprint | 100 | 3.1873 | 0.2325 | |
LightGBM | descriptors | 0 | 2.4106 | 0.7028 | 1.8723 |
LightGBM | descriptors | 10 | 1.8021 | 0.7546 | |
LightGBM | descriptors | 100 | 1.8021 | 0.7546 | |
LightGBM | fingerprint | 0 | 3.7413 | 0.2841 | 2.4055 |
LightGBM | fingerprint | 10 | 2.9596 | 0.4702 | |
LightGBM | fingerprint | 100 | 3.4717 | 0.0894 | |
CatBoost | descriptors | 0 | 2.0570 | 0.7836 | 1.7375 |
CatBoost | descriptors | 10 | 2.1533 | 0.7196 | |
CatBoost | descriptors | 100 | 2.4917 | 0.5309 | |
CatBoost | fingerprint | 0 | 2.1026 | 0.7739 | 2.5477 |
CatBoost | fingerprint | 10 | 2.3711 | 0.6600 | |
CatBoost | fingerprint | 100 | 3.6750 | −0.0200 | |
Random forest | descriptors | 0 | 1.0123 | 0.9476 | 1.5731 |
Random forest | descriptors | 10 | 1.9342 | 0.7737 | |
Random forest | descriptors | 100 | 1.9166 | 0.7225 | |
Random forest | fingerprint | 0 | 1.3950 | 0.9005 | 1.7613 |
Random forest | fingerprint | 10 | 1.2796 | 0.9163 | |
Random forest | fingerprint | 100 | 2.0136 | 0.7548 | |
AdaBoost | descriptors | 0 | 0.8141 | 0.9661 | 1.7006 |
AdaBoost | descriptors | 10 | 2.1277 | 0.7262 | |
AdaBoost | descriptors | 100 | 1.9945 | 0.6995 | |
AdaBoost | fingerprint | 0 | 0.9573 | 0.9531 | 1.6017 |
AdaBoost | fingerprint | 10 | 1.8246 | 0.7986 | |
AdaBoost | fingerprint | 100 | 3.8793 | −0.1370 | |
LASSO | descriptors | 0 | 2.1668 | 0.7599 | 2.3314 |
LASSO | descriptors | 10 | 2.4023 | 0.6510 | |
LASSO | descriptors | 100 | 5.5896 | −1.3605 | |
LASSO | fingerprint | 0 | 1.7224 | 0.8483 | 1.9853 |
LASSO | fingerprint | 10 | 3.0045 | 0.4540 | |
LASSO | fingerprint | 100 | 4.0502 | −0.2393 | |
DNN | descriptors | 0 | 2.0145 | 0.7924 | 2.9865 |
DNN | descriptors | 10 | 2.6807 | 0.5653 | |
DNN | descriptors | 100 | 3.6878 | −0.0275 | |
DNN | fingerprint | 0 | 1.8309 | 0.8286 | 3.3584 |
DNN | fingerprint | 10 | 3.3159 | 0.3350 | |
DNN | fingerprint | 100 | 3.4791 | 0.0855 | |
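
Several entries in the table above report a negative R² (for example, −1.3605 for LASSO on descriptors with random state 100). Under the standard definition, R² = 1 − SS_res / SS_tot, this happens whenever a model predicts the held-out data worse than a constant equal to the mean of the observed values. The following is a minimal sketch of the metrics, assuming the standard definitions of RMSE and R² and assuming the last column reports RMSE under leave-one-out cross-validation. The helper names, the toy data, and the mean-value baseline are hypothetical illustrations and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def loo_rmse(X, y, fit_predict):
    """Leave-one-out RMSE: each sample is predicted from all other samples.

    fit_predict(X_train, y_train, x_test) is a hypothetical callback that
    trains on the remaining samples and returns one prediction.
    """
    preds = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(fit_predict(X_train, y_train, X[i]))
    return rmse(y, preds)


def mean_baseline(X_train, y_train, x_test):
    """Toy model (assumption): ignore features, predict the training mean."""
    return sum(y_train) / len(y_train)


# Toy data, invented for illustration only.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0, 4.0]
good_pred = [1.1, 1.9, 3.2, 3.8]   # close to the observations
bad_pred = [4.0, 3.0, 2.0, 1.0]    # anti-correlated with the observations

r2_good = r2(y, good_pred)
r2_bad = r2(y, bad_pred)           # negative: worse than predicting the mean
loo = loo_rmse(X, y, mean_baseline)
print(r2_good, r2_bad, loo)
```

Any regressor can be substituted for `mean_baseline` as long as it follows the same callback shape; the point of the sketch is only that R² is unbounded below, so small negative values indicate a fit roughly as poor as the mean predictor and large negative values indicate a worse one.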
Descriptor | Score | Fingerprint | Score |
---|---|---|---|
HOMO | 175.5 | 1515 | 148.5 |
HOMO–LUMO gap | 109 | 1722 | 118.5 |
SlogP_VSA4 | 84.5 | 926 | 93.5 |
SlogP_VSA2 | 72.5 | 807 | 87.5 |
SlogP_VSA6 | 72 | 1356 | 80 |
PEOE_VSA7 | 62 | 252 | 58.5 |
BalabanJ | 44 | 1380 | 35.5 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fujimoto, T.; Gotoh, H. Prediction and Chemical Interpretation of Singlet-Oxygen-Scavenging Activity of Small Molecule Compounds by Using Machine Learning. Antioxidants 2021, 10, 1751. https://doi.org/10.3390/antiox10111751