From Patterns to Pills: How Informatics Is Shaping Medicinal Chemistry
Abstract
1. Approaches to Drug Discovery and Development
1.1. Classical Drug Discovery
1.2. The Role of Biological Functional Assays in Modern Drug Discovery
- Baricitinib, a repurposed JAK inhibitor identified by BenevolentAI’s machine learning (ML) algorithm as a candidate for COVID-19, required extensive in vitro and clinical validation to confirm its antiviral and anti-inflammatory effects, ultimately supporting its emergency use authorization [9].
- Halicin is a novel antibiotic discovered using a neural network trained on a dataset of molecules with known antibacterial properties, which enabled the model to identify compounds with potential activity against Escherichia coli. Although the compound’s antibacterial potential was flagged computationally, biological assays were crucial to confirming its broad-spectrum efficacy, including activity against multidrug-resistant pathogens in both in vitro and in vivo models [10].
- Vemurafenib, a BRAF inhibitor for melanoma, was initially identified via high-throughput in silico screening targeting the BRAF (V600E)-mutant kinase. Its computational promise was validated through cellular assays measuring ERK phosphorylation and tumor cell proliferation [13], ultimately guiding SAR efforts to enhance potency and reduce off-target effects.
1.3. Rational Drug Design in Scaffold-Centric Medicinal Chemistry
1.4. Medicinal Chemistry in the Big Data Era
2. Improving the Decision-Making Process in Drug Development
2.1. The Ability to Predict
2.2. Man vs. Machine
3. Machine Decision Making
3.1. Chemical and Visual Descriptors
3.2. Reducing the Complexity of a Molecule
3.3. The Informacophore and Inverse Cheminformatics
Model Type | Chemical Purpose | Key Feature | Referenced Studies |
---|---|---|---|
VAE (Variational Autoencoder) | Encode molecules into a continuous latent space and decode to generate novel chemically meaningful structures | Smooth latent space allows interpolation and optimization of molecular properties | Gómez-Bombarelli et al. [75] |
GAN (Generative Adversarial Network) | Learn to generate valid molecular structures by training a generator against a discriminator | Adversarial training enables generation of novel molecules with predefined bioactivity profiles | Zhavoronkov et al. * [28] Kadurin et al. [76] |
cRNN (Conditional Recurrent Neural Network) | Learn physicochemical or structural characteristics to generate molecules conditioned on desired properties | Sequential generation of molecules guided by learned property constraints | Kotsias et al. [77] Mohapatra et al. [78] |
Flow-based Neural Network | Model the exact likelihood of molecular data for reversible generation | Enables bidirectional mapping between molecule and latent space with tractable likelihoods | Hu [79] |
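The "smooth latent space" property listed for VAEs in the table above can be illustrated with a minimal sketch. It assumes two latent vectors have already been produced by a trained encoder; the vectors `z_a` and `z_b` and the function name `interpolate_latent` are hypothetical, and no encoder or decoder is implemented here. The code only computes evenly spaced points on the straight line between the two vectors; in a real workflow each point would be passed to a decoder to obtain a candidate structure.

```python
from typing import List, Sequence


def interpolate_latent(
    z_start: Sequence[float], z_end: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` evenly spaced points from z_start to z_end, endpoints included."""
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimensionality")
    if steps < 2:
        raise ValueError("need at least two steps to include both endpoints")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight, from 0.0 to 1.0
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Hypothetical 3-dimensional latent codes standing in for two encoded molecules.
z_a = [0.0, 1.0, -2.0]
z_b = [4.0, -1.0, 2.0]
path = interpolate_latent(z_a, z_b, steps=5)

for point in path:
    print(point)
```

Linear interpolation is only one possible choice; it is used here because it is the simplest way to show what moving through a continuous latent space means.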
4. Outlook
- The first involves FBDD and make-on-demand compound libraries, which will enable virtual searches for bioactive compounds rather than relying on traditional HTS. This shift will streamline the drug discovery process, reducing the need for the extensive physical screening that has dominated the past 25 years [83].
- The second route focuses on advancements in molecular representations. Because traditional representations such as SMILES strings and molecular graphs fail to capture the complexity of large molecules [84], more sophisticated representations are needed to improve the accuracy of property predictions and virtual screening. These developments will enhance the performance of ML algorithms and help produce more reliable and precise drug discovery outcomes.
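For readers less familiar with SMILES, the sketch below shows how such a string breaks down into a flat sequence of atom, bond, branch, and ring-closure tokens, the form in which string-based models typically consume it. The regular expression is a simplified assumption that covers only common organic-subset symbols, not the full SMILES grammar, and the function name `tokenize_smiles` is hypothetical.

```python
import re
from typing import List

# Simplified token pattern (an assumption, not the complete SMILES grammar):
# bracket atoms, two-letter halogens, one-letter atoms, aromatic atoms,
# two-digit ring closures, single-digit ring closures, bonds and branches.
_SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[()=#\-+\\/:.~@*$]"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = _SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported characters in SMILES: {smiles!r}")
    return tokens


aspirin = "CC(=O)Oc1ccccc1C(=O)O"
tokens = tokenize_smiles(aspirin)
print(len(tokens), tokens)
```

The token sequence is strictly linear, so ring closures and branches are encoded only through paired digits and parentheses. This is one way to see why string length and bookkeeping grow quickly for large molecules.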
Author Contributions
Funding
Conflicts of Interest
References
- Umashankar, V.; Gurunathan, S. Drug discovery: An appraisal. Int. J. Pharm. Pharm. Sci. 2015, 7, 59–66.
- Chan, H.S.; Shan, H.; Dahoun, T.; Vogel, H.; Yuan, S. Advancing drug discovery via artificial intelligence. Trends Pharmacol. Sci. 2019, 40, 592–604.
- Dhudum, R.; Ganeshpurkar, A.; Pawar, A. Revolutionizing Drug Discovery: A Comprehensive Review of AI Applications. Drugs Drug Candidates 2024, 3, 148–171.
- Zhou, G.; Rusnac, D.V.; Park, H.; Canzani, D.; Nguyen, H.M.; Stewart, L.; Bush, M.F.; Nguyen, P.T.; Wulff, H.; Yarov-Yarovoy, V.; et al. An artificial intelligence accelerated virtual screening platform for drug discovery. Nat. Commun. 2024, 15, 7761.
- Bunnage, M.E.; Chekler, E.L.; Jones, L.H. Target validation using chemical probes. Nat. Chem. Biol. 2013, 9, 195–199.
- Moffat, J.G.; Vincent, F.; Lee, J.A.; Eder, J.; Prunotto, M. Opportunities and challenges in phenotypic drug discovery: An industry perspective. Nat. Rev. Drug Discov. 2017, 16, 531–543.
- Mak, K.K.; Pichika, M.R. Artificial intelligence in drug development: Present status and future prospects. Drug Discov. Today 2019, 24, 773–780.
- Swinney, D.C.; Anthony, J. How were new medicines discovered? Nat. Rev. Drug Discov. 2011, 10, 507–519.
- Richardson, P.; Griffin, I.; Tucker, C.; Smith, D.; Oechsle, O.; Phelan, A.; Rawling, M.; Savory, E.; Stebbing, J. Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. Lancet 2020, 395, e30–e31.
- Stokes, J.M.; Yang, K.; Swanson, K.; Jin, W.; Cubillos-Ruiz, A.; Donghia, N.M.; MacNair, C.R.; French, S.; Carfrae, L.A.; Bloom-Ackermann, Z. A deep learning approach to antibiotic discovery. Cell 2020, 180, 688–702.e13.
- Gordon, D.E.; Jang, G.M.; Bouhaddou, M.; Xu, J.; Obernier, K.; White, K.M.; O’Meara, M.J.; Rezelj, V.V.; Guo, J.Z.; Swaney, D.L. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature 2020, 583, 459–468.
- Sugiyama, M.G.; Cui, H.; Redka, D.S.; Karimzadeh, M.; Rujas, E.; Maan, H.; Hayat, S.; Cheung, K.; Misra, R.; McPhee, J.B.; et al. Multiscale interactome analysis coupled with off-target drug predictions reveals drug repurposing candidates for human coronavirus disease. Sci. Rep. 2021, 11, 23315.
- Bollag, G.; Hirth, P.; Tsai, J.; Zhang, J.; Ibrahim, P.N.; Cho, H.; Spevak, W.; Zhang, C.; Zhang, Y.; Habets, G.; et al. Clinical efficacy of a RAF inhibitor needs broad target blockade in BRAF-mutant melanoma. Nature 2010, 467, 596–599.
- Poelking, C.; Chessari, G.; Murray, C.W.; Hall, R.J.; Colwell, L.; Verdonk, M. Meaningful machine learning models and machine-learned pharmacophores from fragment screening campaigns. arXiv 2022, arXiv:2204.06348.
- Rodríguez-Pérez, R.; Bajorath, J. Explainable machine learning for property predictions in compound optimization: Miniperspective. J. Med. Chem. 2021, 64, 17744–17752.
- Batool, M.; Ahmad, B.; Choi, S. A Structure-Based Drug Discovery Paradigm. Int. J. Mol. Sci. 2019, 20, 2783.
- Adam, M. Integrating research and development: The emergence of rational drug design in the pharmaceutical industry. Stud. Hist. Philos. Biol. Biomed. Sci. 2005, 36, 513–537.
- Gambardella, A. Science and innovation: The US pharmaceutical industry during the 1980s; Cambridge University Press: Cambridge, UK, 1995.
- Andricopulo, A.D.; Montanari, C.A. Structure-activity relationships for the design of small-molecule inhibitors. Mini Rev. Med. Chem. 2005, 5, 585–593.
- Langmuir, I. Isomorphism, Isosterism and Covalence. J. Am. Chem. Soc. 1919, 41, 1543–1559.
- Meanwell, N.A. The Design and Application of Bioisosteres in Drug Design. Burger’s Med. Chem. Drug Discov. 2021, 1–81.
- Simon, H.A.; Newell, A. Human problem solving: The state of the theory in 1970. Am. Psychol. 1971, 26, 145.
- Enamine. Available online: https://enamine.net/compound-collections/real-compounds (accessed on 25 April 2025).
- CHEMriya. Available online: https://www.otavachemicals.com/products/chemriya (accessed on 25 April 2025).
- Hert, J.; Irwin, J.J.; Laggner, C.; Keiser, M.J.; Shoichet, B.K. Quantifying biogenic bias in screening libraries. Nat. Chem. Biol. 2009, 5, 479–483.
- Lyu, J.; Irwin, J.J.; Shoichet, B.K. Modeling the expansion of virtual screening libraries. Nat. Chem. Biol. 2023, 19, 712–718.
- Kahneman, D.; Tversky, A. Prospect theory: An analysis of decision under risk. In Handbook of the Fundamentals of Financial Decision Making: Part I; World Scientific: Singapore, 2013; pp. 99–127.
- Zhavoronkov, A.; Aladinskiy, V.; Zhebrak, A.; Zagribelnyy, B.; Terentiev, V.; Bezrukov, D.S.; Polykovskiy, D.; Shayakhmetov, R.; Filimonov, A.; Orekhov, P.; et al. Potential 2019-nCoV 3C-like protease inhibitors designed using generative deep learning approaches. Preprint v2 from ChemRxiv 2020.
- Hansch, C.; Maloney, P.P.; Fujita, T.; Muir, R.M. Correlation of biological activity of phenoxyacetic acids with Hammett substituent constants and partition coefficients. Nature 1962, 194, 178–180.
- Verma, J.; Khedkar, V.M.; Coutinho, E.C. 3D-QSAR in drug design--a review. Curr. Top. Med. Chem. 2010, 10, 95–115.
- Tropsha, A.; Isayev, O.; Varnek, A.; Schneider, G.; Cherkasov, A. Integrating QSAR modelling and deep learning in drug discovery: The emergence of deep QSAR. Nat. Rev. Drug Discov. 2024, 23, 141–155.
- Rufa, D.A.; Bruce Macdonald, H.E.; Fass, J.; Wieder, M.; Grinaway, P.B.; Roitberg, A.E.; Isayev, O.; Chodera, J.D. Towards chemical accuracy for alchemical free energy calculations with hybrid physics-based machine learning/molecular mechanics potentials. Preprint from bioRxiv 2020.
- Grisoni, F. Chemical language models for de novo drug design: Challenges and opportunities. Curr. Opin. Struct. Biol. 2023, 79, 102527.
- Korshunova, M.; Huang, N.; Capuzzi, S.; Radchenko, D.S.; Savych, O.; Moroz, Y.S.; Wells, C.I.; Willson, T.M.; Tropsha, A.; Isayev, O. Generative and reinforcement learning approaches for the automated de novo design of bioactive compounds. Commun. Chem. 2022, 5, 129.
- Grisoni, F.; Schneider, G. De Novo Molecular Design with Chemical Language Models. Methods Mol. Biol. 2022, 2390, 207–232.
- Ginex, T.; Vazquez, J.; Gilbert, E.; Herrero, E.; Luque, F.J. Lipophilicity in drug design: An overview of lipophilicity descriptors in 3D-QSAR studies. Future Med. Chem. 2019, 11, 1177–1193.
- Lewis, R.; Isert, C.; Kromann, J.; Stiefl, N.; Schneider, G. Machine intelligence models for fast, quantum mechanics-based approximation of drug lipophilicity. ACS Omega 2023, 8, 2046–2056.
- Magid, R.W.; Sheskin, M.; Schulz, L.E. Imagination and the generation of new ideas. Cogn. Dev. 2015, 34, 99–110.
- Wu, Z.; Zhu, M.; Kang, Y.; Leung, E.L.; Lei, T.; Shen, C.; Jiang, D.; Wang, Z.; Cao, D.; Hou, T. Do we need different machine learning algorithms for QSAR modeling? A comprehensive assessment of 16 machine learning algorithms on 14 QSAR data sets. Brief. Bioinform. 2021, 22, bbaa321.
- Coley, C.W.; Green, W.H.; Jensen, K.F. Machine Learning in Computer-Aided Synthesis Planning. Acc. Chem. Res. 2018, 51, 1281–1289.
- Peeperkorn, M.; Brown, D.; Jordanous, A. On Characterizations of Large Language Models and Creativity Evaluation; University of Kent: Canterbury, UK, 2023.
- Visan, A.I.; Negut, I. Integrating Artificial Intelligence for Drug Discovery in the Context of Revolutionizing Drug Delivery. Life 2024, 14, 233.
- Kirboga, K.K.; Abbasi, S.; Kucuksille, E.U. Explainability and white box in drug discovery. Chem. Biol. Drug Des. 2023, 102, 217–233.
- Dara, M.; Azarpira, N. Ethical Considerations Emerge from Artificial Intelligence (AI) in Biotechnology. Avicenna J. Med. Biotechnol. 2025, 17, 80–81.
- Raghunathan, S.; Priyakumar, U.D. Molecular representations for machine learning applications in chemistry. Int. J. Quantum Chem. 2022, 122, e26870.
- Nguyen-Vo, T.H.; Teesdale-Spittle, P.; Harvey, J.E.; Nguyen, B.P. Molecular representations in bio-cheminformatics. Memet. Comput. 2024, 16, 519–536.
- Grimberg, H.; Tiwari, V.S.; Tam, B.; Gur-Arie, L.; Gingold, D.; Polachek, L.; Akabayov, B. Machine learning approaches to optimize small-molecule inhibitors for RNA targeting. J. Cheminform. 2022, 14, 4.
- Tam, B.; Sherf, D.; Cohen, S.; Eisdorfer, S.A.; Perez, M.; Soffer, A.; Vilenchik, D.; Akabayov, S.R.; Wagner, G.; Akabayov, B. Discovery of small-molecule inhibitors targeting the ribosomal peptidyl transferase center (PTC) of M. tuberculosis. Chem. Sci. 2019, 10, 8764–8767.
- Zhang, Q.S.; Zhu, S.C. Visual interpretability for deep learning: A survey. Front. Inf. Technol. Electron. Eng. 2018, 19, 27–39.
- Wang, Y.; Zhang, T.; Guo, X.; Shen, Z. Gradient based feature attribution in explainable ai: A technical review. arXiv 2024, arXiv:2403.10415.
- Kindermans, P.-J.; Hooker, S.; Adebayo, J.; Alber, M.; Schütt, K.T.; Dähne, S.; Erhan, D.; Kim, B. The (Un)reliability of Saliency Methods. In Explainable AI: Interpreting, Explaining and Visualizing Deep Learning; Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.-R., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 267–280.
- Goel, M.; Aggarwal, R.; Sridharan, B.; Pal, P.K.; Priyakumar, U.D. Efficient and enhanced sampling of drug-like chemical space for virtual screening and molecular design using modern machine learning methods. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2023, 13, e1637.
- Reymond, J.L. The chemical space project. Acc. Chem. Res. 2015, 48, 722–730.
- Dobson, C.M. Chemical space and biology. Nature 2004, 432, 824–828.
- Reymond, J.L.; Awale, M. Exploring chemical space for drug discovery using the chemical universe database. ACS Chem. Neurosci. 2012, 3, 649–657.
- Li, Q. Application of Fragment-Based Drug Discovery to Versatile Targets. Front. Mol. Biosci. 2020, 7, 180.
- Scott, D.E.; Coyne, A.G.; Hudson, S.A.; Abell, C. Fragment-based approaches in drug discovery and chemical biology. Biochemistry 2012, 51, 4990–5003.
- Hopkins, A.L.; Groom, C.R.; Alex, A. Ligand efficiency: A useful metric for lead selection. Drug Discov. Today 2004, 9, 430–431.
- Abad-Zapatero, C.; Metz, J.T. Ligand efficiency indices as guideposts for drug discovery. Drug Discov. Today 2005, 10, 464–469.
- Erlanson, D.A.; Fesik, S.W.; Hubbard, R.E.; Jahnke, W.; Jhoti, H. Twenty years on: The impact of fragments on drug discovery. Nat. Rev. Drug Discov. 2016, 15, 605–619.
- Murray, C.W.; Rees, D.C. The rise of fragment-based drug discovery. Nat. Chem. 2009, 1, 187–192.
- Hall, R.J.; Mortenson, P.N.; Murray, C.W. Efficient exploration of chemical space by fragment-based screening. Prog. Biophys. Mol. Biol. 2014, 116, 82–91.
- Singh, M.; Tam, B.; Akabayov, B. NMR-Fragment Based Virtual Screening: A Brief Overview. Molecules 2018, 23, 233.
- Vidhya, K.S.; Sultana, A.; Kumar M, N.; Rangareddy, H. Artificial Intelligence’s Impact on Drug Discovery and Development From Bench to Bedside. Cureus 2023, 15, e47486.
- Sanchez-Lengeling, B.; Outeiral, C.; Guimaraes, G.L.; Aspuru-Guzik, A. Optimizing distributions over molecular space. An objective-reinforced generative adversarial network for inverse-design chemistry (ORGANIC). ChemRxiv 2017.
- Sanchez-Lengeling, B.; Aspuru-Guzik, A. Inverse molecular design using machine learning: Generative models for matter engineering. Science 2018, 361, 360–365.
- Menacer, R.; Bouchekioua, S.; Meliani, S.; Belattar, N. New combined Inverse-QSAR and molecular docking method for scaffold-based drug discovery. Comput. Biol. Med. 2024, 180, 108992.
- Chen, H.; Engkvist, O.; Wang, Y.; Olivecrona, M.; Blaschke, T. The rise of deep learning in drug discovery. Drug Discov. Today 2018, 23, 1241–1250.
- Polykovskiy, D.; Zhebrak, A.; Sanchez-Lengeling, B.; Golovanov, S.; Tatanov, O.; Belyaev, S.; Kurbanov, R.; Artamonov, A.; Aladinskiy, V.; Veselov, M.; et al. Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models. Front. Pharmacol. 2020, 11, 565644.
- Brown, N.; Fiscato, M.; Segler, M.H.S.; Vaucher, A.C. GuacaMol: Benchmarking Models for de Novo Molecular Design. J. Chem. Inf. Model. 2019, 59, 1096–1108.
- Ertl, P.; Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 2009, 1, 8.
- Thakkar, A.; Chadimova, V.; Bjerrum, E.J.; Engkvist, O.; Reymond, J.L. Retrosynthetic accessibility score (RAscore) – Rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem. Sci. 2021, 12, 3339–3349.
- Liu, X.; Zhang, W.; Tong, X.; Zhong, F.; Li, Z.; Xiong, Z.; Xiong, J.; Wu, X.; Fu, Z.; Tan, X.; et al. MolFilterGAN: A progressively augmented generative adversarial network for triaging AI-designed molecules. J. Cheminform. 2023, 15, 42.
- Iovanac, N.C.; Savoie, B.M. Simpler is Better: How Linear Prediction Tasks Improve Transfer Learning in Chemical Autoencoders. J. Phys. Chem. A 2020, 124, 3679–3685.
- Gomez-Bombarelli, R.; Wei, J.N.; Duvenaud, D.; Hernandez-Lobato, J.M.; Sanchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T.D.; Adams, R.P.; Aspuru-Guzik, A. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Cent. Sci. 2018, 4, 268–276.
- Kadurin, A.; Nikolenko, S.; Khrabrov, K.; Aliper, A.; Zhavoronkov, A. druGAN: An Advanced Generative Adversarial Autoencoder Model for de Novo Generation of New Molecules with Desired Molecular Properties in Silico. Mol. Pharm. 2017, 14, 3098–3104.
- Kotsias, P.C.; Arús-Pous, J.; Chen, H.M.; Engkvist, O.; Tyrchan, C.; Bjerrum, E.J. Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks. Nat. Mach. Intell. 2020, 2, 254–265.
- Mohapatra, S.; Yang, T.; Gómez-Bombarelli, R. Reusability report: Designing organic photoelectronic molecules with descriptor conditional recurrent neural networks. Nat. Mach. Intell. 2020, 2, 749–752.
- Hu, W. Inverse molecule design with invertible neural networks as generative models. J. Biomed. Sci. Eng. 2021, 14, 305–315.
- Jiménez-Luna, J.; Grisoni, F.; Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2020, 2, 573–584.
- Aliferis, C.; Simon, G. Overfitting, underfitting and general model overconfidence and under-performance pitfalls and best practices in machine learning and AI. In Artificial Intelligence and Machine Learning in Health Care and Medical Sciences: Best Practices and Pitfalls; Springer: Cham, Switzerland, 2024; pp. 477–524.
- Hanna, M.; Pantanowitz, L.; Jackson, B.; Palmer, O.; Visweswaran, S.; Pantanowitz, J.; Deebajah, M.; Rashidi, H. Ethical and Bias considerations in artificial intelligence (AI)/machine learning. Mod. Pathol. 2024, 38, 100686.
- Warr, W.A.; Nicklaus, M.C.; Nicolaou, C.A.; Rarey, M. Exploration of Ultralarge Compound Collections for Drug Discovery. J. Chem. Inf. Model. 2022, 62, 2021–2034.
- Krenn, M.; Ai, Q.; Barthel, S.; Carson, N.; Frei, A.; Frey, N.C.; Friederich, P.; Gaudin, T.; Gayle, A.A.; Jablonka, K.M.; et al. SELFIES and the future of molecular string representations. Patterns 2022, 3, 100588.
- Sheridan, R.P. Interpretation of QSAR models by coloring atoms according to changes in predicted activity: How robust is it? J. Chem. Inf. Model. 2019, 59, 1324–1337.
Task | ML Capabilities | ML Limitations | Human Expertise Capabilities | Human Expertise Limitations |
---|---|---|---|---|
Molecule Design/Synthetic Creativity | Rapidly generates novel molecular structures from extensive datasets. Efficient in exploring vast chemical spaces. Identifies subtle, non-obvious molecular features. | Struggles with understanding context-specific details, such as disease nuances. Lacks insight into biological or off-target effects. May not propose innovative designs beyond existing data. | Can generate truly novel ideas and hypotheses. Integrates biological, therapeutic, and regulatory contexts. Has intuition for unexpected solutions. | Time-consuming. Limited by experience in specific areas. Limited ability to use large datasets for pattern recognition. |
SAR analysis | Analyzes high-dimensional datasets to identify patterns. Quickly generates predictive models to assess activity across compounds. | Models can fail with noisy data. Models may struggle with complex relationships between molecular features. Performance relies on training data quality. | Expertise in interpreting contextual nuances of SAR data. Flexible in assessing novel relationships or unexpected biological effects. | Limited ability to analyze large datasets manually. Biases may influence interpretation. |
Toxicity Prediction | Capable of analyzing extensive datasets of toxicity reports to forecast potential toxicological effects. | May miss rare or context-specific toxicity risks. Lacks biological insight in certain cases. Cannot consider contextual factors such as patient-specific variables like age and comorbidities. | Contextual understanding of toxicity, such as tissue type and dosage. Ability to interpret complex, multifactorial interactions that lead to toxicity. | Insights may be generated slowly, delaying decision-making. May miss emerging trends or new patterns, hindering innovation and adaptability. |
Chemical Synthesis Planning | Can predict viable synthetic routes using comprehensive datasets from prior reactions. Identifies efficient reaction conditions and reagents. | Constrained by existing data, which may prevent the generation of truly novel synthesis pathways. Struggles with complex, multi-step synthesis that requires intuition or innovative thinking. | Ability to improvise when synthetic routes are ineffective. Intuitive understanding of how to adapt synthesis methods based on specific compound structures. | Struggles with scalability for a large number of compounds. Resource constraints in synthetic efforts can limit creativity. |
Biological Target Identification | Can analyze extensive genomic and proteomic datasets to identify potential drug targets. Employs a data-driven and systematic approach to searching for candidate targets. | Lacks contextual understanding of disease biology and patient variability. May overlook rare or novel biological targets that are not represented in the training data. | Deep understanding of biological mechanisms in specific diseases. Ability to contextualize target discovery through biological insight and experience. | Often struggles with large-scale data integration. Slower to recognize new or unexpected targets. |
Clinical Trial Design | Can optimize trial designs by analyzing data of a certain statistical distribution and identifying factors influencing patient outcomes. | Has difficulty integrating individual patient differences or ethical considerations. Lacks the ability to generate human-centered solutions based on societal needs. | Ability to design trials with a focus on human factors, ethics, and patient variability. Adapts trial design to emerging data and real-world conditions. | Time-consuming and resource-intensive. Can be inflexible when adapting existing designs to new conditions. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Trachtenberg, A.; Akabayov, B. From Patterns to Pills: How Informatics Is Shaping Medicinal Chemistry. Pharmaceutics 2025, 17, 612. https://doi.org/10.3390/pharmaceutics17050612