  • Review
  • Open Access

14 January 2026

From Algorithm to Medicine: AI in the Discovery and Development of New Drugs

1 Department of Pharmaceutical Sciences, University Institute of Health Sciences—CESPU (IUCS-CESPU), 4585-116 Gandra PRD, Portugal
2 Associate Laboratory i4HB—Institute for Health and Bioeconomy, University Institute of Health Sciences—CESPU (IUCS-CESPU), 4585-116 Gandra PRD, Portugal
3 UCIBIO—Applied Molecular Biosciences Unit, Translational Toxicology Research Laboratory, University Institute of Health Sciences (1H-TOXRUN, IUCS-CESPU), 4585-116 Gandra PRD, Portugal
4 LEPABE—Laboratory for Process Engineering, Environment, Biotechnology and Energy, ALiCE—Associate Laboratory in Chemical Engineering, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Abstract

The discovery and development of new drugs is a lengthy, complex, and costly process, often requiring 10–20 years to progress from initial concept to market approval, with clinical trials representing the most resource-intensive stage. In recent years, Artificial Intelligence (AI) has emerged as a transformative technology capable of reshaping the entire pharmaceutical research and development (R&D) pipeline. The purpose of this narrative review is to examine the role of AI in drug discovery and development, highlighting its contributions, challenges, and future implications for pharmaceutical sciences and global public health. A comprehensive review of the scientific literature was conducted, focusing on published studies, reviews, and reports addressing the application of AI across the stages of drug discovery, preclinical development, clinical trials, and post-marketing surveillance. Key themes were identified, including AI-driven target identification, molecular screening, de novo drug design, predictive toxicity modelling, and clinical monitoring. The reviewed evidence indicates that AI has significantly accelerated drug discovery and development by reducing timeframes, costs, and failure rates. AI-based approaches have enhanced the efficiency of target identification, optimized lead compound selection, improved safety predictions, and supported adaptive clinical trial designs. Collectively, these advances position AI as a catalyst for innovation, particularly in promoting accessible, efficient, and sustainable healthcare solutions. However, substantial challenges remain, including reliance on high-quality and representative biomedical data, limited algorithmic transparency, high implementation costs, regulatory uncertainty, and ethical and legal concerns related to data privacy, bias, and equitable access. 
In conclusion, AI represents a paradigm shift in pharmaceutical research and drug development, offering unprecedented opportunities to improve efficiency and innovation. Addressing its technical, ethical, and regulatory limitations will be essential to fully realize its potential as a sustainable and globally impactful tool for therapeutic innovation.

1. Introduction

The discovery and development of new drugs is a complex and time-consuming process that, according to the traditional method, can take 10 to 20 years, requiring substantial financial investments and considerable multidisciplinary resources [1,2]. This approach, based on experimental and empirical methods, has been put to the test, given the growing demand of the pharmaceutical sector and the need for innovative therapies within a short timeframe.
In an increasingly technological and digital world where computational advances are rapidly unfolding, Artificial Intelligence (AI) has emerged as a transformative and revolutionary tool, capable of acting at all stages of pharmaceutical development, from the discovery of new therapeutic targets to pharmacovigilance [1,3,4].
The combination of several AI techniques like machine learning (ML), deep learning (DL), and generative models has driven innovation through their application at different stages, such as
  • Virtual screening of chemical entities, both novel and already known molecules, to identify the drug candidates with the highest probability of success.
  • Modulation of bioactivity and toxicity (e.g., through deep neural networks (DNN)) [5,6].
  • De novo design of chemical structures with the desired pharmacological properties [7].
  • Prediction of receptor-ligand interactions (e.g., through three-dimensional neural networks such as “DeepAtom” [8,9]).
In addition, AI has played a crucial role in drug repurposing or repositioning, i.e., in identifying new therapeutic indications for existing drugs, thereby enabling their use in treating diseases different from those for which they were initially developed [7,8]. This approach addresses or simplifies key issues, such as those related to the synthetic design of new drugs. In fact, the impact of this technology is evident in recent examples such as Halicin, originally investigated as an anti-diabetic molecule and later identified, through DL-based prediction of mechanisms of action, as a promising antibiotic candidate [1], or the drug DSP-1181, developed in just 12 months to treat obsessive–compulsive disorder [10] but unfortunately abandoned in Phase I clinical trials.
However, despite the benefits, AI-driven approaches face significant challenges, including the need for high-quality data to ensure the effectiveness of predictive models, the proper interpretation of results, and also ethical and regulatory issues such as privacy concerns and data biases [7,11,12]. For AI to be used ethically and fairly, it is imperative to promote technical innovation while also establishing a clear and global regulatory framework [10].
Soon, the algorithm may “merge” with the molecules, and computational science will become a catalyst for therapeutic innovation, thus ushering in a new era in the discovery and development of new drugs and/or medicines. AI has multiple applications, standing out in pharmaceutical development for its integration with ML technologies (Figure 1). These allow the optimization of critical processes such as the classification of bioactive and inactive compounds, the evaluation of the pharmacokinetic properties of candidate drugs, and the stratified selection of patients for clinical trials, using specific computational techniques such as “Random Forest” (RF), regression algorithms and neural networks, among others [2,3,5]. DL, in turn, is applied to nonlinear relationships in high-dimensional biochemical and biological datasets. This approach uses sophisticated neural architectures, such as convolutional and recurrent networks, which demonstrate superior capabilities in analyzing and extracting features relevant to pharmaceutical research [2,3,5]. Furthermore, generative models enable the design of new molecules with tailored properties; thus, it becomes possible to create novel compounds, optimize existing drugs, and generate innovative molecular structures [2,3,11].
Figure 1. Pharmaceutical development applications of AI and its subfields.
The application of predictive models that establish correlations between chemical structures and the biological activity/physicochemical properties of molecules enables the analysis of critical parameters of pharmaceutical compounds, namely toxicological profiles, bioavailability (pharmacokinetic characteristics) and therapeutic potency. At the same time, these systems contribute to the molecular screening process, identifying candidates with a higher probability of success in pharmaceutical development [8,12].
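The structure–activity correlation described above can be caricatured in a few lines of code. The sketch below is purely illustrative: the two descriptors (logP and molecular weight scaled by 100), the five training compounds, and the generating relation are all invented, and the model is a plain least-squares fit rather than any production QSAR method.

```python
# Illustrative only: a toy QSAR-style model linking two invented molecular
# descriptors (logP and MW/100) to a hypothetical activity value via
# ordinary least squares. Real QSAR models use far richer descriptor sets.

def fit_linear(X, y):
    """Solve the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    for col in range(p):
        pivot = xtx[col][col]
        xtx[col] = [a / pivot for a in xtx[col]]
        xty[col] /= pivot
        for row in range(p):
            if row != col:
                f = xtx[row][col]
                xtx[row] = [a - f * b for a, b in zip(xtx[row], xtx[col])]
                xty[row] -= f * xty[col]
    return xty  # fitted weights [bias, w_logP, w_MW]

# Toy training set: rows are [1 (bias), logP, MW/100]; activities were
# generated as 1 + 2*logP + 0.5*(MW/100), so the fit should recover these.
X = [[1, 1.2, 1.8], [1, 2.5, 3.0], [1, 0.8, 1.5], [1, 3.1, 3.4], [1, 1.9, 2.2]]
y = [4.3, 7.5, 3.35, 8.9, 5.9]

w = fit_linear(X, y)
predict = lambda d: sum(wi * di for wi, di in zip(w, d))
print([round(c, 3) for c in w])          # ≈ [1.0, 2.0, 0.5]
print(round(predict([1, 2.0, 2.5]), 3))  # → 6.25, activity of a new compound
```

Trained on real assay data and richer descriptors, the same fit-then-predict pattern underlies the screening models discussed throughout this review.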
As already highlighted, the application of AI in the pharmaceutical area does not work as a stand-alone technology. In fact, it represents an integration of several technological tools, such as digital platforms and generative models. Among them, Bayesian statistical models (quantifying uncertainty), Deep Docking (a DL tool responsible for large-scale molecular screening, accelerating and optimizing the selection process), and Digital Twins (simulating virtual patients in clinical trials with personalized therapy), provide an anticipation of in vivo results, significantly increasing success rates over traditional pharmaceutical discovery approaches. This technological convergence enhances the effectiveness of development processes, responding to the growing demands of personalized medicine. It is through this multiplicity of complementary approaches that the revolutionary paradigm in scientific research is established [7,13,14].
This review adopts the AI-enhanced Design–Make–Test–Analyze (DMTA) cycle [15,16] as a conceptual framework to illustrate how AI systematically transforms pharmaceutical R&D from sequential, hypothesis-driven processes into integrated, data-driven iterative cycles. Specific AI technologies and their contributions to different DMTA stages are also analyzed.
Rather than positioning AI as a replacement for human expertise, this review examines how these AI tools can create a synergistic ecosystem that compresses development timelines from years to months while improving decision quality through continuous learning from each experimental cycle [15].

2. Materials and Methods

A comprehensive bibliographic search for this narrative review was conducted using PubMed, Scopus, Web of Science, and Google Scholar databases. The search strategy employed keywords and MeSH terms related to “artificial intelligence”, “machine learning”, “deep learning”, “neural network”, “convolutional neural network”, “natural language processing”, “computer-aided drug design”, “drug discovery”, “drug development”, “drug design”, “pharmaceutical research”, “clinical trials”, “pharmacovigilance”, “target identification”, “virtual screening”, “QSAR”, “ADMET prediction”. Boolean operators (AND, OR) were used to combine search terms and refine results, as well as synonyms and variants (e.g., “drug discovery” OR “drug development” OR “drug design”). A combination of controlled vocabulary (MeSH terms, subject headings) and free text searching with wildcards (e.g., “network*” to capture “network”, “networks”, “networking”) was applied to ensure comprehensive coverage. The search was performed on 23 June, 26 July and 21 December 2025.
Studies were included if they (1) focused on the application of AI and ML in drug discovery, development, or clinical applications; (2) were original research articles, systematic reviews, meta-analyses, perspective pieces, or expert commentaries; (3) were published primarily within the last five years (2020–2025) unless considered foundational or seminal works in the field; (4) were written in English; and (5) addressed topics including but not limited to AI-driven target identification, molecular design, formulation development, clinical trial design, regulatory considerations, or ethical implications.
Additionally, relevant articles from high-impact journals in pharmaceutical sciences, clinical oncology, and health technology were manually screened. Reference lists of selected articles were also reviewed to identify additional relevant publications. Grey literature, including conference proceedings and regulatory guidance documents, was consulted where appropriate to provide comprehensive coverage of emerging applications and regulatory perspectives on AI in pharmaceutical research.
The publication year distribution of the reviewed literature (n = 81) covers publications from 2006 to 2025, reflecting the rapidly growing research interest in AI applications for pharmaceutical development (Figure 2). The concentration of publications in recent years (2020–2025) demonstrates the field’s accelerating momentum, with 40 of the included papers (83.3%) published in the last five years. The temporal pattern illustrates the transition of AI in drug discovery from an emerging concept to an established research priority.
Figure 2. Distribution of studies by research focus area (n = 81 total references). (Left): Pie chart showing proportional distribution across seven thematic categories. (Right): Horizontal bar chart displaying absolute publication counts per category. Drug Discovery & Development represents the largest category (29 papers, 35.8%), followed by Technical AI Methods (25 papers, 30.9%) and Ethical & Regulatory Issues (10 papers, 12.3%). The top three categories collectively account for 79.0% of total references, demonstrating the review’s focused coverage of core AI applications in pharmaceutical R&D while maintaining balanced treatment of methodological, ethical, and disease-specific perspectives. The distribution reflects comprehensive coverage across the drug development pipeline from computational methods through clinical implementation and regulatory considerations.
The thematic distribution of the references selected in this narrative review demonstrates comprehensive, balanced coverage of AI applications across the pharmaceutical development continuum (Figure 2). The dominance of Drug Discovery & Development (35.8%) and Technical AI Methods (30.9%) reflects the review’s dual focus on practical applications and rigorous methodological analysis. The substantial representation of Ethical & Regulatory Issues (12.3%) addresses critical implementation challenges often overlooked in purely technical reviews. The balanced inclusion of disease-specific applications, clinical trials, and pharmaceutical technology perspectives ensures coverage of diverse stakeholder needs across academia, industry, and regulatory communities.

3. Drug Development Cycle: Traditional Versus AI-Driven Methods

The drug development cycle is a multidisciplinary and highly regulated process which aims to transform a molecule with desirable biological activity into a safe and effective pharmaceutical product. This process is traditionally divided into four main stages: discovery and development of new drugs, preclinical research, clinical research and post-marketing (pharmacovigilance) [4,16].

3.1. Discovery and Development

In the discovery and development stage, therapeutic targets (e.g., proteins, genes, or even metabolic pathways) where pharmacological modulation can be performed are identified. The selection of these targets involves high-throughput medicinal chemistry screening and modelling techniques [4,16]. Deficiencies in the screening process of candidate compounds with therapeutic potential compromise the entire subsequent process. AI tools, such as Deep Docking and AtomNet (a structure-based deep Convolutional Neural Network (CNN)), perform large-scale virtual screenings, making it possible to identify promising compounds more quickly and accurately [4].

3.2. Preclinical Research

Preclinical research includes in vitro and in vivo testing to assess safety, pharmacokinetics and pharmacodynamics before human trials. The use of DNN and Quantitative Structure–Activity Relationships (QSAR) models can predict physicochemical properties and toxicity before drugs are used in humans [5,17].

3.3. Clinical Research

The clinical stage involves the evaluation of the drug in humans. It is the most time-consuming step, representing the largest share of the total development costs.
The evaluation of the drug’s efficacy in humans is divided into phases:
  • Phase I: evaluation of safety, tolerability, and pharmacokinetics in about 20 to 100 healthy volunteers/patients. In this phase, AI assists by adjusting initial doses and predicting responses in order to minimize risk [1].
  • Phase II: evaluation of efficacy in a larger number of subjects (100 to 300) with the disease under study. AI techniques can support the selection and recruitment of suitable patients and, using genetic and clinical data, assist in designing trials tailored to each situation. They can also virtually simulate compounds and their potential interactions [2,3,10].
  • Phase III: involves a larger number of patients. It is at this stage that readiness for regulatory submission is determined. Failures in this step result in significant financial losses, but the use of AI to optimize protocols and predict risks allows for a substantial reduction in these losses [6,13].
Clinical trials are characterized by high failure rates, particularly in pathologies with complex pathophysiological mechanisms, such as Alzheimer’s disease and cancer. Drug candidates in these cases have a low probability of success (approximately 13%) when reaching the clinical trial phase [6,18,19]. The most prevalent problems include inadequate patient selection criteria for trial protocols, compromising the integrity of the results obtained [3]. The incorporation of AI into the new drug discovery process has significantly decreased clinical failure [10,13,14].

3.4. Post-Marketing

Considered by some authors as a clinical stage, the post-marketing or pharmacovigilance phase occurs after the product is marketed and involves monitoring safety and efficacy over time. At this stage, the use of AI to analyze extensive clinical databases and detect patterns of adverse reactions early is an asset for the pharmaceutical sector [17].
From the screening phase to the final success, several factors can influence the journey and compromise the results, often contributing to failures and high costs. One of the driving events of the current pharmaceutical revolution was the recent COVID-19 pandemic, which revealed the need to be prepared for emergencies and to accelerate the processes of research and development of therapeutic solutions. In recent years, industries have collaborated with AI companies, aiming to develop algorithms and models that can be implemented at different stages to improve efficiency and reduce costs [18].
As previously mentioned, the development and production of new drugs is a financially expensive, prolonged and highly complex process. Completing a full research cycle implies substantial costs, considering not only the direct investment, but also the resources applied to compounds that never reach commercialization and the long testing periods. The largest portion of this investment is concentrated in clinical trials, which represent about 63% of the total cost [2,4,16].
AI presents promising solutions that aim to drastically reduce costs by automating critical steps, such as the screening of molecules and their design [7]. The implementation of virtual screening, generative networks and platforms such as “AtomNet” reduces expenditure on laboratory experiments and the use of animals, directing part of the investment to computer simulations, which are more economical and efficient [4]. In addition to the direct costs of research, the return on investment must be considered. As this tool reduces development time, it also lowers the final price of medicines, making therapies more accessible to patients [16]. However, the use of complex algorithms can, on the other hand, increase technological costs, thus representing a new investment paradigm [20,21]. Quantitative assessments have clearly demonstrated AI’s transformative impact on development timelines and costs. Virtual screening reduces hit identification from 12–18 months of high-throughput screening ($1–2 M) to 1–4 weeks of computational analysis ($10 K–100 K), achieving 2–5× higher hit enrichment rates [22,23]. De novo molecular design compressed Insilico Medicine’s DDR1 kinase inhibitor from target to Phase I clinical trials in just 18 months versus the traditional 4–5 years [22]. ADMET prediction models (R² = 0.7–0.9 for key properties) enable early elimination of problematic candidates, reducing late-stage attrition by 20–30% and avoiding $500 K–2 M in wasted development costs per failed compound [24,25]. AI-enhanced clinical trial design achieves 30–40% faster patient recruitment through improved matching algorithms [26]. Collectively, these advances project potential reductions in total development time from 10–15 years to 3–6 years and costs from $2.6 B to $1.0–1.5 B per approved drug [27,28], though long-term validation of these estimates awaits the maturation of current AI-discovered candidates.

4. Fundamentals of the Different Artificial Intelligence Approaches and Drug Research

4.1. Artificial Intelligence with Supervision

The implementation of AI algorithms with supervised learning has shown relevance as one of the most efficient methodologies in accelerating and optimizing the development processes of new drugs and/or medicines. This technique is based on existing data and has crucial applications in molecular screening, toxicity prediction, identification of target molecules and evaluation of pharmacokinetic properties [21]. Using a previously compiled dataset, it trains models to predict or classify new molecules based on existing patterns. In the pharmaceutical sector, this translates into the ability to accurately predict efficacy, toxicity, solubility and receptor binding affinity [29].
The most widely used supervised learning methods are Artificial Neural Networks (ANN), Support Vector Machines (SVM), Random Forest (RF), and regression models [30,31]. These algorithms have been widely used to classify bioactive compounds, predict absorption, distribution, metabolism, excretion and toxicity (ADMET) and perform virtual screening of drug candidate molecules [3].
These models are applied in all phases of the pharmaceutical development process, ranging from early diagnosis to the prediction of clinical efficacy. Supervised methodologies also prove useful in training specialized systems such as “DeepTox”, “SPiDER” and “AiCure”, platforms that enable the analysis and prediction of toxicological profiles, bioactivity and the characteristics of patients suitable for clinical trial protocols, consequently optimizing the chances of success [3]. According to Kalayil et al., supervised models are often integrated with molecular representations, such as chemical descriptors, SMILES (Simplified Molecular Input Line Entry System) strings and molecular networks, which enable the prediction of the physicochemical and pharmacological properties of the target molecule [17].
When combined with experimental validation, supervised AI serves as a powerful decision-support tool for selecting potential candidates and designing clinical trials. The effectiveness of this model depends directly on the quality of the data; models trained with biased or incorrectly annotated data tend to reproduce these errors and increase the risk of clinical trial failures, or even lead to the exclusion of certain underrepresented populations [3,32]. There is also a risk of overfitting, which occurs when the ML model fits the training data but does not generalize to new data. Therefore, validation, held-out testing and continuous monitoring are essential for good development and satisfactory results [3]. From an ethical and regulatory point of view, constant human supervision is essential to prevent the results obtained from being applied indiscriminately, without clinical and scientific context.
Thus, AI with human oversight plays a key role in transforming pharmaceutical discovery and development (Figure 3). The ability to learn from existing data and generate useful and accurate predictions in a short amount of time contributes to cost reduction and increases clinical success rates. However, for its use to be safe, equitable and trustworthy, it is necessary to ensure the quality of the data used, the transparency of the models and ethical oversight at all stages. If applied responsibly, this model can become a powerful tool in the development of drugs and/or medicines that are more efficient in pathology and more patient centered.
Figure 3. AI-driven drug discovery pipeline showing eight key stages from data collection through post-market surveillance. Each stage specifies the AI/ML models employed (e.g., GNN and NLP for target identification, GAN/VAE/RNN/RL for de novo design). The continuous feedback loop enables iterative refinement. Bottom panel shows quantified AI impact: 85–90% time reduction, ~60% cost savings, and estimated success rate improvement from 13% to 20–30%.

4.2. Unsupervised Artificial Intelligence

Unsupervised AI has proven to be a powerful tool in pharmaceutical discovery and development, with an emphasis on cases where no labelled (pre-annotated) data are available. This approach permits exploring hidden patterns in large volumes of data, from biomedical data to clinical information. This area is essential in bioinformatics, pharmacogenomics and the design of new molecules [19]. Unlike the supervised technique, where models are trained on labelled data (e.g., active or inactive molecules), this model enables the discovery of patterns and irregularities within a dataset. The most used techniques include clustering, dimensionality reduction, hierarchical models, and principal component analysis (PCA), helping in grouping chemical compounds with similar properties, identifying subtypes of diseases and visualizing complex structures [5,21].
Unsupervised AI has played a key role in the discovery of new therapeutic targets, mainly because it enables the analysis of large genetic datasets. Through this analysis, it is possible to identify patterns of gene expression associated with certain diseases, without the need for prior annotation. In addition, molecular clustering can classify compounds based on their structural similarities, which is essential for molecular screening and for inferring the pharmacological properties of a molecule. Several techniques are used by this AI model, including “K-means” and Gaussian mixture models [33]. PCA is a resource used to reduce the complexity of molecular data, facilitating the visualization and interpretation of the patterns that guide molecular design. Both clustering and PCA have significant advantages, contributing to making this AI model valuable in the early stages of pharmaceutical research, when data are abundant [5].
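The clustering idea can be sketched with a hand-rolled k-means on toy data. The two descriptors per compound and the compound set below are invented for illustration; a real workflow would use full cheminformatics descriptor vectors.

```python
# Minimal k-means sketch grouping compounds by two invented descriptors
# (think logP and a scaled polar surface area). Toy data, toy initialization.

def kmeans(points, k, iters=50):
    centroids = points[:k]  # naive initialization: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each compound to its nearest centroid (squared Euclidean)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # recompute each centroid as the mean of its assigned compounds
        centroids = [tuple(sum(v) / len(c) for v in zip(*c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two loose groups of compounds in descriptor space
compounds = [(1.0, 0.8), (1.2, 1.0), (0.9, 1.1),   # polar, low logP
             (4.0, 3.8), (4.2, 4.1), (3.9, 4.0)]   # lipophilic
centroids, clusters = kmeans(compounds, k=2)
print(sorted(len(c) for c in clusters))  # → [3, 3]: the two groups are recovered
```

With real descriptor vectors, the recovered groups would correspond to structurally similar compound families, the starting point for the target and scaffold analyses described above.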
Despite its advantages, this tool poses some challenges, the most relevant being the correct interpretation of the results. When dealing with unlabeled data (without prior classification), any discovered patterns must be rigorously validated by experts in the field [34]. Another challenge is computational scalability, since these methods must process chemical databases containing vast numbers of compounds; however, unsupervised AI techniques continue to be improved to overcome this limitation [17].
In short, unsupervised AI has a key role in advancing pharmacology. The ability to explore more complex and non-existent data makes it essential for the identification of biological patterns, molecular groupings and the discovery of new therapeutic targets. However, its effectiveness is dependent on careful interpretations of results and rigorous scientific validation. When associated with supervised models, it proves to be innovative in terms of medicine, making it more accurate, predictive and personalized.
Table 1 establishes a parallel between the traditional and AI approaches, and the advantages of using the unsupervised/supervised AI models.
Table 1. Traditional versus AI-driven methods, and the advantages of using the unsupervised/supervised AI models.

4.3. Artificial Intelligence Techniques

The transformation in the pharmaceutical industry has been driven by advances in AI. Various techniques have been used to solve challenges that have persisted for years in pharmaceutical research [19,29,30].

4.3.1. Supervised Models

As previously mentioned, this technique is one of the most common approaches, especially for classification and regression tasks. Through data analysis, it is possible to train the models to predict properties such as bioactivity, toxicity or absorption. Among supervised models, the most used are [3,31]:
A. Support Vector Machines (SVM)—a classification model that finds the optimal separating hyperplane between two classes. It works well even with little data, provided the classes can be clearly separated.
B. Artificial Neural Networks (ANN)—inspired by the human brain, this model can learn from data. The most common architectures are:
i. Multi-Layer Perceptron (MLP)—classifies and predicts values according to the data.
ii. Convolutional Neural Networks (CNN)—analyze images and molecular structures.
iii. Recurrent Neural Networks (RNN)—analyze sequential data and permit the generation of new molecules based on the chemical sequence.
C. “Random Forest” (RF)—permits the evaluation and prediction of the toxicity of a compound according to its chemical characteristics. It is simple and helps to avoid the overfitting common to individual decision trees.
D. Logistic Regression—classifies data into two groups and calculates the probability of an event according to the variables.
These techniques are widely used to predict the ADMET of a compound, perform virtual screening and classify compounds.
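As a concrete toy instance of the supervised models above, the sketch below trains a logistic-regression classifier (item D) to separate "active" from "inactive" compounds using two invented descriptors; the data, learning rate, and decision threshold are all fabricated for illustration.

```python
# Illustrative logistic regression separating "active" (1) from "inactive" (0)
# compounds using two toy descriptors, trained by plain stochastic gradient
# descent on invented, linearly separable data.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=2000):
    w = [0.0] * (len(X[0]) + 1)  # descriptor weights plus a bias (last entry)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            err = p - yi              # gradient of the log-loss w.r.t. the logit
            for j in range(len(xi)):
                w[j] -= lr * err * xi[j]
            w[-1] -= lr * err
    return w

X = [[0.2, 0.1], [0.4, 0.3], [0.3, 0.2],   # inactive compounds
     [2.1, 1.9], [2.4, 2.2], [1.9, 2.0]]   # active compounds
y = [0, 0, 0, 1, 1, 1]

w = train(X, y)
classify = lambda x: int(sigmoid(sum(a * b for a, b in zip(w, x)) + w[-1]) > 0.5)
print([classify(x) for x in X])  # → [0, 0, 0, 1, 1, 1] on this separable toy set
```

Production ADMET classifiers follow the same recipe at scale, swapping the two toy descriptors for hundreds of computed molecular features and adding the validation safeguards discussed in Section 4.1.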

4.3.2. Unsupervised Models

These techniques are used when the data are unlabeled, that is, when no predefined categories exist. In this case, the algorithm detects patterns, groupings, or latent structures on its own. Its scope of application includes [33,35]:
A. “K-means” and other clustering algorithms—assess molecular clustering and identify disease subtypes.
B. Principal Component Analysis (PCA)—reduces the dimensionality of compound descriptors and enables visualization of the chemical space.
C. Self-Organizing Maps (SOM)—project large molecular datasets onto 2D maps.
These techniques are essential to understand the chemical diversity of molecules and identify unknown relationships between compounds.
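The PCA step (item B) can be illustrated with a hand-rolled sketch. The two strongly correlated descriptors are invented, and the first principal component is found by power iteration rather than a library call, to keep the mechanics visible.

```python
# Sketch of PCA via power iteration: project compounds described by two
# correlated toy descriptors onto their first principal component.

def first_pc(data, iters=100):
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    centered = [[x - m for x, m in zip(row, means)] for row in data]
    # 2x2 covariance matrix of the centered descriptors
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(2)]
           for i in range(2)]
    v = [1.0, 0.0]
    for _ in range(iters):  # power iteration converges to the dominant eigenvector
        w = [cov[0][0] * v[0] + cov[0][1] * v[1],
             cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = [w[0] / norm, w[1] / norm]
    scores = [c[0] * v[0] + c[1] * v[1] for c in centered]  # 1-D projections
    return v, scores

# Strongly correlated toy descriptors: PC1 should point along the diagonal
data = [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.9)]
v, scores = first_pc(data)
print(v)  # both components ≈ 0.7: most variance lies along the descriptor diagonal
```

In practice the same projection, applied to high-dimensional descriptor sets, is what produces the 2-D "chemical space" maps used to visualize compound libraries.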

4.3.3. Deep Learning Models

DL models represent a breakthrough in neural networks, as they can learn representations of complex biomolecular data. The most important models include [17]:
A. CNN—used on visual or structural representations of molecules.
B. RNN—generate molecules from sequential representations.
C. Transformers—inspired by natural language processing and used to represent molecules as token sequences.
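Sequence models such as RNNs and Transformers first need molecules encoded as token sequences. The toy tokenizer below illustrates this preprocessing step; its regular expression covers only a handful of common SMILES patterns (two-letter halogens, bracket atoms, ring-closure digits, branches) and is not a complete SMILES grammar.

```python
# Toy tokenizer turning a SMILES string into the symbol sequence an RNN or
# Transformer would consume. Illustrative only: real SMILES tokenizers must
# handle stereochemistry marks, charges, and many more element symbols.
import re

# Match two-letter halogens first, then bracket atoms, ring digits, and
# a small set of single-character symbols.
TOKEN_RE = re.compile(r"Cl|Br|\[[^\]]+\]|\d|[A-Za-z=#()+-]")

def tokenize(smiles):
    return TOKEN_RE.findall(smiles)

print(tokenize("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
# → ['C', 'C', '(', '=', 'O', ')', 'O', 'c', '1', 'c', 'c', 'c', 'c', 'c',
#    '1', 'C', '(', '=', 'O', ')', 'O']
```

Each token is then mapped to an integer or embedding vector, after which the molecule can be fed to a sequence model exactly like a sentence in natural language processing.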

4.3.4. Generative Models

Generative techniques revolutionized molecular design. They can create molecules with specific properties, reducing the time to discovery. In fact, they exemplify the paradigm shift in the Design phase of the DMTA cycle, moving from empirical medicinal chemistry to computationally guided molecular innovation that enables exploration of vast chemical spaces impossible through traditional methods. Among generative models, the following stand out [34]:
A. Variational Autoencoders (VAE)—a model that compresses data into a latent space and reconstructs it to generate new molecules. It is particularly useful for creating molecules structurally similar to known drugs.
B. Generative Adversarial Networks (GAN)—an advanced DL model that works in two parts: the Generator, which creates synthetic data mimicking real data, and the Discriminator, which evaluates whether the data is real or synthetic [16,19]. This approach allows the creation of innovative molecules with the desired pharmacological properties and increases the diversity of existing compounds. It is successfully applied in de novo molecular design, allowing the creation of chemical structures that meet criteria of solubility, bioactivity, and toxicity (Figure 4) [34]. Additionally, it enhances the efficiency of chemical space exploration and accelerates the development of drug candidates.
Figure 4. GAN architecture for de novo drug design showing the complete workflow from data preprocessing through candidate selection. The Generator Network creates novel molecules from random noise, while the Discriminator Network evaluates authenticity. Post-training validation includes property prediction, validity checking, and candidate ranking. Performance metrics indicate 85–95% chemical validity, 60–80% novelty, and 70–85% success in meeting target criteria, highlighting both the capabilities and limitations of generative approaches discussed in the critical analysis (based on reference [36]).
C.
Reinforcement learning (RL) is a technique in which the AI learns by trial and error: the model takes actions and receives a reward when the outcome is favorable or a penalty when it is not. In this way, molecules and their processing can be iteratively optimized. These methods enable the continuous generation and refinement of compounds, as well as the simulation of experimental cycles.
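The trial-and-error loop described above can be sketched in a few lines. This is a deliberately minimal illustration, not a real RL molecular generator: the "molecule" is a single number, and `reward` is an invented stand-in for a predicted property such as potency.

```python
import random

def reward(x):
    """Invented 'property score': peaks at x = 7, standing in for a
    desirability score such as predicted potency."""
    return -(x - 7) ** 2

def optimize(start, steps=200, seed=0):
    """Trial-and-error loop: propose a small mutation and keep it only
    if the reward does not worsen (a greedy sketch of RL-style feedback)."""
    rng = random.Random(seed)
    current = start
    for _ in range(steps):
        candidate = current + rng.choice([-1, 1])
        if reward(candidate) >= reward(current):
            current = candidate
    return current

best = optimize(start=0)
```

In practice, the proposal step would be a learned policy mutating molecular graphs or SMILES strings, and the reward would come from property-prediction models rather than a fixed formula.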
Despite their promise, generative models face significant challenges in drug discovery. While producing diverse molecular structures, GANs generate 10–30% non-synthesizable or structurally unstable molecules due to insufficient chemical-rule constraints [36,37]. By contrast, VAEs tend toward conservative outputs, often reproducing training-set features rather than exploring novel chemical space [36]. Critically, these models lack prospective validation in real synthesis campaigns—most reported success rates derive from retrospective computational benchmarks that may overestimate practical utility [37,38]. Current evaluation metrics like the quantitative estimate of drug-likeness (QED) focus on structural validity but inadequately assess synthetic accessibility [39]. Consequently, the field urgently needs standardized evaluation frameworks that assess molecular validity, synthetic accessibility, novelty, and experimental confirmation rates [40].

4.3.5. Bayesian Models

These models are a fundamental approach for handling uncertainty, that is, for scenarios with scarce or low-confidence data. They can incorporate prior knowledge into the prediction of pharmacological properties, translating uncertainty into probability distributions and thereby yielding more robust predictions [13,16]. Bayesian models thus offer greater transparency in the interpretation of data, which is valued at the regulatory and clinical levels [19]. Indeed, they quantify the uncertainty of each prediction, increasing confidence in the selection of candidate compounds [34]. In addition, they can predict ADMET properties through uncertainty modelling, contributing to reduced risks and costs. In short, these models reinforce the scientific confidence needed for the future of pharmaceutical development.
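To make the idea concrete, the sketch below applies a conjugate Bayesian update (Beta prior plus binomial screening outcomes) to estimate a screening hit rate together with an explicit uncertainty. The prior and the counts are invented toy numbers; real pharmacological models use far richer likelihoods.

```python
import math

def beta_posterior(prior_a, prior_b, hits, misses):
    """Conjugate Bayesian update for a hit rate: a Beta(prior_a, prior_b)
    prior combined with binomial data gives a Beta posterior (a, b)."""
    return prior_a + hits, prior_b + misses

def beta_mean_std(a, b):
    """Posterior mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# Weakly informative prior, then 3 hits out of 20 assayed compounds (toy data).
a, b = beta_posterior(prior_a=1, prior_b=1, hits=3, misses=17)
mean, std = beta_mean_std(a, b)
```

The standard deviation is the "uncertainty translated into a probability distribution" the text refers to: with more data, it shrinks, and confidence in compound selection grows.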

4.3.6. Molecular Representations and Techniques of QSAR/QSPR

QSAR and Quantitative Structure–Property Relationship (QSPR) techniques are used to predict relationships between the chemical structure and the biological activity or properties of molecules [1]. Their combination with AI models, using molecular representations such as SMILES strings and molecular fingerprints, enables molecular screening, drug repositioning, and the optimization of lead compounds [6].
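As a toy illustration of how a SMILES string can be turned into a fixed-length numerical input for QSAR models, the sketch below hashes character n-grams into a bit vector. This is a crude stand-in for real substructure fingerprints (e.g., Morgan fingerprints computed with a cheminformatics toolkit), intended only to show the representation idea.

```python
def smiles_fingerprint(smiles, n_bits=64, n=3):
    """Toy fingerprint: hash character n-grams of a SMILES string into a
    fixed-length bit vector (a crude stand-in for substructure keys)."""
    bits = [0] * n_bits
    for i in range(len(smiles) - n + 1):
        gram = smiles[i:i + n]
        # Simple deterministic hash so results are reproducible across runs.
        h = 0
        for ch in gram:
            h = (h * 31 + ord(ch)) % n_bits
        bits[h] = 1
    return bits

aspirin = smiles_fingerprint("CC(=O)Oc1ccccc1C(=O)O")
caffeine = smiles_fingerprint("Cn1cnc2c1c(=O)n(C)c(=O)n2C")
```

Vectors like these are what downstream QSAR/QSPR regressors and classifiers actually consume, which is why representation choices matter as much as model choices.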

4.4. Identification and Evaluation of Therapeutic Targets with Artificial Intelligence

The identification of molecular targets is a crucial step in the process of discovering new drugs. A therapeutic target can be a protein, with or without catalytic capacity, or a nucleic acid involved in the pathophysiological mechanism of the disease, whose modulation can generate therapeutic effects [19].
By traditional methods, this identification can take years. With the advancement of AI, this process has been significantly accelerated and optimized, enabling targets to be identified and evaluated with greater accuracy and efficiency. AI models can analyze large volumes of biomedical data, identifying relevant hidden patterns and associations. This allows for the rapid and accurate mapping of proteins, metabolic pathways and molecular interactions, facilitating the discovery of new therapeutic targets [2].
As explained, both supervised and unsupervised AI models can be applied in the analysis of complex biological systems. In addition, ML algorithms are used to build protein–protein interaction networks and predict biomarkers with therapeutic potential, especially in complex and multifactorial diseases such as cancer, Alzheimer’s disease and autoimmune diseases. For the discovery of new therapeutic targets, the most used tools include CNN, which analyze genetic sequences and protein expression; Graph Neural Networks (GNN), which represent the molecules and identify the central points of interaction; and DL models combined with natural language processing, capable of extracting relationships between genes and diseases [3]. In addition, the “SPiDER” platform has played an important role in the prediction of molecular targets, excelling in the identification of protein-binding interactions [5].
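A minimal example of network-based target prioritization: ranking proteins in a toy protein–protein interaction graph by degree centrality (number of interaction partners). The edge list is invented; real analyses draw on curated PPI databases and on richer centrality measures or GNNs.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Count interaction partners per protein in an undirected
    protein-protein interaction (PPI) network."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)

# Invented toy network; hub proteins are often prioritized as targets.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
ranking = sorted(degree_centrality(edges).items(), key=lambda kv: -kv[1])
```

Here "P1" emerges as the hub; in a real pipeline such hubs would then be weighed against druggability, safety, and selectivity criteria before being nominated.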
DL algorithms can predict with high accuracy the affinity between molecules and their targets, thus speeding up the evaluation stage before validation [34]. Finally, AI also helps in the prioritization and evaluation of therapeutic targets, according to biological relevance, the possibility of modulation, safety, and selectivity.
Overall, AI is redefining the way therapeutic targets are discovered and evaluated. This technology can integrate diverse biomedical data, identify biological patterns, and predict interactions. However, for these approaches to be accepted at the scientific and regulatory level, it is crucial to ensure transparency, rigorous validation and the integration of biomedical knowledge.

4.5. Design of New Compounds

AI has made its most significant impact in the field of molecular design. This approach makes it possible to create, from scratch or based on known chemical structures, new molecules with the appropriate and desired properties.
VAE, GAN, and RL models are widely used to generate new structures, operating on data from known molecules. In this way, they can propose new compounds with higher predicted chances of clinical success [3]. These models also make it possible to explore previously uncharted regions of chemical space, creating molecules that comply with defined synthetic and pharmacokinetic constraints [34].

4.6. Artificial Intelligence Screening and Optimization

Virtual screening is an essential step to evaluate, on a large scale, the molecules that are most likely to bind to the biological target under study. With the use of AI, this process is enhanced, allowing the analysis of millions of compounds in a matter of hours or days, reducing the costs and time involved. Among the most applied techniques in this context are DNN, SVM, RF and QSAR, used for their ability to predict bioactivity, toxicity and binding affinity of compounds to molecular targets during screening. At the same time, tools such as “Deep Docking” perform molecular screenings with greater precision and effectiveness in complex therapeutic contexts, such as cancer [5]. Regarding the optimization of pharmacological properties, this step consists of improving crucial characteristics of the selected compounds, such as potency, selectivity, toxicity, and solubility, among others. Evolutionary algorithms and LR techniques allow adjustments in molecular structures until they reach the desired properties, maintaining the balance between efficacy and safety. In addition, the combination of QSAR/QSPR models with AI can predict with greater precision the relationships between the chemical structure and the pharmacological activity of compounds, contributing to accelerating the prioritization of promising molecules [13].
In this sense, virtual screening accelerates the Test phase by enabling in silico evaluation of millions of candidates before synthesis, fundamentally altering the economics and speed of lead identification within the iterative DMTA framework.
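The core calculation behind ligand-based virtual screening can be sketched as a Tanimoto-similarity ranking over binary fingerprints; the query and library bit vectors below are hand-made toys, not real molecules.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two binary fingerprints:
    |A intersect B| / |A union B| over the 'on' bit positions."""
    on_a = {i for i, bit in enumerate(fp_a) if bit}
    on_b = {i for i, bit in enumerate(fp_b) if bit}
    union = on_a | on_b
    return len(on_a & on_b) / len(union) if union else 0.0

# Invented 8-bit fingerprints standing in for a query ligand and a library.
query = [1, 1, 0, 1, 0, 0, 1, 0]
library = {
    "cpd_A": [1, 1, 0, 1, 0, 0, 0, 0],
    "cpd_B": [0, 0, 1, 0, 1, 1, 0, 1],
    "cpd_C": [1, 1, 0, 1, 0, 0, 1, 1],
}
ranked = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)
```

Scaling this loop to millions of real fingerprints, combined with learned bioactivity models, is what compresses screening campaigns from months to hours.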

4.7. Emerging AI Technologies: Large Predictive and Language Models

Recent advances in AI have introduced transformative technologies that are reshaping pharmaceutical research, particularly in protein structure prediction and chemical language processing.
AlphaFold (developed by DeepMind) represents a breakthrough in computational biology by accurately predicting three-dimensional protein structures from amino acid sequences with near-experimental accuracy. Indeed, AlphaFold 2 demonstrated remarkable achievement in protein structure prediction, but the release of AlphaFold 3 (2024) marked a paradigm shift by extending predictions to include protein complexes, nucleic acids (DNA and RNA), and small molecule ligands—essentially all biomolecular interactions [41,42,43]. In fact, this technology has noticeably accelerated structural biology research, making previously inaccessible protein structures available for structure-based drug design. In terms of pharmacological purposes, AlphaFold 3 enables researchers to visualize binding sites with unprecedented accuracy, predict drug-target interactions including protein-ligand complexes, and design molecules that fit specific protein pockets without requiring time-consuming and expensive experimental crystallography or cryo-electron microscopy [41,42]. Also, in terms of predicting drug-like interactions, this model reaches 50% higher accuracy when compared with traditional physics-based methods, representing a significant advancement for rational drug design [43].
Complementing structure prediction advances, large language models (LLMs) trained on chemical data have emerged as powerful tools for molecular representation and generation. This is the case of MolGPT and ChemBERTa, which apply natural language processing techniques to molecular SMILES (Simplified Molecular Input Line Entry System) strings, treating chemical structures as a specialized language [44,45,46]. These models learn chemical grammar, molecular patterns, and structure–property relationships from millions of compounds, enabling them to generate novel molecules, predict chemical properties, optimize lead compounds, and suggest synthetic routes. For example, ChemBERTa-2 was trained on 77 million compounds from PubChem and has demonstrated superior performance in molecular property prediction tasks by leveraging self-supervised pretraining on massive chemical databases [44]. MolGPT, a transformer-decoder model, can generate diverse, drug-like molecules with desired pharmacological characteristics through conditional generation, essentially translating desired molecular properties into chemical structures [45]. The integration of these language models with traditional drug discovery workflows represents a paradigm shift toward treating molecular design as a language translation problem—converting therapeutic requirements into molecular structures [46].
As a matter of fact, these emerging technologies complement other AI approaches, offering orthogonal capabilities that, when integrated, create comprehensive AI-driven drug discovery platforms. The convergence of protein structure prediction (AlphaFold 3), generative molecular design (VAE, GAN, RL), and chemical language models (ChemBERTa, MolGPT) is enabling unprecedented efficiency in identifying targets, designing molecules, and predicting their interactions, fundamentally transforming the pharmaceutical development pipeline [41,46].

5. Cycle: Design, Make, Test, Analyze

The Design–Make–Test–Analyze (DMTA) cycle (Figure 5) is a concept of rational drug development, especially when combined with AI. In practice, AI makes it possible to integrate this cycle in a continuous and automated way, applying it in different phases of drug development. According to Balaguru S. and Gandra A. [35], the AI-assisted DMTA cycle comprises the following steps:
Figure 5. AI-Enhanced Design–Make–Test–Analyze (DMTA) Cycle. The iterative workflow shows how AI technologies accelerate drug optimization across four phases: (1) Design—generative models and molecular optimization; (2) Make—retrosynthesis and route planning; (3) Test—AI-guided experimental design; (4) Analyze—pattern recognition and model refinement. The continuous learning loop enables 2–4-week cycles versus 3–6 months traditionally, with a 75% reduction in time through data-driven decisions and reduced experimental burden.
  • Design: generation of new compounds through GAN and VAE.
  • Make: prioritization of compounds with better synthetic viability, based on certain chemical parameters.
  • Test: computer simulations of molecular interactions and evaluation of pharmacological properties.
  • Analyze: analysis of the experimental results, with adjustments to the models to achieve the desired characteristics.
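The four steps above can be sketched as a minimal closed loop. Every function here is a placeholder (a candidate is just a number, and the "assay" is an invented scoring function); the point is only the iterative structure of the cycle.

```python
import random

def design(rng):
    """Placeholder generative step: propose a candidate (here, a number)."""
    return rng.uniform(0, 10)

def make(candidate):
    """Placeholder synthesis step: pass the candidate through unchanged."""
    return candidate

def assay(candidate):
    """Placeholder test step: invented score peaking at 7 (stand-in for potency)."""
    return -abs(candidate - 7)

def dmta_cycle(n_rounds=50, seed=0):
    """Iterate Design-Make-Test-Analyze; 'Analyze' here simply keeps the
    best-scoring candidate to guide the next round."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_rounds):
        compound = make(design(rng))
        score = assay(compound)
        if score > best_score:  # Analyze: update the running optimum
            best, best_score = compound, score
    return best, best_score

best_compound, best_score = dmta_cycle()
```

In an AI-assisted pipeline, `design` would be a generative model conditioned on the Analyze step's feedback, and `assay` would combine in silico predictions with selected wet-lab experiments.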

5.1. Absorption, Distribution, Metabolization, Excretion and Toxicity

ADMET prediction serves as a critical filter in the Test-to-Analyze transition, allowing early elimination of compounds with unfavorable pharmacokinetic profiles and reducing costly late-stage attrition that has historically plagued drug development. Regardless of the demonstrated therapeutic potential, compounds often fail in clinical phases due to inadequate pharmacokinetic profiles. In this scenario, AI emerges as a promising and indispensable methodology for the efficient, accurate and economical optimization of these properties. AI can build models that simulate the behavior of the drug in the human body. These models are trained and adjusted based on chemical and clinical data, using advanced AI techniques. Similarly, AI has been successful in predicting liver toxicity, carcinogenesis, cardiac toxicity, and estimating the plasma half-life of the drug [5].
The absorption of drugs is correlated with their ability to cross biological barriers and reach the bloodstream. Distribution characterizes the patterns of drug dispersion through the different biological compartments. In these domains, AI has played an important role in predicting intestinal permeability, plasma protein binding, tissue distribution, and central nervous system penetration [5,16,19]. The use of neural networks and QSAR models has contributed to the evaluation of these same properties, reducing the need for tests in the initial phases of studies [3].
In the subsequent stages of the ADMET process, metabolization is assumed as the process of chemical biotransformation that the drug undergoes in the body, mainly in the liver, and excretion refers to the elimination of the drug through the renal, biliary or other routes. The role of AI in these steps is associated with the prediction of interactions with cytochrome P450 enzymes, toxic metabolite formation, clearance, and plasma half-life [16,34]. In fact, AI makes it possible to predict metabolic pathways and excretion profiles, enabling the selection of safer compounds before animal testing.
In addition to the properties mentioned, it is essential to predict possible interactions of drugs with membrane transporters, since these interactions influence the bioavailability (and thus pharmacokinetics) and accumulation of drugs in tissues [34]. The study of drug–transporter interactions thus contributes to the reduction in adverse effects and to the development of drugs with greater tissue penetration. QSAR and QSPR models are used in conjunction with AI to correlate molecular features with ADMET properties. The implementation of molecular descriptors (numerical or mathematical representations of a molecule), fingerprint techniques, and SMILES representations makes it possible to predict the toxicity and behavior of compounds [17].
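A simple example of an early ADMET-style filter is Lipinski's rule of five applied to precomputed descriptors; the descriptor values below are invented for illustration, and real workflows compute them from structures with a cheminformatics toolkit.

```python
def passes_rule_of_five(desc):
    """Lipinski's rule of five: flag likely poor oral absorption.
    By convention, at most one violation is tolerated."""
    violations = sum([
        desc["mol_weight"] > 500,
        desc["logp"] > 5,
        desc["h_donors"] > 5,
        desc["h_acceptors"] > 10,
    ])
    return violations <= 1

# Invented descriptor values for two hypothetical candidates.
candidates = {
    "cpd_1": {"mol_weight": 320.4, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    "cpd_2": {"mol_weight": 610.7, "logp": 6.3, "h_donors": 4, "h_acceptors": 12},
}
kept = [name for name, d in candidates.items() if passes_rule_of_five(d)]
```

ML-based ADMET predictors go far beyond such fixed rules, but they play the same role in the pipeline: eliminating unpromising compounds before expensive synthesis and testing.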

5.2. Patient Selection and Challenges

Adequate patient selection is one of the main challenges in clinical development, and it is essential to identify individuals with the highest probability of responding to therapy.
Clinical trial failure is often associated with a low rate of adequate patient recruitment. In this context, AI has been increasingly used to improve this process, through the analysis of clinical and genetic data, which allows the identification and selection of patients who meet the inclusion criteria [2]. According to Gallego, V. et al., unsupervised clustering models help to identify patients who are most likely to have a positive response to treatment [3]. In this approach, neural networks and ML algorithms predict individual drug response, based on pharmacogenomic data [6].
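The unsupervised clustering idea can be sketched with a minimal k-means on invented (biomarker, response-score) pairs; real patient stratification uses far richer, validated clinical and genomic features and more robust methods.

```python
import random

def kmeans(points, k=2, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Invented (biomarker level, response score) pairs forming two loose groups.
patients = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (4.8, 5.1), (5.2, 4.9)]
centroids, clusters = kmeans(patients)
```

The two recovered groups would correspond, in a real trial, to subpopulations with distinct expected treatment responses, guiding inclusion criteria and enrichment designs.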
Regarding the prediction of adverse reactions, AI intervenes even before treatment begins, anticipating cardiovascular, hepatic, and allergic risks. This makes it possible to exclude candidates at high risk of adverse effects, increasing study safety, improving approval rates for new drugs, and reducing costs associated with clinical failures [34].
The application of AI in clinical research has also been essential for the selection of personalized treatments, based on the genetic and epigenetic profile, the microbiome, as well as the lifestyle and habits of patients. Through this analysis, it is possible to recommend specific and more effective therapies for patients [32].
Despite significant advances, ethical and transparency issues persist in the use of AI in patient selection. To overcome this limitation, AI must be used in a supervised manner, with continuous validation and auditing of the algorithms applied.

5.3. Advantages of AI in Pharmaceutical Research Versus Data Challenges and Solutions

As previously discussed regarding the various techniques, the emergence of AI has been a clear asset for pharmaceutical research, contributing to cost reduction, higher success rates, and the advancement of personalized therapies [5,6,13,34]. The main advantages of AI include:
  • acceleration of the drug discovery cycle, reducing timelines from roughly 10 years to as little as 12 months;
  • design of new compounds with optimal molecular properties, molecular screening, and prediction of ADMET properties;
  • promotion of personalized medicine, focused on the needs and data of the individual patient;
  • support for industrial production, through quality improvement, waste reduction, and increased reliability of industrial processes;
  • drug repositioning, i.e., identification of new therapeutic uses for existing drugs.
The evolution and growing need for AI in research and the pharmaceutical industry are therefore evident. The future of drug discovery and development will inevitably be associated with these technologies [47].
Nonetheless, AI model performance is fundamentally constrained by data quality challenges across the drug development pipeline. Data standardization remains problematic, with heterogeneous molecular representations (SMILES, SELFIES, 3D structures) and inconsistent bioactivity units complicating model training [48,49]. In addition, data scarcity particularly affects specialized applications: rare disease targets have few training examples, while failed compounds are systematically underreported in public databases like ChEMBL, creating survivorship bias [48]. Solutions include transfer learning (pre-training on large chemical libraries before fine-tuning on task-specific datasets) [50], data augmentation through SMILES enumeration and molecular perturbations [50], and federated learning approaches that enable collaborative model training without exposing proprietary data [51]. Finally, active learning strategies may help address scarcity by intelligently selecting the most informative experiments, maximizing knowledge gain from limited resources [52].
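The active-learning idea mentioned above can be sketched as uncertainty sampling: from an unlabeled pool, select for experimental testing the compounds whose predicted activity probability is closest to 0.5, where the model is least sure. The pool and its probabilities are invented.

```python
def select_most_informative(predictions, budget=2):
    """Uncertainty sampling: choose the compounds whose predicted
    activity probability is closest to 0.5 (maximum model uncertainty)."""
    by_uncertainty = sorted(predictions, key=lambda name: abs(predictions[name] - 0.5))
    return by_uncertainty[:budget]

# Invented model probabilities for an unlabeled compound pool.
pool = {"cpd_A": 0.97, "cpd_B": 0.52, "cpd_C": 0.08, "cpd_D": 0.45, "cpd_E": 0.71}
chosen = select_most_informative(pool)
```

Labeling uncertain compounds first typically improves the model faster per experiment than random selection, which is why active learning helps stretch limited assay budgets.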
Table S1 (Supplementary Data) shows an overview of artificial intelligence techniques in drug discovery and development.

6. Application of Artificial Intelligence in Managing Specific Pathologies

6.1. Alzheimer’s Disease

Alzheimer’s disease is a progressive neurodegenerative disorder which affects millions of individuals globally. It is responsible for causing severe cognitive decline and memory impairment. An early and accurate diagnosis of this disease is crucial for effective intervention, thus minimizing and delaying the symptoms of the disease [53].
The role of AI in Alzheimer’s disease has transformed diagnosis, treatment, and prognosis [53]. Early diagnosis has been achieved through AI techniques applied to magnetic resonance imaging or Positron Emission Tomography (PET), as well as the analysis of biomarkers and genetic data, which enable the identification of the initial signs of the disease with greater precision. Used together, these tools improve assessments of the patient’s cognitive and language function [54]. In treatment, AI streamlines drug discovery and permits personalizing therapy according to individual responses. For example, Exscientia used AI to develop DSP-0038, a small molecule targeting psychosis in Alzheimer’s disease. Acting as a 5-HT2A antagonist and 5-HT1A agonist, its dual mechanism aims to provide antipsychotic effects with improved safety and tolerability compared with existing drugs. Regarding prognosis, multimodal data and DL make it possible to predict disease progression more accurately and improve quality of life [54,55]. Although challenges remain, AI shows promise in transforming the approach to Alzheimer’s disease.

6.2. Cancer

Cancer is a highly complex disease, and its treatment is often long and painful. Many types of cancer exist, and for several of them the currently available treatments remain inadequate.
In oncology, the use of AI has helped transform patient care and therapy. AI allows the real-time analysis of genetic data and the integration of heterogeneous information. The development of anticancer drugs takes an average of 12 to 15 years, and AI has made it possible to shorten this period [56]. The development and production of oncological drugs require appropriate models that evaluate drug-target affinity (“DeepDATA”) and models of interactions between different RNAs, such as the study by Chu, A. et al. [18]. In this sense, generative AI models, such as GANs, have helped design molecules with optimized properties (for example, through docking studies and pharmacological property prediction), thus reducing the failure rate [56].
The clinical trial phase is also optimized by digital platforms that monitor adherence to therapy and personalize treatment using molecular data and clinical history, improving efficacy and safety. The reduction in side effects and toxicity stems from the application of algorithms and from studies performed before compounds enter the clinical phase. Several studies focused on drugs for specific cancers, such as HER2-positive breast cancer, have resulted in the development of more targeted compounds. Research on STAT3 inhibitors in acute myeloid leukemia has likewise enabled targeted treatment and increased therapeutic success rates. These advances, supported by AI, have achieved greater therapeutic efficacy and safety, outlining a promising future for diagnosis and treatment [56].

6.3. Diabetes Mellitus

Diabetes mellitus is a chronic disease characterized by excess glucose in the blood due to a lack of insulin or the body’s difficulty in using it. There are three main states: prediabetes (in which glucose levels are above normal but still undiagnosed), type 1 diabetes (an autoimmune disease that destroys insulin-producing cells, requiring treatment with injectable insulin) and type 2 diabetes (this state develops over time and can be controlled with a healthy lifestyle or medication). If left unchecked, this disease can cause complications such as diabetic retinopathy, nephropathy, neuropathy, cardiovascular disease, and atherosclerosis [57].
In addressing this disease, AI has enabled notable advances. For example, ML techniques make it possible to predict glucose variations accurately, and automated insulin-delivery systems simulate physiological administration, reducing the burden of blood glucose management on the patient. It has also been possible to automate medical reports, prioritize patients who need medical referral, and issue clinical alerts [58].
On the other hand, AI makes it possible to detect complications such as diabetic ketoacidosis in children early, to analyse population data in order to predict diabetes risk, and to inform public policies that address risk factors and epidemiological trends [58]. In the future, greater investment in these systems and the increasing integration of AI in health promotion are expected.

6.4. Bacterial Infections

Bacterial resistance to drugs has been increasing. To counter this trend, several researchers have used DNNs to identify new antibiotics [23].
Halicin is one antibiotic discovered with the help of AI. This drug has a molecular structure distinct from “classical” antibiotics and a broad spectrum of bactericidal activity that includes some resistant bacterial species, namely Mycobacterium tuberculosis and Acinetobacter baumannii [23].

6.5. Obsessive–Compulsive Disorder

DSP-1181, a potent long-acting serotonin (5-hydroxytryptamine) receptor agonist, was the first AI-assisted drug for the treatment of obsessive–compulsive disorder to reach the clinical trial stage. Exscientia, together with partner companies, developed this drug in less than 12 months, from initial screening to entry into clinical trials [59]. This represented an innovative milestone, but its development was unfortunately abandoned in phase I clinical trials.

7. Ethical, Regulatory Implementation and Societal Challenges

The advancement of AI in drug discovery and development has resulted in promising advances, but it has also exposed several ethical, regulatory, and societal challenges that require structured responses. The main challenges include algorithm bias, informed consent, data privacy, the absence of regulation by regulatory bodies (U.S. Food and Drug Administration (FDA), European Medicines Agency (EMA), and International Council for Harmonisation (ICH)), and socio-economic impact.

7.1. Ethical Challenges

7.1.1. Algorithm Bias and Health Equity

Algorithm bias is one of the most debated points, since AI models trained on non-representative data can create inequalities. This can lead to less accurate diagnoses and less effective therapies for ethnic minorities, the elderly, women, or small populations, as their data can be misinterpreted or even neglected by the system [19]. This unintended reproduction of bias can have serious consequences for diagnosis, screening, and therapeutic personalization. To correct this problem, it is essential to implement bias-auditing protocols, use diverse and representative data, and adopt inclusive design principles from the beginning of the process [9].

7.1.2. Data Privacy and Informed Consent

Another challenge is how patients’ clinical and genetic data are selected, processed, and shared. The role of AI in pharmacology is related to the analysis of large volumes of data, which are sometimes obtained without the patient fully understanding how they will be used and by whom [19]. The Nightingale project [7] illustrates the ethical risks of the lack of clarity for the patient regarding the use of their data, i.e., the lack of transparency in consent. In addition, many nominally anonymized databases can be re-identified by advanced algorithms. This lack of transparency undermines confidentiality and demands stricter regulation of information use, informed consent, and data traceability.

7.1.3. Socio-Economic Impact

Another ethical challenge of AI is the socio-economic impact this technology has on the pharmaceutical industry. If, on the one hand, AI reduces the costs and development time of new drugs, enabling access to personalized therapies [19], on the other hand, it aggravates global inequality by increasing the power of the corporations that dominate algorithms and data. It is also necessary to consider the possibility of replacing health professionals with automated systems, which could raise unemployment and devalue technical and health professions. In addition, developing countries, given their still precarious digital infrastructure, risk being excluded from this new economy, widening the innovation gap between countries [9].

7.2. Regulatory Challenges and Implementation Frameworks

7.2.1. Algorithmic Explainability Challenge

The most advanced models, such as DL architectures, often generate results that cannot be clearly explained, which hinders the scientific validation and regulation of this technology [2,19]. This limitation also compromises the approval of AI-based medicines and the trust of health professionals. Therefore, there is an urgent need to develop specific regulations for the use of AI in the health sector. The lack of specific regulations by regulatory bodies such as the FDA, EMA, and ICH is another challenge to the implementation and consolidation of AI in the pharmaceutical industry. Such regulation should focus on explainable models, mandatory documentation, regular audits, and continuous human oversight to validate clinical decisions.

7.2.2. Current Regulatory Frameworks and Solutions

Regulatory frameworks are evolving to address AI-specific challenges in drug development. The FDA’s AI/ML-Based Software as a Medical Device Action Plan (2021) and subsequent guidance (2023) establish predetermined change control plans (PCCPs) for models that learn post-deployment, addressing “model drift” through required algorithm change protocols and real-world performance monitoring [60,61]. Algorithmic explainability remains a critical hurdle: while deep learning models achieve high predictive accuracy, their “black box” nature complicates regulatory review. Solutions include SHAP (SHapley Additive exPlanations) values for feature importance analysis [62] and attention mechanism visualization in transformer models [63]. The EMA’s Reflection Paper on AI in Medicines (2023) emphasizes validation requirements specific to drug discovery contexts, mandating external validation datasets and documentation of training data provenance [64]. Notably, AI-discovered drugs face dual regulatory pathways: the AI tool itself may require qualification as a development tool, while the resulting drug candidate follows traditional approval routes, creating procedural complexity that companies like Exscientia and Insilico Medicine are currently navigating with first-in-human trials of AI-designed molecules [65,66].
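The intuition behind SHAP-like attributions can be sketched with a much simpler occlusion analysis on a toy linear scoring model: measure how the prediction changes when each feature is replaced by a baseline. The weights and descriptor values are invented, and this is not an implementation of SHAP itself.

```python
def model_score(features, weights):
    """Toy 'black box': a weighted sum of molecular descriptors."""
    return sum(features[name] * weights[name] for name in features)

def occlusion_importance(features, weights, baseline=0.0):
    """Crude attribution: the score change when each feature is replaced
    by a baseline value (a much-simplified cousin of SHAP values)."""
    full = model_score(features, weights)
    importance = {}
    for name in features:
        perturbed = dict(features, **{name: baseline})
        importance[name] = full - model_score(perturbed, weights)
    return importance

weights = {"logp": 0.8, "mol_weight": -0.1, "h_donors": 0.3}   # invented
features = {"logp": 2.0, "mol_weight": 3.5, "h_donors": 1.0}   # invented, scaled
attrib = occlusion_importance(features, weights)
```

For a linear model, these attributions coincide with each feature's weighted contribution; for real nonlinear models, SHAP's averaging over feature coalitions is what restores a consistent decomposition, which is exactly the property regulators value.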

7.3. Path Forward

Consolidating and evaluating the points addressed, there is no doubt that the incorporation of AI in drug discovery represents a scientific advance. However, the success of this technology also depends on the ability to address the ethical, regulatory and social challenges that are intrinsically associated with it. It is essential to promote transparent and equitable regulation, thus building fair, explainable and human-centric AI.

8. Discussion and Future Perspectives

8.1. Achievements and Critical Gaps

The integration of artificial intelligence into pharmaceutical research and development represents a paradigm shift with profound implications for global health. This review has examined AI’s applications across the drug development continuum, from target identification through post-market surveillance, while critically evaluating both the demonstrated successes and persistent limitations of current approaches.
AI has unequivocally accelerated specific stages of drug development, particularly virtual screening, molecular property prediction, and patient stratification for clinical trials. The technology has reduced development timelines for certain compounds from years to months, exemplified by cases such as DSP-1181 and the identification of Halicin as a novel antibiotic. These successes demonstrate AI’s capacity to augment human expertise and handle the increasing complexity and data volume in modern pharmaceutical research. However, this review also identifies critical gaps between AI’s computational promise and clinical reality. The translation of AI-designed molecules to approved therapeutics remains limited, with most success stories still in preclinical or early clinical stages. Methodological concerns—including inadequate validation practices, dataset biases, lack of standardized benchmarks, and insufficient reporting of failures—raise questions about the generalizability and reproducibility of many reported findings. Furthermore, the concentration of AI expertise and resources in well-funded institutions and for well-studied targets risks exacerbating existing inequalities in global health and pharmaceutical innovation.

8.2. Future Technological Directions

AI continues to evolve toward ever broader use in pharmaceutical research, consolidating its role as a core tool in the discovery and development of drugs and medicines.
Despite its many capabilities, AI still has points to refine. A clear future trend is the consolidation of integrated AI platforms, which will require validating approaches that span molecular design, virtual screening, ADMET analysis, and interactive simulations. In this way, AI can combine cutting-edge technology with traditional drug discovery strategies, making development more efficient and automated [67,68]. For example, the study by Kadan et al. culminated in a multi-objective generative model that combines target affinity with synthetic accessibility; it exemplifies the coming generation of platforms able to explore as-yet-unstudied regions of chemical space and generate target-specific ligands more rapidly [69]. Several technological trajectories show particular promise for advancing AI’s impact on pharmaceutical development:
  • Integrated Multi-Stage Platforms: future AI systems will likely integrate multiple functions—molecular generation, property prediction, synthetic route planning, and formulation optimization—into unified platforms that optimize across the entire development pipeline rather than individual stages in isolation. Such platforms will need to balance computational efficiency with mechanistic interpretability to satisfy both scientific and regulatory requirements [68,69].
  • Physics-Informed Machine Learning: hybrid approaches that combine data-driven learning with physics-based simulations and mechanistic biological models offer potential to improve generalization beyond training data while maintaining interpretability. This includes incorporating quantum mechanical calculations, molecular dynamics simulations, and systems biology models into AI workflows [68].
  • Active Learning and Experimental Design: AI systems that actively propose experiments to maximize information gain represent an evolution from passive prediction to active experimental design. Such systems could dramatically reduce the number of experiments required to identify successful drug candidates, particularly valuable for expensive in vivo studies and clinical trials [69,70].
  • Multimodal Data Integration: future AI applications will increasingly integrate diverse data types—genomic, proteomic, metabolomic, clinical, imaging, and real-world evidence—to achieve a more comprehensive understanding of disease mechanisms and drug responses. This holistic approach is particularly critical for complex diseases and personalized medicine applications [54,55,70].
  • Explainable AI for Regulatory Acceptance: Development of intrinsically interpretable AI architectures, alongside post-hoc explanation methods, will be essential for regulatory acceptance. Future AI systems must not only predict outcomes but also provide mechanistic rationale that domain experts can evaluate and regulators can scrutinize [71,72].
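As a minimal illustration of the active-learning idea sketched above, the snippet below selects the next compound to "assay" by uncertainty sampling: a simple distance-weighted classifier scores each candidate, and the one whose predicted probability is closest to 0.5 is queried first. The two-dimensional descriptors, the toy labels, and the weighting rule are illustrative assumptions, not any specific platform's method.

```python
import math

def weighted_proba(labeled, x):
    # distance-weighted probability that compound x is "active" (class 1)
    num = den = 0.0
    for xi, y in labeled:
        w = 1.0 / (math.dist(xi, x) + 1e-9)
        num += w * y
        den += w
    return num / den

def select_next(labeled, pool):
    # uncertainty sampling: query the candidate whose predicted probability
    # is closest to 0.5, i.e. where the model is least sure
    return min(pool, key=lambda x: abs(weighted_proba(labeled, x) - 0.5))

# toy descriptors: "inactives" cluster near (0, 0), "actives" near (1, 1)
labeled = [((0.0, 0.1), 0), ((0.1, 0.0), 0), ((1.0, 0.9), 1), ((0.9, 1.0), 1)]
pool = [(0.5, 0.5), (0.05, 0.05), (0.95, 0.95)]

print(select_next(labeled, pool))  # the ambiguous midpoint is queried first
```

In a real campaign the queried compound would be tested experimentally, its label added to the training set, and the loop repeated, which is how such systems reduce the number of expensive in vivo experiments.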

8.3. Emerging Application Frontiers

8.3.1. Precision Medicine and Therapeutic Personalization

In therapeutic personalization, AI plays an essential role, not only in rare diseases but also in complex pathophysiological conditions, enabling the discovery of therapies tailored to each patient’s genetic and clinical profile. It can accelerate clinical trials, refine patient selection, and guide optimal dosing through specific biomarkers [70]. Generative algorithms will further speed clinical studies by supporting larger sample analyses, adapting treatment to individual profiles and helping prevent adverse reactions [55,56,71]. Beyond traditional stratification by demographics and disease stage, AI will increasingly enable the identification of predictive biomarkers that stratify patients for targeted therapies on the basis of molecular and multi-omic profiles, which is particularly critical for oncology, rare diseases, and conditions with heterogeneous patient populations [70,71].

8.3.2. Expanding Therapeutic Modalities

AI could also support gene therapy and other forms of therapy not yet available in healthcare, pointing toward a more holistic approach that combines regenerative medicine, pharmacology, and gene therapy [56]. Beyond current applications, several emerging frontiers show particular promise:
  • Drug Repurposing at Scale: Systematic, AI-driven exploration of existing drugs for new indications could rapidly expand therapeutic options, particularly for rare diseases and emerging health threats. The COVID-19 pandemic demonstrated both the potential and limitations of this approach, highlighting the need for improved prediction of off-target effects and clinical outcome modeling [19,23].
  • Combination Therapy Optimization: AI methods for predicting synergistic drug combinations could address complex diseases requiring polypharmacy, such as cancer, infectious diseases, and metabolic disorders. However, the combinatorial explosion of possible drug pairs and the scarcity of combination therapy data represent significant challenges requiring innovative experimental and computational strategies [56,69].
  • Gene and Cell Therapy Design: AI applications are emerging in designing viral vectors, guide RNAs for CRISPR systems, and engineered cell therapies. These represent potentially transformative applications at the intersection of AI, synthetic biology, and regenerative medicine [56,71].
  • Global Health and Neglected Diseases: AI offers opportunities to accelerate drug discovery for diseases disproportionately affecting low- and middle-income countries, where traditional pharmaceutical development models have failed due to limited commercial incentives. However, realizing this potential requires deliberate efforts to address data scarcity, infrastructure limitations, and capacity building in these regions [23].

8.4. Computational Sustainability and Resource Considerations

While AI promises efficiency gains, its substantial computational costs warrant consideration. Training large molecular generation models requires GPU/TPU infrastructure costing USD 10,000–1,000,000 per training run, with associated energy consumption raising environmental concerns; for context, GPT-3 training generated approximately 552 metric tons of CO2 equivalent [73]. This creates accessibility barriers favoring resource-rich institutions and potentially limiting AI adoption in academic settings and low- and middle-income countries. “Green AI” strategies address these concerns through model lightweighting techniques [74]: pruning removes 50–90% of parameters with minimal performance loss [75], quantization reduces precision requirements (INT8 vs. FP32) [76], and knowledge distillation transfers large-model capabilities to smaller, deployable versions [77]. Transfer learning reduces the need to train from scratch, while carbon-aware job scheduling leverages renewable energy availability [78,79]. Cost–benefit analysis reveals that computational expenses (approximately USD 50 K for a virtual screening campaign) remain orders of magnitude below the experimental costs they avoid (USD 1–2 M for high-throughput screening) [23], but holistic sustainability assessments must consider both experimental and computational resource consumption in the transition to AI-driven pharmaceutical development.
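The unstructured magnitude-pruning idea behind model lightweighting [75] can be sketched in a few lines: weights with the smallest absolute value are zeroed, leaving a sparse layer. This pure-Python sketch is illustrative only; production pipelines use framework utilities such as PyTorch's `torch.nn.utils.prune`, and the toy weight values below are assumptions.

```python
def prune_by_magnitude(weights, amount=0.5):
    """Zero the `amount` fraction of weights with smallest |w|
    (unstructured magnitude pruning). Ties at the threshold may
    prune slightly more than `amount`."""
    flat = sorted(abs(w) for w in weights)
    k = int(len(flat) * amount)          # number of weights to remove
    threshold = flat[k - 1] if k > 0 else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

# toy layer: half the weights are near-zero and contribute little
layer = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.08]
sparse = prune_by_magnitude(layer, amount=0.5)
print(sparse)  # the four smallest-magnitude weights are zeroed
```

The sparse result can then be stored and executed more cheaply, which is the sense in which pruning trades a small accuracy loss for large compute and energy savings.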

8.5. Regulatory Evolution and Governance

The rapid acceleration of AI has brought with it the need for new regulatory frameworks. As automated platforms expand, it is crucial to harmonize guidelines across regulatory bodies and to establish consortia that unite researchers, industry, and policymakers [72,80]. Regulatory frameworks must therefore evolve alongside AI in pharmaceutical development. Future directions include:
  • AI-Specific Regulatory Guidance: Regulatory agencies must develop clear standards for validating AI-generated predictions, documenting training datasets, assessing algorithmic bias, and monitoring post-deployment performance. International harmonization of these standards will be critical for global drug development [2,72,80].
  • Adaptive Regulatory Pathways: Traditional sequential regulatory processes may need adaptation for AI-designed therapeutics, potentially including iterative model validation, real-world evidence integration, and post-market surveillance using the same AI systems that aided discovery [72,80,81].
  • Ethical Governance Frameworks: As discussed in Section 6, addressing algorithmic bias, data privacy, equitable access, and socioeconomic impacts requires governance frameworks that extend beyond traditional pharmaceutical regulation to encompass broader societal considerations [9,19,20].
  • International Collaboration and Data Sharing: Realizing AI’s full potential requires international consortia that enable data sharing while preserving privacy, competitive interests, and intellectual property. Models such as federated learning, which enables training on distributed datasets without centralizing sensitive data, show promise but require further technical development and policy frameworks [80].
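The federated-learning idea mentioned above can be illustrated with a FedAvg-style sketch: each site fits a model on its own data and only model parameters, never raw patient records, are shared and averaged. The one-dimensional linear model, local update rule, and toy hospital datasets below are illustrative assumptions, not a production protocol.

```python
def local_update(w, data, lr=0.1, epochs=20):
    # one site's gradient-descent update for a linear model y = w * x;
    # raw patient data never leaves the site, only the updated weight does
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(w, site_datasets):
    # FedAvg-style round: each site trains locally, the "server"
    # averages the resulting weights into a new global model
    local_weights = [local_update(w, data) for data in site_datasets]
    return sum(local_weights) / len(local_weights)

# two hospitals with private data drawn from roughly y = 2x
site_a = [(1.0, 2.1), (2.0, 3.9)]
site_b = [(1.0, 1.9), (3.0, 6.2)]

w = 0.0
for _ in range(5):  # communication rounds
    w = federated_average(w, [site_a, site_b])
print(f"{w:.2f}")  # converges near 2.0 without pooling the data
```

Real deployments add secure aggregation and differential privacy on top of this loop, which is where the policy and technical frameworks cited above come in.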

8.6. Education and Workforce Development

The pharmaceutical AI revolution demands a new generation of professionals with hybrid expertise spanning AI methodologies, pharmaceutical sciences, clinical medicine, and regulatory affairs. Current graduate education programs insufficiently address this need. Future efforts should include:
  • Interdisciplinary training programs that deeply integrate AI, pharmaceutical sciences, and clinical training.
  • Continuing education for current pharmaceutical professionals to develop AI literacy.
  • Ethical training that prepares researchers to recognize and address bias, privacy, and equity concerns.
  • Industry-academic partnerships that provide hands-on experience with real-world drug development challenges.

8.7. Synthesis and Path Forward

The future points towards the integration of AI techniques throughout pharmaceutical research. However, AI technologies must become more explainable to enable more effective collaboration between humans and machines, ensuring greater safety, efficacy, and ethics in the development of new therapies.
Artificial intelligence is neither a panacea that will solve all challenges in pharmaceutical development nor a mere incremental improvement to existing methodologies. Rather, it represents a fundamental shift in how we discover, develop, and deploy therapeutics—one that offers tremendous promise but also presents novel challenges requiring thoughtful, coordinated responses from the scientific, medical, regulatory, and policy communities.
The path forward requires balancing enthusiasm for AI’s transformative potential with critical evaluation of its limitations, investing in high-quality prospective validation studies rather than relying solely on retrospective computational benchmarks, addressing methodological rigor and reproducibility concerns that currently limit scientific progress, ensuring equitable benefit distribution and addressing biases that could exacerbate health disparities, and developing regulatory frameworks that enable innovation while ensuring safety and efficacy.
Success will require unprecedented collaboration across traditional boundaries—between academia and industry, between computational and experimental scientists, between different therapeutic areas and geographic regions, and between technological innovation and humanistic medicine. As AI continues to evolve and mature, maintaining focus on the ultimate goal—developing safe, effective, accessible therapeutics that improve global health—must remain paramount. The future of pharmaceutical development will be defined not by AI replacing human expertise, but by effective human-AI collaboration that combines computational power with scientific insight, clinical wisdom, and ethical judgment to accelerate therapeutic innovation for the benefit of all humanity.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ai7010026/s1, Table S1: Overview of artificial intelligence techniques in drug discovery and development.

Author Contributions

Conceptualization, A.B.L., C.F.R. and F.A.M.S.; methodology, A.B.L., C.F.R. and F.A.M.S.; validation, C.F.R. and F.A.M.S.; formal analysis, A.B.L., C.F.R. and F.A.M.S.; investigation, A.B.L.; writing—original draft preparation, A.B.L.; writing—review and editing, C.F.R. and F.A.M.S.; supervision, C.F.R. and F.A.M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ADMET: Absorption, Distribution, Metabolism, Excretion and Toxicity
AI: Artificial Intelligence
ANN: Artificial Neural Networks
CNN: Convolutional Neural Networks
DL: Deep Learning
DMTA: Design–Make–Test–Analyze
DNN: Deep Neural Networks
EMA: European Medicines Agency
FDA: U.S. Food and Drug Administration
GAN: Generative Adversarial Networks
ICH: International Council for Harmonisation
ML: Machine Learning
PCA: Principal Component Analysis
QSAR: Quantitative Structure–Activity Relationships
QSPR: Quantitative Structure–Property Relationships
RF: Random Forest
RL: Reinforcement Learning
RNN: Recurrent Neural Networks
SMILES: Simplified Molecular Input Line Entry System
SVM: Support Vector Machines
VAE: Variational Autoencoders

References

  1. Banerjee, P.; Brahma, D.; Sarma, I.; Ray, N. Artificial Intelligence in Drug Development—Revolutionizing Drug Discovery and Clinical Trials. Acta Sci. Pharm. Sci. 2024, 8, 19–21. [Google Scholar] [CrossRef]
  2. Giaramita, H. AI Assistance in the Drug Development Process: Reaching for a Regulatory Framework. Seton Hall Law Rev. 2024, 54, 1239–1278. [Google Scholar] [CrossRef]
  3. Gallego, V.; Naveiro, R.; Roca, C.; Insua, D.; Campillo, N. AI in drug development: A multidisciplinary perspective. Mol. Divers. 2021, 25, 1461–1479. [Google Scholar] [CrossRef] [PubMed]
  4. Huanbutta, K.; Burapapadh, K.; Kraisit, P.; Sriamornsak, P.; Ganokratanaa, T.; Suwanpitak, K.; Sangnim, T. Artificial intelligence-driven pharmaceutical industry: A paradigm shift in drug discovery, formulation development, manufacturing, quality control, and post-market surveillance. Eur. J. Pharm. Sci. 2024, 203, 106938. [Google Scholar] [CrossRef] [PubMed]
  5. Narne, H. Advancements in Gen AI for Drug Discovery Accelerating Research Development. Int. J. Adv. Res. Eng. Technol. 2024, 15, 1–14. Available online: https://iaeme-library.com/index.php/IJARET/article/view/IJARET_15_05_001 (accessed on 21 December 2025).
  6. Mishra, D.K.; Awasthi, H. Artificial Intelligence: A New Era in Drug Discovery. Asian J. Pharm. Res. Dev. 2021, 9, 87–92. [Google Scholar] [CrossRef]
  7. Gupta, R.; Srivastava, D.; Sahu, M.; Tiwari, S.; Ambasta, R.K.; Kumar, P. Artificial intelligence to deep learning: Machine intelligence approach for drug discovery. Mol. Divers. 2021, 25, 1315–1360. [Google Scholar] [CrossRef]
  8. Unogwu, O.J.; Ike, M.; Joktan, O.O. Employing Artificial Intelligence Methods in Drug Development: A New Era in Medicine. Mesopotamian J. Artif. Intell. Healthc. 2023, 2023, 52–56. [Google Scholar] [CrossRef]
  9. Efthymiou, I.-P.; Anastasopoulou, C.; Livieri, G.; Briola, K.; Efthymiou, I. Ethical Issues Arising from the Use of AI in Drug Discovery. J. Politics Ethics New Technol. Artif. Intell. 2024, 3, e37093. [Google Scholar] [CrossRef]
  10. Mahato, T. Impact of AI In Drug Development and Clinical Studies: A Systematic Review. Eur. Chem. Bull. 2023, 12, 558–572. [Google Scholar]
  11. Fu, C.; Chen, Q. The future of pharmaceuticals: Artificial intelligence in drug discovery and development. J. Pharm. Anal. 2025, 15, 101248. [Google Scholar] [CrossRef]
  12. Husnain, A.; Rasool, S.; Saeed, A.; Hussain, H. Revolutionizing Pharmaceutical Research: Harnessing Machine Learning for a Paradigm Shift in Drug Discovery. Int. J. Multidiscip. Sci. Arts 2023, 2, 149–157. [Google Scholar] [CrossRef]
  13. Dudhe, A.; Dudhe, R.; Sakarkar, S.; Porwal, O. AI—New Avenue for Drug Discovery and Optimization. Clin. Oncol. Res. 2021, 4, 2–9. [Google Scholar] [CrossRef]
  14. Harrer, S.; Shah, P.; Antony, B.; Hu, J. Artificial Intelligence for Clinical Trial Design. Trends Pharmacol. Sci. 2019, 40, 577–591. [Google Scholar] [CrossRef] [PubMed]
  15. Schneider, P.; Walters, W.P.; Plowright, A.T.; Sieroka, N.; Listgarten, J.; Goodnow, R.A.; Fisher, J.; Jansen, J.M.; Duca, J.S.; Rush, T.S.; et al. Rethinking drug design in the artificial intelligence era. Nat. Rev. Drug Discov. 2020, 19, 353–364. [Google Scholar] [CrossRef]
  16. Paul, D.; Sanap, G.; Shenoy, S.; Kalyane, D.; Kalia, K.; Tekade, R.K. Artificial intelligence in drug discovery and development. Drug Discov. Today 2021, 26, 80–93. [Google Scholar] [CrossRef]
  17. Kalayil, N.; D’Souza, S.; Khan, S.; Paul, P. Arteficial Intelligence in Pharmacy Drug Design. Asian J. Pharm. Clin. Res. 2022, 15, 21–27. [Google Scholar] [CrossRef]
  18. Chu, A.; Liu, J.; Yuan, Y.; Gong, Y. Comprehensive Analysis of Aberrantly Expressed ceRNA network in gastric cancer with and without H.pylori infection. J. Cancer 2019, 10, 853–863. [Google Scholar] [CrossRef] [PubMed]
  19. Ferdouse, Z.; Islam, R.; Bhowmik, N.; Habibullah, M.; Sharma, D. AI and Machine Learning in Accelerating Drug Design: Opportunities, Challenges, and Future Directions. World J. Adv. Pharm. Sci. 2025, 2, 113–122. [Google Scholar] [CrossRef]
  20. Mishra, A.; Kan, Y. AI Meets Chemistry: Unlocking New Frontiers in Molecular Design and Reaction Prediction. Int. J. Artif. Intell. Sci. 2025, 1, 61–70. [Google Scholar] [CrossRef]
  21. Hirani, R.; Noruzi, K.; Khuram, H.; Hussaini, A.S.; Aifuwa, E.I.; Ely, K.E.; Lewis, J.M.; Gabr, A.E.; Smiley, A.; Tiwari, R.K.; et al. Artificial Intelligence and Healthcare: A Journey through History, Present Innovations, and Future Possibilities. Life 2024, 14, 557. [Google Scholar] [CrossRef] [PubMed]
  22. Zhavoronkov, A.; Ivanenkov, Y.A.; Aliper, A.; Veselov, M.S.; Aladinskiy, V.A.; Aladinskaya, A.V.; Terentiev, V.A.; Polykovskiy, D.A.; Kuznetsov, M.D.; Asadulaev, A.; et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 2019, 37, 1038–1040. [Google Scholar] [CrossRef]
  23. Stokes, J.M.; Yang, K.; Swanson, K.; Jin, W.; Cubillos-Ruiz, A.; Donghia, N.M.; MacNair, C.R.; French, S.; Carfrae, L.A.; Bloom-Ackermann, Z.; et al. A Deep Learning Approach to Antibiotic Discovery. Cell 2020, 180, 688–702.e13. [Google Scholar] [CrossRef] [PubMed]
  24. Vamathevan, J.; Clark, D.; Czodrowski, P.; Dunham, I.; Ferran, E.; Lee, G.; Li, B.; Madabhushi, A.; Shah, P.; Spitzer, M.; et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 2019, 18, 463–477. [Google Scholar] [CrossRef]
  25. Morgan, P.; Brown, D.G.; Lennard, S.; Anderton, M.J.; Barrett, J.C.; Eriksson, U.; Fidock, M.; Hamrén, B.; Johnson, A.; March, R.E.; et al. Impact of a five-dimensional framework on R&D productivity at AstraZeneca. Nat. Rev. Drug Discov. 2018, 17, 167–181. [Google Scholar] [CrossRef]
  26. Fogel, D.B. Factors associated with clinical trials that fail and opportunities for improving the likelihood of success: A review. Contemp. Clin. Trials Commun. 2018, 11, 156–164. [Google Scholar] [CrossRef]
  27. Wouters, O.J.; McKee, M.; Luyten, J. Estimated research and development investment needed to bring a new medicine to market, 2009–2018. J. Am. Med. Assoc. 2020, 323, 844–853. [Google Scholar] [CrossRef]
  28. Fleming, N. How artificial intelligence is changing drug discovery. Nature 2018, 557, S55–S57. [Google Scholar] [CrossRef]
  29. Farghali, H.; Kutinová Canová, N.; Arora, M. The potential applications of artificial intelligence in drug discovery and development. Physiol. Res. 2021, 70, S715–S722. [Google Scholar] [CrossRef]
  30. Cochran, J.J.; Cox, L.A., Jr.; Keskinocak, P.; Kharoufeh, J.P.; Smith, J.C.; Gilbert, R.C.; Trafalis, T.B.; Adrianto, I. Support Vector Machines for Classification. In Wiley Encyclopedia of Operations Research and Management Science; Cochran, J.J., Cox, L.A., Keskinocak, P., Kharoufeh, J.P., Smith, J.C., Eds.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2011. [Google Scholar] [CrossRef]
  31. Guido, R.; Ferrisi, S.; Lofaro, D.; Conforti, D. An Overview on the Advancements of Support Vector Machine Models in Healthcare Applications: A Review. Information 2024, 15, 235. [Google Scholar] [CrossRef]
  32. Paramasivan, A. The Future of Personalized Medicine AI-Driven Solutions in Drug Discovery and Patient Care. J. Sci. Eng. Res. 2021, 8, 256–263. [Google Scholar] [CrossRef]
  33. da Silva Simões, A. Unsupervised Learning in Pulsed Neural Networks with Radial Basis Function; University of São Paulo: São Paulo, Brazil, 2006. [Google Scholar] [CrossRef]
  34. Rayhan, A. Accelerating Drug Discovery and Material Design: Unleashing AI’s Potential for Optimizing Molecular Structures and Properties. RG Preprint 2023. [Google Scholar] [CrossRef]
  35. Balaguru, S.; Gandra, A. Unleashing Molecular Potential: A Process Discovery and Automation Workflow for Generative AI in Accelerating Drug Discovery. Int. J. Innov. Sci. Res. Technol. 2024, 9, 1235–1241. [Google Scholar] [CrossRef]
  36. Gómez-Bombarelli, R.; Wei, J.N.; Duvenaud, D.; Hernández-Lobato, J.M.; Sánchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T.D.; Adams, R.P.; Aspuru-Guzik, A. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 2018, 4, 268–276. [Google Scholar] [CrossRef]
  37. Polykovskiy, D.; Zhebrak, A.; Sanchez-Lengeling, B.; Golovanov, S.; Tatanov, O.; Belyaev, S.; Kurbanov, R.; Artamonov, A.; Aladinskiy, V.; Veselov, M.; et al. Molecular Sets (MOSES): A benchmarking platform for molecular generation models. Front. Pharmacol. 2020, 11, 565644. [Google Scholar] [CrossRef] [PubMed]
  38. Brown, N.; Fiscato, M.; Segler, M.H.S.; Vaucher, A.C. GuacaMol: Benchmarking models for de novo molecular design. J. Chem. Inf. Model. 2019, 59, 1096–1108. [Google Scholar] [CrossRef] [PubMed]
  39. Bickerton, G.R.; Paolini, G.V.; Besnard, J.; Muresan, S.; Hopkins, A.L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4, 90–98. [Google Scholar] [CrossRef] [PubMed]
  40. Grisoni, F.; Schneider, G. De novo molecular design with generative long short-term memory. Chimia 2019, 73, 1006–1011. [Google Scholar] [CrossRef]
  41. Fang, Z.; Ran, H.; Zhang, Y.; Chen, C.; Lin, P.; Zhang, X.; Wu, M. AlphaFold 3: An unprecedented opportunity for fundamental research and drug development. Precis. Clin. Med. 2025, 8, pbaf015. [Google Scholar] [CrossRef]
  42. Desai, N.; Kadam, A.; Bhaskar, S.; Kumar, A.; Khapre, S.; Rawat, M.; Tiwari, S.; Jain, P.; Shekhar, Y.; Yadav, S. Review of AlphaFold 3: Transformative Advances in Drug Design and Therapeutics. Cureus 2024, 16, e64901. [Google Scholar] [CrossRef]
  43. Abramson, J.; Adler, J.; Dunger, J.; Evans, R.; Green, T.; Pritzel, A.; Ronneberger, O.; Willmore, L.; Ballard, A.J.; Bambrick, J.; et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 2024, 630, 493–500. [Google Scholar] [CrossRef]
  44. Ahmad, W.; Simon, E.; Chithrananda, S.; Grand, G.; Ramsundar, B. ChemBERTa-2: Towards Chemical Foundation Models. arXiv 2022, arXiv:2209.01712. [Google Scholar] [CrossRef]
  45. Bagal, V.; Aggarwal, R.; Vinod, P.K.; Priyakumar, U.D. MolGPT: Molecular Generation Using a Transformer-Decoder Model. J. Chem. Inf. Model. 2021, 62, 2064–2076. [Google Scholar] [CrossRef] [PubMed]
  46. Sharma, S.; Gupta, S.; Sharma, R.; Sharma, D.K.; Sharma, A. Computational Landscape in Drug Discovery: From AI/ML Models to Translational Application. Scientifica 2025, 2025, 1688637. [Google Scholar] [CrossRef]
  47. Savage, N. Tapping into the drug discovery potential of AI. Biopharma Deal. 2021, B37–B39. [Google Scholar] [CrossRef]
  48. Gaulton, A.; Bellis, L.J.; Bento, A.P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107. [Google Scholar] [CrossRef]
  49. Artrith, N.; Butler, K.T.; Coudert, F.X.; Han, S.; Isayev, O.; Jain, A.; Walsh, A. Best practices in machine learning for chemistry. Nat. Chem. 2021, 13, 505–508. [Google Scholar] [CrossRef]
  50. Ramsundar, B.; Eastman, P.; Walters, P.; Pande, V. Deep Learning for the Life Sciences; O’Reilly Media: San Francisco, CA, USA, 2019. [Google Scholar]
  51. Warnat-Herresthal, S.; Schultze, H.; Shastry, K.L.; Manamohan, S.; Mukherjee, S.; Garg, V.; Sarveswara, R.; Händler, K.; Pickkers, P.; Aziz, N.A.; et al. Swarm learning for decentralized and confidential clinical machine learning. Nature 2021, 594, 265–270. [Google Scholar] [CrossRef] [PubMed]
  52. Settles, B. Active Learning Literature Survey; Computer Sciences Technical Report 1648; University of Wisconsin-Madison: Madison, WI, USA, 2009. [Google Scholar]
  53. Mahmud, T.; Barua, K.; Habiba, S.U.; Sharmen, N.; Hossain, M.S.; Andersson, K. An Explainable AI Paradigm for Alzheimer’s Diagnosis Using Deep Transfer Learning. Diagnostics 2024, 14, 345. [Google Scholar] [CrossRef]
  54. Kale, M.; Wankhede, N.; Pawar, R.; Ballal, S.; Kumawat, R.; Goswami, M.; Khalid, M.; Taksande, B.; Upaganlawar, A.; Umekar, M.; et al. AI-driven innovations in Alzheimer’s disease: Integrating early diagnosis, personalized treatment, and prognostic modelling. Ageing Res. Rev. 2024, 101, 102497. [Google Scholar] [CrossRef] [PubMed]
  55. Mak, K.K.; Pichika, M.R. Artificial intelligence in drug development: Present status and future prospects. Drug Discov. Today 2019, 24, 773–780. [Google Scholar] [CrossRef] [PubMed]
  56. Tyagi, E.; Kumari, P.; Prakash, A.; Bhuyan, R. Revolutionizing Anti-Cancer Drug Discovery: The Role of Artificial Intelligence. Int. J. Bioinform. Intell. Comput. 2025, 4, 1–38. [Google Scholar] [CrossRef]
  57. Harreiter, J.; Roden, M. Diabetes mellitus: Definition, classification, diagnosis, screening and prevention (Update 2023). Wien. Klin. Wochenschr. 2023, 135, 7–17. [Google Scholar] [CrossRef] [PubMed]
  58. Canha, D.; Bour, C.; Barraud, S.; Aguayo, G.; Fagherazzi, G. The transformative role of artificial intelligence in diabetes care and research. Diabetes Metab. 2024, 50, 101565. [Google Scholar] [CrossRef]
  59. Soni, K.; Hasija, Y. Artificial Intelligence Assisted Drug Research and Development. In Proceedings of the 2022 IEEE Delhi Section Conference (DELCON), New Delhi, India, 11–13 February 2022; IEEE: New York, NY, USA, 2022; pp. 1–10. [Google Scholar] [CrossRef]
  60. US Food and Drug Administration. Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. January 2021. Available online: https://www.fda.gov/media/145022/download (accessed on 21 December 2025).
  61. US Food and Drug Administration. Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence/Machine Learning (AI/ML)-Enabled Device Software Functions. April 2023. Available online: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/ (accessed on 21 December 2025).
  62. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. arXiv 2017. [Google Scholar] [CrossRef]
  63. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017. [Google Scholar] [CrossRef]
  64. European Medicines Agency. Reflection Paper on the Use of Artificial Intelligence (AI) in the Medicinal Product Lifecycle. Draft. July 2023. Available online: https://www.ema.europa.eu/en/documents/scientific-guideline/reflection-paper-use-artificial-intelligence-ai-medicinal-product-lifecycle_en.pdf (accessed on 21 December 2025).
  65. Exscientia. Exscientia Announces First Patient Dosed in Phase I Clinical Trial of EXS-21546. Press Release. January 2020. Available online: https://www.exscientia.ai/news (accessed on 21 December 2025).
  66. Gertrudes, J.C.; Maltarollo, V.G.; Silva, R.A.; Oliveira, P.R.; Honório, K.M.; da Silva, A.B.F. Machine learning techniques and drug design. Curr. Med. Chem. 2012, 19, 4289–4297. [Google Scholar] [CrossRef]
  67. Vora, L.K.; Gholap, A.D.; Jetha, K.; Thakur, R.R.S.; Solanki, H.K.; Chavda, V.P. Artificial Intelligence in Pharmaceutical Technology and Drug Delivery Design. Pharmaceutics 2023, 15, 1916. [Google Scholar] [CrossRef]
  68. Bhowmick, M.; Goswami, S.; Bhowmick, P.; Hait, S.; Rath, D.; Yasmin, S. Future prospective of AI in drug discovery. Adv. Pharmacol. 2025, 103, 429–449. [Google Scholar] [CrossRef] [PubMed]
  69. Kadan, A.; Ryczko, K.; Roitberg, A.; Yamazaki, T. Guided multi-objective generative AI to enhance structure-based drug design. Chem. Sci. 2025, 16, 13196–13210. [Google Scholar] [CrossRef]
  70. Mohapatra, M.; Sahu, C.; Mohapatra, S. Trends of Artificial Intelligence (AI) Use in Drug Targets, Discovery and Development: Current Status and Future Perspectives. Curr. Drug Targets 2025, 26, 221–242. [Google Scholar] [CrossRef] [PubMed]
  71. Angajala, S.R. Generative AI Revolutionizing Drug Discovery and Materials Science: A Descriptive Research Approach. Bull. Technol. Hist. J. 2024, 4, 86–90. [Google Scholar] [CrossRef]
  72. Tiwari, P.C.; Pal, R.; Chaudhary, M.J.; Nath, R. Artificial intelligence revolutionizing drug development: Exploring opportunities and challenges. Drug Dev. Res. 2023, 84, 1652–1663. [Google Scholar] [CrossRef]
  73. Strubell, E.; Ganesh, A.; McCallum, A. Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019, Florence, Italy, 28 July–2 August 2019; pp. 3645–3650. [Google Scholar] [CrossRef]
  74. Schwartz, R.; Dodge, J.; Smith, N.A.; Etzioni, O. Green AI. Commun. ACM 2020, 63, 54–63. [Google Scholar] [CrossRef]
  75. Han, S.; Pool, J.; Tran, J.; Dally, W.J. Learning both weights and connections for efficient neural networks. Adv. Neural Inf. Process Syst. 2015, 28, 1135–1143. [Google Scholar] [CrossRef]
  76. Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; IEEE: New York, NY, USA, 2018; pp. 2704–2713. [Google Scholar] [CrossRef]
  77. Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531. [Google Scholar] [CrossRef]
  78. Patterson, D.; Gonzalez, J.; Le, Q.; Liang, C.; Munguia, L.M.; Rothchild, D.; So, D.; Texier, M.; Dean, J. Carbon emissions and large neural network training. arXiv 2021, arXiv:2104.10350. [Google Scholar] [CrossRef]
  79. Anthony, L.F.W.; Kanding, B.; Selvan, R. Carbontracker: Tracking and predicting the carbon footprint of training deep learning models. arXiv 2020, arXiv:2007.03051. [Google Scholar] [CrossRef]
  80. Bender, A.; Cortés-Ciriano, I. Artificial intelligence in drug discovery: What is realistic, what are illusions? Part 1: Ways to make an impact, and why we are not there yet. Drug Discov. Today 2021, 26, 511–524. [Google Scholar] [CrossRef]
  81. United States Government Accountability Office. Artificial Intelligence in Health Care: Benefits and Challenges of Machine Learning in Drug Development; GAO-20-215SP; U.S. Government Accountability Office: Washington, DC, USA, 2020. Available online: https://www.gao.gov/products/gao-20-215sp (accessed on 21 December 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
