Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics

A special issue of Pharmaceuticals (ISSN 1424-8247). This special issue belongs to the section "Medicinal Chemistry".

Deadline for manuscript submissions: closed (20 August 2025) | Viewed by 10031

Special Issue Editor

Dr. Vartika Tomar

Guest Editor

School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
Interests: chemotherapy; tumor; nanomedicine; drug development and delivery; cystic fibrosis; gene therapy

Special Issue Information

Dear Colleagues,

Machine learning (ML) has rapidly become a critical tool in computer-aided drug discovery, offering a powerful alternative to traditional physical models such as quantum chemistry and molecular dynamics simulations. Unlike these explicit models, ML techniques rely on pattern recognition algorithms to uncover mathematical relationships between empirical data and predict the chemical, biological, and physical properties of novel compounds. ML’s efficiency and scalability make it particularly well-suited for handling large datasets, a significant advantage over computationally intensive physical models. In drug discovery, ML enhances our understanding of the relationships between chemical structures and their biological activities, integrating seamlessly with chemoinformatics and quantitative structure–activity relationship (QSAR) modeling to drive predictive molecular design and analysis. Recent advances in computational power and deep learning algorithms have further propelled ML, addressing previously unmet challenges in pharmaceutical research. The surge in chemical 'big data' from high-throughput screening (HTS) and combinatorial synthesis underscores ML's role in mining large compound databases and designing drugs with critical biological properties.

This Special Issue on “Machine Learning in Chemoinformatics and Medicinal Chemistry” invites original research and review articles that explore the latest ML and deep learning applications in computational drug design. Topics of interest include cheminformatics, QSAR, novel ML applications in drug development, molecular descriptors, molecular similarity, structure-based and ligand-based screening, homology modeling, molecular docking, and the stability of drug–receptor interactions. The issue aims to highlight the transformative impact of ML in advancing drug discovery and design.

Dr. Vartika Tomar
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Pharmaceuticals is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2900 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

small molecules
molecular descriptors
molecular similarity
structure-based and ligand-based screening
homology modeling
molecular docking
drug–receptor docking
biological activity

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)

Research

27 pages, 14027 KB

Open Access Article

Machine Learning and Integrative Structural Dynamics Identify Potent ALK Inhibitors from Natural Compound Libraries

by Rana Alateeq

Pharmaceuticals 2025, 18(8), 1178; https://doi.org/10.3390/ph18081178 - 10 Aug 2025

Viewed by 1016

Abstract

Background: Anaplastic lymphoma kinase (ALK) is a validated oncogenic driver in non-small cell lung cancer and other malignancies, making it a clinically relevant target for small-molecule inhibition. Methods: Here, we report a computational discovery pipeline integrating structure-based virtual screening, machine learning-guided prioritization, molecular dynamics simulations, and binding free energy analysis to identify potential ALK inhibitors from a natural product-derived subset of the ZINC20 database. We trained and benchmarked eleven machine learning models, including tree-based, kernel-based, linear, and neural architectures, on curated bioactivity datasets of ALK inhibitors to capture nuanced structure-activity relationships and prioritize candidates beyond conventional docking metrics. Results: Six compounds were shortlisted based on binding affinity, solubility, bioavailability, and synthetic accessibility. Molecular dynamics simulations over 100 ns revealed stable ligand engagement, with limited conformational fluctuations and consistent retention of the protein’s structural integrity. Key catalytic residues, including GLU105, MET107, and ASP178, displayed minimal fluctuation, while hydrogen bonding and residue interaction analyses confirmed persistent engagement across all ligand-bound complexes. Binding free energy estimates identified ZINC3870414 and ZINC8214398 as top-performing candidates, with ΔG_total values of –46.02 and –46.18 kcal/mol, respectively. Principal component and dynamic network analyses indicated that these compounds restrict conformational sampling and reorganize residue communication pathways, consistent with functional inhibition. Conclusions: These results highlight ZINC3870414 and ZINC8214398 as promising scaffolds for further optimization and support the utility of integrating machine learning with dynamic and network-based metrics in early-stage kinase inhibitor discovery. Full article

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)

18 pages, 1197 KB

Open Access Article

Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy

by Fatma Hilal Yagin, Yasin Gormez, Cemil Colak, Abdulmohsen Algarni, Fahaid Al-Hashem and Luca Paolo Ardigò

Pharmaceuticals 2025, 18(7), 975; https://doi.org/10.3390/ph18070975 - 28 Jun 2025

Viewed by 1394

Abstract

Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. While tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework, leveraging deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints, to accurately predict TKI bioactivity, ultimately accelerating the preclinical phase of drug development. Methods: A curated dataset of 28,314 small molecules from the ChEMBL database targeting 11 tyrosine kinases was analyzed. Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC–AUC, precision–recall curves, and log loss. Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC–AUC and precision–recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition. Conclusions: This work highlights the transformative role of machine learning, particularly dANNs and SVM, in rational drug discovery. By enabling accurate bioactivity prediction, our model pipeline can effectively reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology. By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly with regard to cancer therapy and kinase-associated disorders. Full article
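
As an illustration of two of the evaluation metrics named in this abstract, the following minimal sketch computes an F1-score and a log loss for a binary active/inactive classifier. The labels, probabilities, and the 0.5 decision threshold are invented for demonstration and are not taken from the study; the function names are hypothetical.

```python
import math


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (active) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels under predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# Invented example data: 1 = active, 0 = inactive.
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 threshold is an assumption

f1 = f1_score(y_true, y_pred)
ll = log_loss(y_true, y_prob)
```

F1 depends only on the thresholded labels, whereas log loss penalizes confident but wrong probabilities, which is why a model can lead on one metric and not the other.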

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)

23 pages, 8529 KB

Open Access Article

Machine Learning-Driven Consensus Modeling for Activity Ranking and Chemical Landscape Analysis of HIV-1 Inhibitors

by Danishuddin, Md Azizul Haque, Geet Madhukar, Qazi Mohammad Sajid Jamal, Jong-Joo Kim and Khurshid Ahmad

Pharmaceuticals 2025, 18(5), 714; https://doi.org/10.3390/ph18050714 - 13 May 2025

Cited by 2 | Viewed by 1688

Abstract

Background/Objective: This study aimed to develop a predictive model to classify and rank highly active compounds that inhibit HIV-1 integrase (IN). Methods: A total of 2271 potential HIV-1 inhibitors were selected from the ChEMBL database. The most relevant molecular descriptors were identified using a hybrid GA-SVM-RFE approach. Predictive models were built using Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Support Vector Machines (SVM), and Multi-Layer Perceptron (MLP). The models underwent a comprehensive evaluation employing calibration, Y-randomization, and Net Gain methodologies. Results: The four models demonstrated intense calibration, achieving an accuracy greater than 0.88 and an area under the curve (AUC) exceeding 0.90. Net Gain at a high probability threshold indicates that the models are both effective and highly selective, ensuring more reliable predictions with greater confidence. Additionally, we combine the predictions of multiple individual models by using majority voting to determine the final prediction for each compound. The Rank Score (weighted sum) serves as a confidence indicator for the consensus prediction, with the majority of highly active compounds identified through high scores in both the 2D descriptors and ECFP4-based models, highlighting the models’ effectiveness in predicting potent inhibitors. Furthermore, cluster analysis identified significant classes associated with vigorous biological activity. Conclusions: Some clusters were found to be enriched in highly potent compounds while maintaining moderate scaffold diversity, making them promising candidates for exploring unique chemical spaces and identifying novel lead compounds. Overall, this study provides valuable insights into predicting integrase binders, thereby enhancing the accuracy of predictive models. Full article
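
The consensus step described in this abstract (majority voting plus a weighted-sum Rank Score) might be sketched as follows. This is a minimal illustration only: the probabilities, the equal weights, the 0.5 threshold, the tie-handling rule, and the normalisation by total weight are all assumptions, since the abstract does not specify them.

```python
def majority_vote(labels):
    """Return 1 (active) if more than half of the models vote active, else 0.

    A tie is treated as inactive here; that choice is an assumption.
    """
    return 1 if 2 * sum(labels) > len(labels) else 0


def rank_score(probabilities, weights):
    """Weighted sum of per-model probabilities, normalised by the total weight."""
    return sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)


# Hypothetical outputs for one compound from four models (RF, XGBoost, SVM, MLP).
probabilities = [0.92, 0.85, 0.40, 0.77]
labels = [1 if p >= 0.5 else 0 for p in probabilities]
weights = [1.0, 1.0, 1.0, 1.0]  # equal weights are an assumption

consensus_label = majority_vote(labels)
consensus_score = rank_score(probabilities, weights)
```

Here three of four models vote active, so the consensus label is active, and the score indicates how strongly the models agree on that call.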

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)

15 pages, 4198 KB

Open Access Article

Chemical Space Exploration and Machine Learning-Based Screening of PDE7A Inhibitors

by Yuze Li, Zhe Wang, Shengyao Ma, Xiaowen Tang and Hanting Zhang

Pharmaceuticals 2025, 18(4), 444; https://doi.org/10.3390/ph18040444 - 21 Mar 2025

Cited by 1 | Viewed by 1096

Abstract

Background/Objectives: Phosphodiesterase 7 (PDE7), a member of the PDE superfamily, selectively catalyzes the hydrolysis of cyclic adenosine 3′,5′-monophosphate (cAMP), thereby regulating the intracellular levels of this second messenger and influencing various physiological functions and processes. There are two subtypes of PDE7, PDE7A and PDE7B, which are encoded by distinct genes. PDE7 inhibitors have been shown to exert therapeutic effects on neurological and respiratory diseases. However, FDA-approved drugs based on the PDE7A inhibitor are still absent, highlighting the need for novel compounds to advance PDE7A inhibitor development. Methods: To address this urgent and important issue, we conducted a comprehensive cheminformatics analysis of compounds with potential for PDE7A inhibition using a curated database to elucidate the chemical characteristics of the highly active PDE7A inhibitors. The specific substructures that significantly enhance the activity of PDE7A inhibitors, including benzenesulfonamido, acylamino, and phenoxyl, were identified by an interpretable machine learning analysis. Subsequently, a machine learning model employing the Random Forest–Morgan pattern was constructed for the qualitative and quantitative prediction of PDE7A inhibitors. Results: As a result, six compounds with potential PDE7A inhibitory activity were screened out from the SPECS compound library. These identified compounds exhibited favorable molecular properties and potent binding affinities with the target protein, holding promise as candidates for further exploration in the development of potent PDE7A inhibitors. Conclusions: The results of the present study would advance the exploration of innovative PDE7A inhibitors and provide valuable insights for future endeavors in the discovery of novel PDE inhibitors. Full article

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)

23 pages, 1344 KB

Open Access Article

In Silico Approach for Antibacterial Discovery: PTML Modeling of Virtual Multi-Strain Inhibitors Against Staphylococcus aureus

by Valeria V. Kleandrova, M. Natália D. S. Cordeiro and Alejandro Speck-Planche

Pharmaceuticals 2025, 18(2), 196; https://doi.org/10.3390/ph18020196 - 31 Jan 2025

Cited by 8 | Viewed by 1669

Abstract

Background/Objectives: Infectious diseases caused by Staphylococcus aureus (S. aureus) have become alarming health issues worldwide due to the ever-increasing emergence of multidrug resistance. In silico approaches can accelerate the identification and/or design of versatile antibacterial chemicals with the ability to target multiple S. aureus strains with varying degrees of drug resistance. Here, we develop a perturbation theory machine learning model based on a multilayer perceptron neural network (PTML-MLP) for the prediction and design of versatile virtual inhibitors against S. aureus strains. Methods: To develop the PTML-MLP model, chemical and biological data associated with antibacterial activity against S. aureus strains were retrieved from the ChEMBL database. We applied the Box–Jenkins approach to convert the topological indices into multi-label graph-theoretical indices; the latter were used as inputs for the creation of the PTML-MLP model. Results: The PTML-MLP model exhibited accuracy higher than 80% in both training and test sets. The physicochemical and structural interpretation of the PTML-MLP model was performed through the fragment-based topological design (FBTD) approach. Such interpretations permitted the analysis of different molecular fragments with favorable contributions to the multi-strain antibacterial activity and the design of four new drug-like molecules using different fragments as building blocks. The designed molecules were predicted/confirmed by our PTML model as multi-strain inhibitors of diverse S. aureus strains, thus representing promising chemotypes to be considered for future synthesis and biological testing of versatile anti-S. aureus agents. Conclusions: This work envisages promising applications of PTML modeling for early antibacterial drug discovery and related antimicrobial research areas. Full article
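
The abstract does not give the formula used for the Box–Jenkins step. One form commonly described for PTML models is a deviation of each descriptor from the mean descriptor value of all compounds tested under the same experimental condition (for example, the same strain). The sketch below assumes that form; the function name, the condition labels, and the descriptor values are hypothetical.

```python
from collections import defaultdict


def box_jenkins_deviation(records):
    """Map each (condition, descriptor) pair to descriptor minus the condition mean.

    records: list of (condition_label, descriptor_value) tuples.
    Returns a list of deviations in the same order as the input.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for condition, value in records:
        sums[condition] += value
        counts[condition] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [value - means[condition] for condition, value in records]


# Hypothetical topological index values measured under two strain conditions.
records = [("strain_A", 2.0), ("strain_A", 4.0), ("strain_B", 10.0)]
deviations = box_jenkins_deviation(records)
```

Under this assumption, the same molecule receives a different descriptor value for each condition, which is what lets a single model cover several strains at once.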

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)

11 pages, 2439 KB

Open Access Article

AISMPred: A Machine Learning Approach for Predicting Anti-Inflammatory Small Molecules

by Subathra Selvam, Priya Dharshini Balaji, Honglae Sohn and Thirumurthy Madhavan

Pharmaceuticals 2024, 17(12), 1693; https://doi.org/10.3390/ph17121693 - 15 Dec 2024

Cited by 4 | Viewed by 2249

Abstract

Background/Objectives: Inflammation serves as a vital response to diverse harmful stimuli like infections, toxins, or tissue injuries, aiding in the elimination of pathogens and tissue repair. However, persistent inflammation can lead to chronic diseases. Peptide therapeutics have gained attention for their specificity in targeting cells, yet their development remains costly and time-consuming. Therefore, small molecules, with their stability, low immunogenicity, and oral bioavailability, have become a focal point for predicting anti-inflammatory small molecules (AISMs). Methods: In this study, we introduce a computational method called AISMPred, designed to classify AISMs and non-AISMs. To develop this approach, we constructed a dataset comprising 1750 AISMs and non-AISMs, each annotated with IC50 values sourced from the PubChem BioAssay database. We computed two distinct types of molecular descriptors using PaDEL and Mordred tools. Subsequently, these descriptors were concatenated to form a hybrid feature set. The SVC-L1 regularization method was implemented for the optimum feature selection to develop robust Machine learning (ML) models. Five different conventional ML classifiers were employed, such as RF, ET, KNN, LR, and Ensemble methods. Results: A total of 15 ML models were developed using 2D, FP, and Hybrid feature sets, with the ET model with hybrid features achieving the highest accuracy of 92% and an AUC of 0.97 on the independent test dataset. Conclusions: This study provides an effective method for screening AISMs, potentially impacting drug discovery and design. Full article

(This article belongs to the Special Issue Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics)
