Integrating Machine Learning (ML) into Medicinal Chemistry and Cheminformatics

A special issue of Pharmaceuticals (ISSN 1424-8247). This special issue belongs to the section "Medicinal Chemistry".

Deadline for manuscript submissions: 20 August 2025 | Viewed by 3440

Special Issue Editor


Dr. Vartika Tomar
Guest Editor
School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
Interests: chemotherapy; tumor; nanomedicine; drug development and delivery; cystic fibrosis; gene therapy

Special Issue Information

Dear Colleagues,

Machine learning (ML) has rapidly become a critical tool in computer-aided drug discovery, offering a powerful alternative to traditional physical models such as quantum chemistry and molecular dynamics simulations. Unlike these explicit models, ML techniques rely on pattern recognition algorithms to uncover mathematical relationships between empirical data and predict the chemical, biological, and physical properties of novel compounds. ML’s efficiency and scalability make it particularly well-suited for handling large datasets, a significant advantage over computationally intensive physical models. In drug discovery, ML enhances our understanding of the relationships between chemical structures and their biological activities, integrating seamlessly with chemoinformatics and quantitative structure–activity relationship (QSAR) modeling to drive predictive molecular design and analysis. Recent advances in computational power and deep learning algorithms have further propelled ML, addressing previously unmet challenges in pharmaceutical research. The surge in chemical 'big data' from high-throughput screening (HTS) and combinatorial synthesis underscores ML's role in mining large compound databases and designing drugs with critical biological properties.

This Special Issue on “Machine Learning in Chemoinformatics and Medicinal Chemistry” invites original research and review articles that explore the latest ML and deep learning applications in computational drug design. Topics of interest include cheminformatics, QSAR, novel ML applications in drug development, molecular descriptors, molecular similarity, structure-based and ligand-based screening, homology modeling, molecular docking, and the stability of drug–receptor interactions. The issue aims to highlight the transformative impact of ML in advancing drug discovery and design.

Dr. Vartika Tomar
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Pharmaceuticals is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2900 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • small molecules
  • molecular descriptors
  • molecular similarity
  • structure-based and ligand-based screening
  • homology modeling
  • molecular docking
  • drug–receptor docking
  • biological activity

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information is available on MDPI's Special Issue policies page.

Published Papers (4 papers)

Research

23 pages, 8529 KiB  
Article
Machine Learning-Driven Consensus Modeling for Activity Ranking and Chemical Landscape Analysis of HIV-1 Inhibitors
by Danishuddin, Md Azizul Haque, Geet Madhukar, Qazi Mohammad Sajid Jamal, Jong-Joo Kim and Khurshid Ahmad
Pharmaceuticals 2025, 18(5), 714; https://doi.org/10.3390/ph18050714 - 13 May 2025
Viewed by 288
Abstract
Background/Objective: This study aimed to develop a predictive model to classify and rank highly active compounds that inhibit HIV-1 integrase (IN). Methods: A total of 2271 potential HIV-1 inhibitors were selected from the ChEMBL database. The most relevant molecular descriptors were identified using a hybrid GA-SVM-RFE approach. Predictive models were built using Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Support Vector Machines (SVM), and Multi-Layer Perceptron (MLP). The models underwent a comprehensive evaluation employing calibration, Y-randomization, and Net Gain methodologies. Results: The four models demonstrated intense calibration, achieving an accuracy greater than 0.88 and an area under the curve (AUC) exceeding 0.90. Net Gain at a high probability threshold indicates that the models are both effective and highly selective, ensuring more reliable predictions with greater confidence. Additionally, we combine the predictions of multiple individual models by using majority voting to determine the final prediction for each compound. The Rank Score (weighted sum) serves as a confidence indicator for the consensus prediction, with the majority of highly active compounds identified through high scores in both the 2D descriptors and ECFP4-based models, highlighting the models’ effectiveness in predicting potent inhibitors. Furthermore, cluster analysis identified significant classes associated with vigorous biological activity. Conclusions: Some clusters were found to be enriched in highly potent compounds while maintaining moderate scaffold diversity, making them promising candidates for exploring unique chemical spaces and identifying novel lead compounds. Overall, this study provides valuable insights into predicting integrase binders, thereby enhancing the accuracy of predictive models.
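
The consensus step described in this abstract (majority voting plus a weighted-sum Rank Score) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `consensus_predict`, the 0.5 vote threshold, the equal default weights, the tie-breaking rule, and the normalization of the weighted sum are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def consensus_predict(
    probabilities: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
    threshold: float = 0.5,
) -> Tuple[int, float]:
    """Combine per-model probabilities for one compound.

    probabilities maps a model name (e.g. "RF") to that model's predicted
    probability that the compound is highly active.
    Returns (consensus_label, rank_score).
    """
    if not probabilities:
        raise ValueError("at least one model prediction is required")
    if weights is None:
        # Assumption: every model contributes equally unless told otherwise.
        weights = {name: 1.0 for name in probabilities}

    # Majority voting: each model votes "active" if its probability
    # reaches the threshold. A tie counts as inactive (an assumption).
    votes = sum(1 for p in probabilities.values() if p >= threshold)
    label = 1 if votes > len(probabilities) / 2 else 0

    # Rank score: weighted sum of probabilities, normalized by the total
    # weight so that it stays in [0, 1] (normalization is an assumption).
    total_weight = sum(weights[name] for name in probabilities)
    rank_score = sum(weights[name] * p for name, p in probabilities.items()) / total_weight
    return label, rank_score
```

With four models, a compound scored 0.9, 0.8, 0.4, and 0.7 would receive three of four votes, so it would be labeled active, and the equal-weight rank score would be the mean of the four probabilities.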

15 pages, 4198 KiB  
Article
Chemical Space Exploration and Machine Learning-Based Screening of PDE7A Inhibitors
by Yuze Li, Zhe Wang, Shengyao Ma, Xiaowen Tang and Hanting Zhang
Pharmaceuticals 2025, 18(4), 444; https://doi.org/10.3390/ph18040444 - 21 Mar 2025
Viewed by 373
Abstract
Background/Objectives: Phosphodiesterase 7 (PDE7), a member of the PDE superfamily, selectively catalyzes the hydrolysis of cyclic adenosine 3′,5′-monophosphate (cAMP), thereby regulating the intracellular levels of this second messenger and influencing various physiological functions and processes. There are two subtypes of PDE7, PDE7A and PDE7B, which are encoded by distinct genes. PDE7 inhibitors have been shown to exert therapeutic effects on neurological and respiratory diseases. However, FDA-approved drugs based on the PDE7A inhibitor are still absent, highlighting the need for novel compounds to advance PDE7A inhibitor development. Methods: To address this urgent and important issue, we conducted a comprehensive cheminformatics analysis of compounds with potential for PDE7A inhibition using a curated database to elucidate the chemical characteristics of the highly active PDE7A inhibitors. The specific substructures that significantly enhance the activity of PDE7A inhibitors, including benzenesulfonamido, acylamino, and phenoxyl, were identified by an interpretable machine learning analysis. Subsequently, a machine learning model employing the Random Forest–Morgan pattern was constructed for the qualitative and quantitative prediction of PDE7A inhibitors. Results: As a result, six compounds with potential PDE7A inhibitory activity were screened out from the SPECS compound library. These identified compounds exhibited favorable molecular properties and potent binding affinities with the target protein, holding promise as candidates for further exploration in the development of potent PDE7A inhibitors. Conclusions: The results of the present study would advance the exploration of innovative PDE7A inhibitors and provide valuable insights for future endeavors in the discovery of novel PDE inhibitors.

23 pages, 1344 KiB  
Article
In Silico Approach for Antibacterial Discovery: PTML Modeling of Virtual Multi-Strain Inhibitors Against Staphylococcus aureus
by Valeria V. Kleandrova, M. Natália D. S. Cordeiro and Alejandro Speck-Planche
Pharmaceuticals 2025, 18(2), 196; https://doi.org/10.3390/ph18020196 - 31 Jan 2025
Cited by 2 | Viewed by 963
Abstract
Background/Objectives: Infectious diseases caused by Staphylococcus aureus (S. aureus) have become alarming health issues worldwide due to the ever-increasing emergence of multidrug resistance. In silico approaches can accelerate the identification and/or design of versatile antibacterial chemicals with the ability to target multiple S. aureus strains with varying degrees of drug resistance. Here, we develop a perturbation theory machine learning model based on a multilayer perceptron neural network (PTML-MLP) for the prediction and design of versatile virtual inhibitors against S. aureus strains. Methods: To develop the PTML-MLP model, chemical and biological data associated with antibacterial activity against S. aureus strains were retrieved from the ChEMBL database. We applied the Box–Jenkins approach to convert the topological indices into multi-label graph-theoretical indices; the latter were used as inputs for the creation of the PTML-MLP model. Results: The PTML-MLP model exhibited accuracy higher than 80% in both training and test sets. The physicochemical and structural interpretation of the PTML-MLP model was performed through the fragment-based topological design (FBTD) approach. Such interpretations permitted the analysis of different molecular fragments with favorable contributions to the multi-strain antibacterial activity and the design of four new drug-like molecules using different fragments as building blocks. The designed molecules were predicted/confirmed by our PTML model as multi-strain inhibitors of diverse S. aureus strains, thus representing promising chemotypes to be considered for future synthesis and biological testing of versatile anti-S. aureus agents. Conclusions: This work envisages promising applications of PTML modeling for early antibacterial drug discovery and related antimicrobial research areas.
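
The Box–Jenkins step mentioned in this abstract is often described as a moving-average deviation: each descriptor value is re-expressed relative to the mean of that descriptor over all cases sharing the same experimental condition (for example, the same strain). The sketch below illustrates that idea only. The function name `moving_average_deviations`, the choice to average over every case in a condition, and the single-condition setup are assumptions; the paper's exact formulation may differ.

```python
from collections import defaultdict
from typing import Hashable, List, Tuple


def moving_average_deviations(
    records: List[Tuple[float, Hashable]]
) -> List[float]:
    """Convert raw descriptor values into condition-aware deviations.

    Each record is (descriptor_value, condition_label). The result holds,
    for every record in the same order, the descriptor value minus the mean
    descriptor value of all records with the same condition label.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, condition in records:
        sums[condition] += value
        counts[condition] += 1

    # Deviation of each case from the average of its own condition group.
    return [value - sums[condition] / counts[condition] for value, condition in records]
```

For instance, two cases with descriptor values 2.0 and 4.0 measured against the same hypothetical strain label would become -1.0 and 1.0, so the same molecule can yield different inputs depending on the strain it was tested against.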

11 pages, 2439 KiB  
Article
AISMPred: A Machine Learning Approach for Predicting Anti-Inflammatory Small Molecules
by Subathra Selvam, Priya Dharshini Balaji, Honglae Sohn and Thirumurthy Madhavan
Pharmaceuticals 2024, 17(12), 1693; https://doi.org/10.3390/ph17121693 - 15 Dec 2024
Viewed by 1366
Abstract
Background/Objectives: Inflammation serves as a vital response to diverse harmful stimuli like infections, toxins, or tissue injuries, aiding in the elimination of pathogens and tissue repair. However, persistent inflammation can lead to chronic diseases. Peptide therapeutics have gained attention for their specificity in targeting cells, yet their development remains costly and time-consuming. Therefore, small molecules, with their stability, low immunogenicity, and oral bioavailability, have become a focal point for predicting anti-inflammatory small molecules (AISMs). Methods: In this study, we introduce a computational method called AISMPred, designed to classify AISMs and non-AISMs. To develop this approach, we constructed a dataset comprising 1750 AISMs and non-AISMs, each annotated with IC50 values sourced from the PubChem BioAssay database. We computed two distinct types of molecular descriptors using PaDEL and Mordred tools. Subsequently, these descriptors were concatenated to form a hybrid feature set. The SVC-L1 regularization method was implemented for the optimum feature selection to develop robust Machine learning (ML) models. Five different conventional ML classifiers were employed, such as RF, ET, KNN, LR, and Ensemble methods. Results: A total of 15 ML models were developed using 2D, FP, and Hybrid feature sets, with the ET model with hybrid features achieving the highest accuracy of 92% and an AUC of 0.97 on the independent test dataset. Conclusions: This study provides an effective method for screening AISMs, potentially impacting drug discovery and design.
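
The two headline metrics reported in this abstract, accuracy and AUC, can be computed from a classifier's outputs on a held-out test set as sketched below. This is a generic, dependency-free illustration; the function names `accuracy` and `roc_auc` are hypothetical, and the AUC here uses the pairwise-ranking definition rather than whatever tooling the authors used.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds whose predicted class matches the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0),
    counting ties as one half.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise form is quadratic in the number of compounds, which is acceptable for a test set of a few hundred molecules; a sort-based implementation would be preferable for larger sets.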