Article

Machine Learning Integration of In-Silico QSAR, Graph Neural Networks and Docking Reveal Natural Products Inhibitors Against Mycobacterium tuberculosis

by Sakthidhasan Periasamy 1,2,*, Rajesh Ramasamy 3, Rajasekar Chinnaiyan 4 and Arun Sridhar 5,*

1 Department of Nanotechnology, Noorul Islam Centre for Higher Education, Kumaracoil, Nagercoil 629180, Tamil Nadu, India
2 Department of Botany, Bharathidasan University, Tiruchirappalli 620024, Tamil Nadu, India
3 Department of Botany, PTMTMC College, Kottaimedu 623604, Tamil Nadu, India
4 Department of Botany, Alagappa University, Karaikudi 630003, Tamil Nadu, India
5 Immunology-Vaccinology, Department of Infectious and Parasitic Diseases, Fundamental and Applied Research for Animals & Health (FARAH), Faculty of Veterinary Medicine, University of Liège, B-4000 Liège, Belgium
* Authors to whom correspondence should be addressed.
Sci. Pharm. 2026, 94(2), 39; https://doi.org/10.3390/scipharm94020039
Submission received: 25 February 2026 / Revised: 4 May 2026 / Accepted: 9 May 2026 / Published: 14 May 2026

Abstract

Background/Objectives: Tuberculosis (TB), caused by Mycobacterium tuberculosis, remains a major global health challenge, exacerbated by the emergence of multidrug-resistant strains and limited efficacy of existing therapies. Given the involvement of multiple essential mycobacterial proteins, multitarget drug discovery represents a rational therapeutic strategy. Methods: In this study, an integrated in silico pipeline combining machine learning–based quantitative structure–activity relationship modeling, graph neural network–driven drug–target affinity prediction, molecular docking, molecular dynamics (MD) simulations, and pharmacokinetic–toxicity profiling was employed to identify potential antitubercular leads from natural products. Results: A curated library of over 0.69 million compounds from the COCONUT database was systematically screened against seven essential M. tuberculosis protein targets. Machine learning and heterogeneous graph neural network models effectively captured complex ligand–protein interaction patterns, enabling high-confidence multitarget prioritization. Structure-based docking and MM-GBSA analyses revealed favorable binding affinities, further supported by 100 ns Molecular Dynamics simulations demonstrating stable binding and conformational integrity. In silico ADMET and toxicity predictions identified pharmacokinetically balanced candidates, while density functional theory calculations corroborated favorable electronic properties. Conclusions: Notably, a myricetin-based flavonoid glycoside exhibited consistent multitarget binding and dynamic stability across all targets. Overall, this study underscores the potential of integrated artificial intelligence and structure-based approaches in accelerating natural product-based antitubercular drug discovery and supports further experimental validation of prioritized leads.

1. Introduction

Tuberculosis (TB), caused by Mycobacterium tuberculosis, remains one of the world’s leading causes of death from a single infectious agent, even though it is preventable and treatable. About a quarter of the global population carries latent TB infection; only a limited number of carriers develop active disease, but the probability increases with factors such as immune deficiency, malnutrition, diabetes, or tobacco use. In 2023, TB caused 1.25 million deaths and 10.8 million new cases were reported, of which drug-resistant cases neared 4 million [1]. TB is more prevalent in low- and middle-income countries, with 50% of cases distributed among Bangladesh, China, India, Indonesia, Nigeria, Pakistan, the Philippines, and South Africa. M. tuberculosis persists in host cells by blocking phagosome–lysosome fusion and thereby escaping immune defenses [2]. Anti-TB chemotherapy began with streptomycin (1943) and para-aminosalicylic acid (1946), followed by isoniazid, pyrazinamide (1952), ethambutol, and rifampicin (1966), whose combination became the foundation of present-day TB chemotherapy [3]. Resistance to rifampicin is considered a marker of emerging multidrug-resistant (MDR)-TB, and patients with rifampicin mono-resistant TB frequently progress to MDR-TB [4]. The fight against TB remains difficult because of the growing number of MDR-TB and extensively drug-resistant (XDR)-TB cases, which require long and expensive treatments; patients often discontinue treatment, leading to worsening health outcomes [5]. The continued emergence of drug-resistant strains drives the search for novel antitubercular drugs.
M. tuberculosis H37Rv, the most widely used reference strain, has a GC-rich genome of approximately 4.41 Mb encoding nearly 4000 genes, with a strong emphasis on lipid metabolism and unique glycine-rich proteins that likely contribute to antigenic variation and pathogenicity [6,7]. M. tuberculosis proteins have been classified into functional categories, predominantly cell wall and cell processes, intermediary metabolism, and conserved hypothetical proteins. PATRIC (Pathosystems Resource Integration Center) analysis identified 10 essential proteins, including five membrane proteins (phosphatidylserine synthase, siderophore exporter MmpL5, GatB, EccCa1, and Rv1634) and five secreted proteins (Nrp, SirA, EspA, TopA, and Rv3103c). Among these, MmpL5 and Rv1634 are linked to antibiotic resistance, while phosphatidylserine synthase (PssA) represents a potential drug target [8]. Beyond these, β-ketoacyl-ACP synthase, InhA (the NADH-dependent enoyl-acyl carrier protein reductase), and KasA (3-oxoacyl-ACP synthase) have been demonstrated as key targets of isoniazid, playing crucial roles in mycolic acid biosynthesis. Additionally, DNA-dependent RNA polymerase and ATP synthase are important targets involved in transcription and energy metabolism, respectively [9,10,11,12].
Natural products (NPs) have long been a major source of drugs for treating infectious diseases, mainly because of their remarkable scaffold diversity and evolutionary optimization for biological function. Their traditional use continues to demonstrate efficacy and safety; however, difficulties in screening, large-scale production, and material acquisition have reduced natural product-based drug discovery despite some successes [13]. Medicinal and aromatic plants have served for centuries as the basis of traditional remedies, with knowledge of them transmitted through pharmacopeias, and they remain a main source of new bioactive scaffolds. Their largely unexplored biodiversity is a vast reservoir of potential new remedies. Bioactive compounds can be extracted from these plants using modern technologies for potential pharmaceutical applications. During discovery and development, extensive and continuous collaboration across disciplines is essential [14,15,16,17].
Plant-derived natural compounds are chemically categorized into several major classes according to their molecular structures, including alkaloids, terpenoids, glucosinolates, saponins, and phenolic compounds (such as lignans, neolignans, coumarins, flavonoids, quinones, and tannins) [18,19]. Flavonols are major secondary metabolites with a broad spectrum of antibacterial activity against both Gram-positive and Gram-negative bacteria. They act through DNA gyrase inhibition, membrane potential disruption, and motility suppression. Among them, quercetin, kaempferol, and myricetin are the main compounds, with quercetin the most potent, showing the strongest effect against Escherichia coli and Staphylococcus aureus. Rhamnetin and morin are also effective against Chlamydia pneumoniae, and the hydrophobic methoxy group of rhamnetin is thought to account for its better activity. In addition, modified and glycosylated derivatives of quercetin and kaempferol represent a further reservoir of antibacterial compounds [20,21,22].
In silico drug design enables rapid screening and optimization of large chemical libraries at substantially lower cost and in less time than experimental assays, thereby accelerating hit discovery and lead optimization [23]. COCONUT is a large open repository of more than 600,000 curated natural product structures with organism and provenance information and search tools, while NPASS links NPs to quantitative bioactivity and species data for target-based screening and prioritization. These NP databases support virtual screening, knowledge-graph target mapping, and AI-driven generation and ADMET filtering, accelerating hit finding from natural chemical space [24,25]. Recent advances in computational power, machine learning, and high-throughput molecular modeling have further enhanced prediction accuracy and broadened applicability. The present study reports the in silico screening of NPs against essential M. tuberculosis proteins. The research design combines AI/ML-based QSAR (quantitative structure–activity relationship) modeling, virtual screening, molecular docking, molecular dynamics (MD) simulations, and ADMET profiling to thoroughly evaluate binding affinity, electronic features, complex stability, and toxicity risk. Using these computational workflows, the study aims to identify the most promising natural product candidates for antitubercular drug design.

2. Materials and Methods

2.1. Dataset Collection and Preprocessing

The phytochemical structures analyzed in this work were retrieved from the COCONUT natural product database (COCONUT: the COlleCtion of Open NatUral producTs; https://coconut.naturalproducts.net (accessed 25 August 2025)). Compounds were downloaded in Simplified Molecular Input Line Entry System (SMILES) format. Stereochemistry plays a crucial role in determining the biological activity and binding affinity of compounds, as different stereoisomers can exhibit significant variations in pharmacological effects; ethambutol, an antitubercular agent whose activity resides in a single stereoisomer, is a well-known example. Nevertheless, to reduce complexity and redundancy, stereochemical annotations were omitted, while acknowledging their potential influence on ligand–target interactions [26,27,28]. Canonical SMILES were generated using RDKit (v2020.09.1.0), and duplicates or invalid SMILES were filtered. The ChEMBL structure curation pipeline was applied to standardize chemical structures, remove salts, solvents, and isotopes, and generate parent molecules [29].

2.2. QSAR Analysis and Fingerprint Generation

QSAR modeling was conducted using the publicly available repository ML_QSAR_model (https://github.com/shinoxide/ML_QSAR_model (accessed 18 July 2025)). Molecular descriptors and fingerprints were computed with PaDEL-Descriptor (v0.1.16). The workflow implemented Random Forest (RF) and Support Vector Machine (SVM) models, with hyperparameter optimization performed using grid search. Data were split into training (80%) and test (20%) sets using stratified sampling. Model performance was evaluated by internal tenfold cross-validation and external validation on the test set. Evaluation metrics included accuracy, balanced accuracy, F1-score, Matthews correlation coefficient (MCC), and ROC-AUC. Y-randomization tests were applied to assess chance correlation, and the applicability domain (AD) was determined by Euclidean distance thresholds [30].
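The applicability-domain check mentioned above can be illustrated with a minimal sketch. The text states only that Euclidean distance thresholds were used; the specific rule here (distance to the training centroid, with a cutoff of mean + k × standard deviation) and the function names are assumptions made for illustration, not the repository's implementation.

```python
import math

def centroid(rows):
    """Column-wise mean of equal-length descriptor vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def ad_threshold(train, k=3.0):
    """Assumed rule: cutoff = mean + k * std of training distances to the centroid."""
    c = centroid(train)
    dists = [math.dist(x, c) for x in train]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return c, mean + k * std

def in_domain(x, c, cutoff):
    """True when a query vector lies within the distance cutoff."""
    return math.dist(x, c) <= cutoff

# Hypothetical two-descriptor training set
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
c, cutoff = ad_threshold(train)
near = in_domain([0.5, 0.5], c, cutoff)   # query at the centroid
far = in_domain([5.0, 5.0], c, cutoff)    # query far from the training data
```

Predictions for compounds flagged as outside the domain would then be treated as less reliable.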
Available experimental binding and inhibitory data for natural products against selected protein targets were retrieved from PubChem bioassays. IC50 values were converted to pIC50 (–log10 IC50) to normalize the distribution and enable regression analysis. For classification tasks, a threshold pIC50 was applied to divide compounds into “active” and “inactive” categories.
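As a worked illustration of the conversion described above, the sketch below turns an IC50 into pIC50 and applies a class threshold. It assumes IC50 values are given in nanomolar, so that pIC50 = 9 − log10(IC50 in nM); the threshold of 6.0 is a hypothetical placeholder, since the text does not state the value used.

```python
import math

def pic50_from_nM(ic50_nM):
    """pIC50 = -log10(IC50 in mol/L); for nanomolar input this is 9 - log10(IC50_nM)."""
    if ic50_nM <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nM)

def label(pic50, threshold=6.0):
    """Binary class label; the threshold is an assumed placeholder."""
    return "active" if pic50 >= threshold else "inactive"

one_micromolar = pic50_from_nM(1000.0)   # 1 µM corresponds to pIC50 of about 6
one_nanomolar = pic50_from_nM(1.0)       # 1 nM corresponds to pIC50 of 9
```

Because the scale is logarithmic, a tenfold drop in IC50 raises pIC50 by exactly one unit.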

2.3. Virtual Screening

Graph-Based Deep Learning for Binding Affinity Prediction

Binding affinity was predicted with a hybrid machine learning setup that combines descriptor-based QSAR models with a neighborhood-aware heterogeneous graph neural network (NHGNN-DTA) architecture (https://github.com/hehh77/NHGNN-DTA (accessed 19 July 2025)), adapted from a recently published method for drug–target affinity prediction [31]. In the QSAR workflow, molecular descriptors and ECFP4 fingerprints were generated using PaDEL-Descriptor and RDKit, respectively. RF and SVM models, implemented with scikit-learn, were trained on a range of curated bioassay datasets. Model performance was assessed by tenfold cross-validation and on external test sets using balanced accuracy, F1-score, Matthews correlation coefficient (MCC), and ROC-AUC.
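For readers less familiar with the classification metrics listed above, the following sketch computes balanced accuracy, F1-score, and MCC from binary labels using their standard definitions. The labels are hypothetical toy data, and the study itself used scikit-learn rather than this hand-written version.

```python
import math

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"balanced_accuracy": (sens + spec) / 2, "f1": f1, "mcc": mcc}

# Hypothetical labels: 4 actives and 4 inactives, one error in each class
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = metrics(y_true, y_pred)
```

MCC is included alongside accuracy-type metrics because it stays informative when the active and inactive classes are imbalanced.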
In addition to the QSAR models, we used the NHGNN-DTA framework, which represents drug molecules and protein targets as a heterogeneous graph whose nodes represent atoms, amino acids, and neighborhood interaction features. Molecular graphs were built from SMILES strings with RDKit, with atoms as nodes carrying physicochemical features and bonds as edges. Protein graphs were derived from structural contact maps, with residues as nodes and edges connecting spatially adjacent residues or their closest structural neighbors. The NHGNN component aggregates information not only from direct neighbors but also from different node types and multi-hop connections, capturing deeper structural and functional features of ligand–protein interactions. The NHGNN-DTA model was implemented in PyTorch Geometric (2.4) and trained for 500 epochs with the Adam optimizer, using mean squared error (MSE) as the loss function. Performance was evaluated using root mean square error (RMSE), concordance index (CI), and Pearson and Spearman correlation coefficients, the metrics commonly used for benchmarking drug–target affinity prediction. The predicted affinity scores from the QSAR models and NHGNN-DTA were plotted, and the compounds with the highest predicted affinities common to both approaches were selected. These top-scoring COCONUT-derived natural products were then carried forward to structure-based docking and molecular dynamics validation.
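Two of the regression metrics named above, RMSE and the concordance index, can be sketched in a few lines. The CI shown uses the common pairwise definition (fraction of comparable pairs ranked in the correct order, with prediction ties counted as 0.5); the affinity values are hypothetical, and this is an illustration rather than the NHGNN-DTA evaluation code.

```python
import math
from itertools import combinations

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def concordance_index(y_true, y_pred):
    """Fraction of pairs with different true values that are ranked correctly."""
    num, den = 0.0, 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # pairs with equal true values are not comparable
        den += 1
        dt = y_true[i] - y_true[j]
        dp = y_pred[i] - y_pred[j]
        if dp == 0:
            num += 0.5
        elif (dt > 0) == (dp > 0):
            num += 1.0
    if den == 0:
        raise ValueError("no comparable pairs")
    return num / den

# Hypothetical measured and predicted affinities
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.2, 5.9, 7.5, 7.1]
ci = concordance_index(y_true, y_pred)   # one of six pairs is mis-ranked
err = rmse(y_true, y_pred)
```

CI depends only on rank order, so it complements RMSE when the goal is prioritizing compounds rather than predicting absolute affinities.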

2.4. Protein and Ligand Preparation

M. tuberculosis proteins were initially selected from the Protein Data Bank (PDB) based on their reported involvement in essential biological processes, including cell wall biosynthesis, transcriptional regulation, iron acquisition, and energy metabolism. Based on these criteria, seven proteins were selected for detailed investigation: 2KGW (OmpATb domain), 3G1M (EthR), 3LOG (MbtI), 5JZX (MurB), 5LD8 and 5W07 (KasA/InhA-related enzymes), and 8J57 (ATP synthase Fo complex). The selected proteins comprised diverse functional classes such as enoyl-ACP reductase, KasA, MurB, MbtI, EthR, ATP synthase, and membrane-associated proteins. These targets were chosen due to their essential roles in bacterial survival and pathogenicity. Contact maps were generated from Cα-atom distances to capture spatial residue relationships. For molecular docking studies, all ligands were preprocessed using the LigPrep module, where geometry optimization and protonation were carried out at physiological conditions (pH 7 ± 2) employing Epik in conjunction with the OPLS4 force field (Schrödinger Maestro Release 2025-1; Epik, Schrödinger, LLC, New York, NY, USA). Protein structures were prepared with the Protein Preparation Wizard by eliminating crystallographic water molecules and cofactors, adding missing hydrogen atoms, correcting bond orders, and optimizing hydrogen-bonding networks at pH 7.4 ± 2, followed by restrained energy minimization using OPLS4. The stereochemical integrity of the prepared proteins was assessed through Ramachandran plot analysis, which confirmed that more than 95% of residues occupied favored conformational regions.
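The contact-map step can be illustrated as follows: a binary matrix marks residue pairs whose Cα atoms lie within a distance cutoff. The 8 Å cutoff and the toy coordinates are assumptions for illustration; the cutoff used in the study is not stated.

```python
import math

def contact_map(ca_coords, cutoff=8.0):
    """Binary residue-residue contact matrix from C-alpha coordinates (in Å).

    The 8 Å default is an assumed, commonly used value.
    """
    n = len(ca_coords)
    return [
        [1 if math.dist(ca_coords[i], ca_coords[j]) <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical C-alpha positions for four residues along a line
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(coords)
```

Each nonzero off-diagonal entry would become an edge between two residue nodes in the protein graph.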

2.5. Molecular Docking and Binding Free Energy Calculation

Structures of the selected targets were obtained from the RCSB Protein Data Bank and prepared with the Protein Preparation Wizard in Schrödinger Maestro (Release 2025-1; Schrödinger, LLC, New York, NY, USA). Preparation included removal of crystallographic water molecules and cofactors, correction of mislabeled atoms, addition of hydrogen atoms, assignment of bond orders, optimization of hydrogen-bonding networks, and energy minimization with the OPLS4 force field. Prime was used to rebuild missing loops and side chains so that the structures were complete and energetically and geometrically stable. Ligands were prepared with LigPrep, which applied the OPLS4 force field, optimized geometries, and assigned hydrogen atoms; Epik then predicted protonation states at physiological pH (7 ± 2). Receptor grids were set up with the Glide Receptor Grid Generation tool, and candidate binding pockets were identified with SiteMap, which places probes on a 3D grid over the protein surface. SiteMap derives a SiteScore from the properties of the putative binding site, including hydrogen bond donor/acceptor capacity, solvent-exposed surface, hydrophobic/hydrophilic balance, and pocket volume. The pockets with high SiteScores and residues in hotspot areas were selected as binding sites for molecular docking [32].
Molecular docking was performed with the Glide module in Extra Precision (XP) mode [33], which retains ligand flexibility for conformational sampling while rapidly discarding energetically unfavorable ligand conformers. The receptor grid files, which encode the protein structure and its identified active site, were used as inputs for the docking simulations. The top-ranked docking poses were examined on the basis of Glide scores and binding interactions, and the most favorable complexes were subjected to Prime MM-GBSA calculations to estimate the binding free energy, ΔG(bind), and to assess the thermodynamic stability of receptor–ligand interactions:
ΔG(bind) = ΔG(solv) + ΔE(MM) + ΔG(SA)
ΔG(solv) represents the difference in GBSA solvation free energy between the protein–ligand complex and the combined solvation energies of the isolated protein and ligand. ΔE(MM) denotes the variation in minimized molecular mechanics energy of the protein–ligand complex relative to the sum of the individual energies of the unbound protein and ligand. ΔG(SA) corresponds to the change in surface area–dependent energy of the complex when compared with the total surface area energies of the protein and ligand in their free states.
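The three terms defined above are each a difference between the complex and the sum of the free protein and free ligand, so the bookkeeping can be sketched as below. All energy values are hypothetical placeholders in kcal/mol, and the class and function names are illustrative; Prime reports these terms directly.

```python
from dataclasses import dataclass

@dataclass
class Energies:
    """Per-species energy terms in kcal/mol (illustrative container)."""
    mm: float    # minimized molecular mechanics energy
    solv: float  # GBSA solvation free energy
    sa: float    # surface-area-dependent energy

def delta(term, complex_, protein, ligand):
    """Complex value minus the sum of free protein and free ligand values."""
    return getattr(complex_, term) - getattr(protein, term) - getattr(ligand, term)

def dg_bind(complex_, protein, ligand):
    """dG(bind) = dG(solv) + dE(MM) + dG(SA), following the equation in the text."""
    return sum(delta(t, complex_, protein, ligand) for t in ("solv", "mm", "sa"))

# Hypothetical values chosen only to make the arithmetic easy to follow
cpx = Energies(mm=-1500.0, solv=-800.0, sa=50.0)
prot = Energies(mm=-1400.0, solv=-780.0, sa=45.0)
lig = Energies(mm=-60.0, solv=-35.0, sa=8.0)
total = dg_bind(cpx, prot, lig)   # dE(MM) = -40, dG(solv) = +15, dG(SA) = -3
```

In this toy case the favorable molecular mechanics term is partly offset by a desolvation penalty, a pattern often seen in MM-GBSA decompositions.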

2.6. In Silico Pharmacokinetics Study

Six bioactive phytocompounds were selected based on their reported in silico pharmacological potential. Their chemical structures were retrieved in SMILES format and used as inputs for predictive computational analyses.

ADMET and Toxicity Prediction

The pharmacokinetic and toxicity profiles of the selected phytocompounds were evaluated using web-based platforms. Absorption, distribution, metabolism, excretion, and toxicity (ADMET) parameters were predicted with ADMETlab 3.0 [34]. This platform applies a multi-task Deep Message Passing Neural Network (DMPNN) that integrates molecular graph representations with RDKit-derived two-dimensional descriptors to achieve robust property prediction. ADMETlab 3.0 predicts 21 physicochemical properties, 19 medicinal chemistry attributes, 34 ADME-related endpoints, 36 toxicity endpoints, and 8 toxicophore rules. To enhance prediction reliability, the framework incorporates evidential deep learning for regression-based uncertainty estimation and Monte Carlo dropout for classification tasks, thereby providing probabilistic confidence scores to support informed compound prioritization. For toxicity assessment, the ProTox-III server was used [35]. This tool employs ensemble machine-learning approaches and deep neural networks trained on extensive toxicogenomic and cheminformatics datasets. It predicts a broad spectrum of toxicity endpoints, including acute toxicity, organ-specific toxicity, immunotoxicity, mutagenicity, cytotoxicity, and endocrine disruption potential. The platform assigns each compound to a toxicological class (I–VI) with associated probability scores and reports endpoint-specific activity status (active/inactive), together delivering a comprehensive evaluation of compound safety.

2.7. Molecular Dynamics (MD) Simulation

To assess whether the docked complexes remained stable over time, molecular dynamics (MD) simulations were performed using GROMACS v2023.2 [36]. The protein was parameterized with the AMBER99SB-ILDN force field, and ligand partial charges were assigned with the AM1-BCC method. Each complex was placed in a triclinic box of TIP4P water with at least 1.0 nm of padding from the box edges and neutralized with Cl⁻ counterions. Energy minimization with the steepest descent algorithm was used to remove steric clashes. The systems were equilibrated under NVT and NPT ensembles for 1000 ps each, with the Parrinello–Rahman barostat used to maintain pressure. Production runs lasted 100 ns with a 2 fs timestep under periodic boundary conditions. Long-range electrostatics were treated with the Particle Mesh Ewald (PME) method, and bond lengths were constrained with the LINCS algorithm. The trajectories were analyzed for root mean square deviation (RMSD), root mean square fluctuation (RMSF), and hydrogen bond occupancy, which together describe the conformational stability of the proteins and the persistence of ligand binding throughout the simulations.
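The run lengths quoted above translate into integration step counts as sketched below. The helper name is illustrative, and the sketch assumes the 2 fs timestep stated for production was also used during equilibration, which the text does not confirm.

```python
def n_steps(duration_ps, timestep_fs):
    """Number of integration steps for a run of duration_ps at timestep_fs."""
    total_fs = duration_ps * 1000  # 1 ps = 1000 fs
    if total_fs % timestep_fs:
        raise ValueError("duration is not a whole number of timesteps")
    return total_fs // timestep_fs

# Equilibration: 1000 ps per ensemble (timestep assumed to be 2 fs)
equil_steps = n_steps(1000, 2)

# Production: 100 ns = 100,000 ps at a 2 fs timestep
prod_steps = n_steps(100 * 1000, 2)
```

These are the values that would typically be supplied as the step count in the simulation parameter file for each stage.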

3. Results

3.1. ML-Based Data Collection, Preprocessing, and QSAR Analysis

A total of 695,119 natural product structures were retrieved from the COCONUT Natural Product Database and subjected to a comprehensive preprocessing workflow prior to QSAR modeling. Structural refinement using RDKit involved the removal of salts, solvents, isotopes, duplicates, and chemically inconsistent fragments, as well as the correction of stereochemical inconsistencies to ensure descriptor-ready SMILES. A machine learning-based quantitative structure–activity relationship (QSAR) model was developed to relate the molecular descriptors of the compounds to their activity. Molecular descriptors, including physicochemical (molecular weight, LogP, hydrogen bond donors and acceptors, and topological polar surface area) and structural features, were calculated.
After this multistep curation, the dataset was divided into valid, invalid, and inactive subsets (Figure 1A). In total, 399,575 compounds (57.5%) were considered structurally valid and unique and were treated as “active” for QSAR modeling. A further 89,118 records (12.8%) were found to have invalid or stereochemically incorrect SMILES, and 206,426 molecules (29.7%) were identified as inactive or duplicates; additionally, 1794 compounds were labeled as inactive during affinity prediction. This classification was consistent with the quality-distribution pattern illustrated in the dataset validation chart. The curated set of active natural products was subsequently subjected to machine-learning–based QSAR analysis, generating predicted pIC50 values that revealed a characteristic potency gradient across the chemical space (Figure 1B). A small subset of molecules exhibited high predicted potency (pIC50 > 8), whereas a broader distribution of compounds fell within the moderate-potency range (pIC50 = 7–8). The decline in pIC50 values showed an inflection around 7.27, marking a transition to the low-potency region (pIC50 < 7), which encompassed the majority of the dataset (Supplementary Material, Table S1). This combined preprocessing and QSAR workflow effectively refined the large natural product collection into a structurally reliable and biologically meaningful set, enabling the prioritization of high-potential candidates for subsequent docking and molecular dynamics analyses.
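The subset sizes reported above can be checked against the stated total and percentages with a short calculation. The counts are taken from the text; the dictionary keys are illustrative labels.

```python
# Counts as reported in the text
counts = {
    "valid_unique": 399_575,
    "invalid_smiles": 89_118,
    "inactive_or_duplicate": 206_426,
}
total = sum(counts.values())

# Percentage of the full library in each subset, rounded to one decimal place
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}
```

The three subsets sum to the 695,119 structures retrieved, and the rounded shares match the 57.5%, 12.8%, and 29.7% quoted.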

3.2. Graph-Neural-Network–Based Affinity Prediction (NHGNN-DTA)

To complement the QSAR predictions, binding affinity was estimated using the NHGNN-DTA model, which encodes ligand and protein structures as heterogeneous graphs capturing atom-level, residue-level, and multi-hop neighborhood interactions. The model was trained for 500 epochs with the Adam optimizer using MSE as the loss function and evaluated through RMSE, Pearson correlation, Spearman correlation, and CI metrics, demonstrating strong predictive stability. The NHGNN-DTA model generated predicted affinity values for 350,045 compounds across seven tuberculosis target proteins (2KGW, 3G1M, 3LOG, 5JZX, 5LD8, 5W07, 8J57). Statistical analysis of the predicted scores revealed wide dynamic ranges that helped to distinguish high-confidence binders from weaker candidates (Figure 2).
Affinity values for 2KGW ranged from 0 to 227.5 (kcal/mol), indicating mostly moderate binders. The 3G1M protein exhibited a wider activity profile, with an average affinity of 90.95 (kcal/mol) and a maximum of 158.9 (kcal/mol), suggesting that medium-affinity compounds were more strongly represented. In contrast, 3LOG showed a far larger range, reaching 47,305 (kcal/mol), although most molecules clustered around the median of 286.6 (kcal/mol). 5JZX and 5LD8 followed a similar pattern, with median affinities between 250 and 314 (kcal/mol) and upper extremes of 1.46 × 10⁶ and 3.64 × 10⁵ (kcal/mol), respectively. 5W07 had a moderate spread, with an average affinity of 7008.8 (kcal/mol) and occasional high-affinity outliers. 8J57 showed one of the widest ranges among all proteins, from 0 to 1.40 × 10⁷ (kcal/mol), although most compounds were close to the median affinity (Supplementary Material, Table S2). Together, these distributions reveal distinct binding landscapes across targets and the model’s capacity to detect subtle changes in ligand–protein interaction propensity within a structurally diverse natural-product library. Across all protein targets, the predicted affinities varied substantially, reflecting the chemical diversity of the screened natural products as a potential source of antibacterial drugs.
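The pattern described above, in which most scores sit near the median while a few extreme values stretch the range, can be illustrated with a small summary routine. The scores below are hypothetical toy values, not data from the study.

```python
from statistics import mean, median

def summarize(values):
    """Basic distribution summary for one target's predicted scores."""
    return {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }

# Hypothetical scores: four typical values and one extreme outlier
toy_scores = [100.0, 250.0, 300.0, 320.0, 1_000_000.0]
summary = summarize(toy_scores)
```

A single outlier pulls the mean far above the median, which is why the median is the more representative figure for heavily skewed score distributions such as those reported here.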

3.3. Molecular Docking

The graph-neural-network–based scoring approach (NHGNN-DTA) efficiently reduced the large COCONUT NP library to the top 100 recurrent high-affinity molecules that consistently ranked among the strongest predicted binders across multiple M. tuberculosis protein targets (Supplementary Material, Table S3). These shortlisted candidates were further evaluated through standard-precision molecular docking using Schrödinger Glide. Among these compounds, certain flavonoid glycosides, particularly derivatives of myricetin and quercetin, exhibited favorable and relatively consistent binding interactions across multiple targets. This enrichment of flavonoids may be influenced by their representation within the screened dataset. Therefore, rather than indicating an inherent property of all polyphenolic scaffolds, these results suggest that specific flavonoid derivatives may represent promising candidates in anti-tuberculosis drug discovery.
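One way to shortlist "recurrent" high-affinity molecules is to count how often each compound appears in the per-target top-k lists and keep those that recur across several targets. The sketch below assumes that higher scores mean stronger predicted affinity and uses hypothetical compound names, scores, and cutoffs; the exact selection rule used in the study is not specified.

```python
from collections import Counter

def recurrent_hits(scores_by_target, k, min_targets):
    """Compounds appearing in the top-k list of at least min_targets targets."""
    counts = Counter()
    for scores in scores_by_target.values():
        # Rank compounds for this target, highest predicted score first
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        counts.update(top)
    return sorted(c for c, n in counts.items() if n >= min_targets)

# Hypothetical scores for four compounds against three targets
scores_by_target = {
    "T1": {"a": 9.0, "b": 8.0, "c": 1.0, "d": 2.0},
    "T2": {"a": 7.0, "b": 1.0, "c": 8.0, "d": 2.0},
    "T3": {"a": 5.0, "b": 6.0, "c": 1.0, "d": 0.0},
}
in_two_or_more = recurrent_hits(scores_by_target, k=2, min_targets=2)
in_all_three = recurrent_hits(scores_by_target, k=2, min_targets=3)
```

Ranking within each target before counting avoids comparing raw scores across targets whose score ranges differ by orders of magnitude.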
Docking calculations provided detailed binding metrics, including docking scores, Glide energies, and protein-specific interaction profiles, collectively revealing a clear convergence toward a relatively small subset of structurally conserved molecules that consistently exhibited favorable docking behavior across all seven targets analyzed (Table 1; Figure 3). The MM-GBSA binding free energy values further substantiated the docking results by confirming the thermodynamic stability of the top-ranked ligand–protein complexes, thereby strengthening the reliability of the predicted binding interactions. For the 2KGW receptor, the myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside scaffold emerged among the top-scoring ligands, with a docking score of −5.445 kcal/mol, a Glide energy of −48.737 kcal/mol, and an MM-GBSA binding free energy of −30.92 kcal/mol (Figure 3A). The ligand established an extensive hydrogen-bonding network with key residues, including Lys279, Asp283, and Tyr237, which likely played a central role in anchoring the glycosidic moieties within the binding cavity. Additional polar and van der Waals interactions with Ala282, Val286, Ala287, Ile280, Ile295, and Thr297 further reinforced ligand stabilization. The compound was deeply accommodated within a mixed-polarity binding groove, where its polyphenolic core and multiple hydroxyl groups enabled multipoint interactions, thereby enhancing binding stability and favorable molecular recognition.
Similarly, the 3G1M protein demonstrated strong binding affinity toward compound_4, which achieved a docking score of −8.353 kcal/mol, a Glide energy of −32.181 kcal/mol, and an MM-GBSA binding free energy of −104.85 kcal/mol (Figure 3B). Myricetin-based glycosides also showed favorable binding in this system, with docking scores ranging from −7.5 to −7.7 kcal/mol and MM-GBSA values between −31.79 and −53.17 kcal/mol, highlighting their potential as multitarget inhibitors. Compound_44 exhibited prominent aromatic and hydrogen-bond interactions involving Trp207, Phe110, Glu180, Asn176, and Thr105. In addition, hydrophobic residues such as Leu87, Val152, and Met142 further stabilized the binding conformation. Together, these interactions suggest that the ligand achieves a compact and energetically favorable orientation within the active site, which may enhance receptor inhibition through improved pocket occupancy.
The 3LOG receptor also exhibited favorable interactions with flavonoid-based molecules. Compound_21 recorded a docking score of −7.062 kcal/mol, a Glide energy of −23.815 kcal/mol, and an MM-GBSA binding free energy of −58.65 kcal/mol, while several structurally related metabolites scored between −6.4 and −6.5 kcal/mol, indicating a consistent affinity of this receptor toward flavonoid scaffolds (Figure 3C). Notably, compound_21 demonstrated electrostatic complementarity through direct interaction with Arg382 and water-mediated bridges involving Tyr385 and Trp415. Hydrophobic contacts with residues such as Val410 and Leu384 further restricted ligand mobility within the pocket. These combined polar and nonpolar interactions suggest that the ligand is effectively retained within the active site, thereby supporting its potential to interfere with receptor function. By comparison, myricetin-3-rhamnosyl formed a denser hydrogen-bonding network with Glu65, Arg133, Gly242, and Ile244, supplemented by π-alkyl interactions with Val410 and Leu241. It showed a slightly weaker docking score (−6.225 kcal/mol) but maintained a favorable MM-GBSA value of −31.25 kcal/mol.
Docking against 5JZX also revealed a high-affinity profile, comparable to the results obtained for 3LOG (Figure 3D). The top candidates, including compound_60 and compound_44, scored −5.82 and −5.55 kcal/mol, respectively, with MM-GBSA values of −19.24 and −19.03 kcal/mol, again highlighting the recurring dominance of flavonoid glycosides. Compound_60 formed essential hydrogen bonds with Thr38 and Gln44, while its core scaffold was enclosed within a hydrophobic pocket shaped by Ile15, Ile37, and Cys39. This binding pattern indicates that both hydrogen-bonding capacity and hydrophobic complementarity contributed to ligand stabilization. Myricetin-3-rhamnosyl exhibited even more extensive interactions, suggesting improved residence within the binding cavity and stronger inhibitory potential.
The 5LD8 protein also supported strong ligand binding, with myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside achieving a docking score of −8.18 kcal/mol, a Glide energy of −88.56 kcal/mol, and an MM-GBSA binding free energy of −81.92 kcal/mol, alongside notable binding by compound_8 (−33.64 kcal/mol) and moracetin (−46.30 kcal/mol) (Figure 3E). The ligand formed hydrogen bonds with Gly115, Val168, Ser169, Glu199, Pro201, and Gln200, while aromatic stabilization from Phe404 and hydrophobic contacts with Met277 and Val278 further strengthened the complex. These interactions indicate that the ligand was well accommodated within both polar and hydrophobic regions of the binding site, supporting stable receptor engagement.
The highest binding affinity across all proteins was observed for 5W07, where myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside achieved an exceptionally strong docking score of −9.312 kcal/mol, a Glide energy of −74.094 kcal/mol, and an MM-GBSA binding free energy of −79.39 kcal/mol (Figure 3F). This result indicates deep and highly stabilized insertion of the ligand into the binding pocket. Several additional compounds also scored below −9.0, identifying 5W07 as a highly responsive target. Myricetin-3-rhamnosyl formed extensive hydrogen bonds with Cys195, Thr161, Met164, Gln103, and Lys168, while hydrophobic residues such as Phe150, Leu221, and Ile197 contributed to dual anchoring within both deep and surface regions of the receptor. Such an interaction profile suggests a strong capacity for sustained ligand retention and enhanced inhibitory efficiency.
Consistent strong interactions were also observed for the 8J57 receptor, where myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside recorded a docking score of −6.706 kcal/mol, a Glide energy of −71.327 kcal/mol, and an MM-GBSA binding free energy of −51.17 kcal/mol (Figure 3G). Other high-performing ligands, including compound_15, compound_16, and compound_57, displayed docking scores between −5.95 and −6.7. The high-magnitude Glide energies reflected substantial hydrophobic packing and hydrogen-bond contributions, with key interactions involving Glu175, Glu176, Gly163, Asn172, and Lys179, alongside aromatic and hydrophobic contacts with Phe164, Leu168, Ile79, and Ala78. Collectively, these findings indicate that flavonoid glycosides possess a structurally favorable scaffold for broad-spectrum target engagement, thereby supporting their promise as multitarget inhibitory candidates.
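The comparison across scoring functions can be made concrete with a small tabulation. The sketch below is illustrative only: it transcribes the docking score, Glide energy, and MM-GBSA values quoted above for myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside (limited to the targets for which all three values are stated in the text) and ranks the targets by each metric. The dictionary layout and the function name `rank_targets` are assumptions made for this example, not part of the original workflow.

```python
# Values transcribed from the text above for the lead myricetin glycoside.
# The data layout and helper name are illustrative assumptions.
REPORTED = {
    # target: (docking score, Glide energy, MM-GBSA), all in kcal/mol
    "5LD8": (-8.18, -88.56, -81.92),
    "5W07": (-9.312, -74.094, -79.39),
    "8J57": (-6.706, -71.327, -51.17),
}

METRICS = ("docking", "glide", "mmgbsa")


def rank_targets(metric):
    """Return target IDs sorted from most to least favorable (most negative first)."""
    idx = METRICS.index(metric)
    return sorted(REPORTED, key=lambda target: REPORTED[target][idx])


if __name__ == "__main__":
    for metric in METRICS:
        print(metric, rank_targets(metric))
```

With these values, 5W07 ranks first by docking score, whereas 5LD8 ranks first by Glide energy and by MM-GBSA, which illustrates why the three metrics are best read together rather than interchangeably.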
To aid structural interpretation of the docking results, representative compounds were selected, and their chemical structures are presented in Figure 4. These compounds were chosen based on binding affinity, interaction profiles, and scaffold diversity. The selected molecules encompass a wide range of chemical classes, including small heterocyclic compounds, aromatic scaffolds, and complex polyphenolic glycosides. Structurally simple molecules (e.g., compounds 5, 7, and 26) exhibit limited functional groups, whereas larger compounds such as flavonoid derivatives (compounds 19 and 50) contain multiple hydroxyl and glycosidic moieties that may enhance protein–ligand interactions through hydrogen bonding and polar contacts.

3.4. In-Silico Pharmacology Assessment

3.4.1. ADMET Profiling

The ADMET profiles of five phytocompounds and myricetin were analyzed using the ADMETlab 3.0 platform. Compounds_5, 21, 26, 44, and 60 and myricetin were compared using correlation heat maps and hierarchical clustering to elucidate similarities and differences in pharmacokinetic and toxicity-related behavior (Table 2 and Figure 5A). Compound-wise correlation analysis indicated that Compounds_21, 26, 44, and 60 were tightly grouped, showing similar ADMET trends in absorption, metabolism, and toxicity parameters. In contrast, Compound_5 showed a more dissimilar pattern, pointing to different pharmacokinetic and safety features. Myricetin was nearest to Compounds_26 and 72, implying partial similarity but still distinct pharmacokinetic characteristics. Absorption- and distribution-related properties, such as human intestinal absorption and blood–brain barrier permeability, were closely correlated, indicating a combined influence on systemic exposure. Compounds_21 and 60 had relatively good absorption–distribution profiles, while myricetin was predicted to have lower oral bioavailability. Metabolism-related parameters, mainly CYP1A2 and CYP3A4 inhibition, formed a separate cluster reflecting similar trends in cytochrome P450 interactions. Compound_60 had greater CYP interaction potential, suggesting increased metabolic liability, whereas Compound_26 showed lower CYP liability and was therefore predicted to be more metabolically stable. Half-life and P-glycoprotein substrate propensity were grouped together, emphasizing the relationship between efflux transport and systemic persistence; myricetin had a very long predicted half-life and was also predicted to be a strong P-gp substrate.
Toxicity parameters, such as Ames mutagenicity, hERG inhibition, and respiratory toxicity, were grouped together. Compound_26 had fewer toxicity signals than Compounds_44 and 60, which displayed higher respiratory toxicity trends. Myricetin was predicted to have very low hERG inhibition and respiratory toxicity, suggesting a comparatively favorable predicted cardiopulmonary safety profile.
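
The compound-wise correlation step underlying such a heat map can be sketched in a few lines. The example below is a minimal pure-Python illustration, not the pipeline used in this study: the profile values are hypothetical placeholders (they are not ADMETlab 3.0 outputs), and the names `pearson`, `correlation_matrix`, and `nearest_neighbor` are assumptions introduced here. Hierarchical clustering would typically follow on a distance such as 1 − r.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(profiles):
    """Compound-by-compound correlation of predicted property vectors."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


def nearest_neighbor(matrix, name):
    """Name of the compound whose profile correlates most strongly with `name`."""
    others = {k: v for k, v in matrix[name].items() if k != name}
    return max(others, key=others.get)


if __name__ == "__main__":
    # Hypothetical placeholder profiles (e.g., four normalized ADMET endpoints).
    demo = {
        "cpd_A": [0.9, 0.2, 0.1, 0.3],
        "cpd_B": [0.8, 0.3, 0.2, 0.35],
        "cpd_C": [0.1, 0.9, 0.8, 0.2],
    }
    matrix = correlation_matrix(demo)
    print(nearest_neighbor(matrix, "cpd_A"))
```

In this toy input, the profile of `cpd_A` correlates positively with `cpd_B` and negatively with `cpd_C`, which mirrors how tightly grouped and dissimilar compounds separate in a correlation heat map.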

3.4.2. Toxicological Endpoint Analysis

A detailed evaluation of individual toxicological endpoints, revealing compound-specific safety signatures, is shown in Figure 5B and Table 3. Compound_5 was predicted to have a high probability of nephrotoxicity and mutagenicity, as well as AhR and CYP2C9 activity, indicating that this compound may raise several toxicological and regulatory concerns. In contrast, Compounds_21, 26, 44, and 60 were mostly predicted to be inactive with respect to nephrotoxicity, cardiotoxicity, immunotoxicity, and mutagenicity, suggesting better safety profiles. Myricetin showed a distinct toxicological profile in which only immunotoxicity and transthyretin (TTR) activity were flagged, whereas the compound was inactive for nephrotoxicity, cardiotoxicity, mutagenicity, and CYP2C9 inhibition. Taken together, the ADMET and toxicity analyses identified Compound_26 as the most balanced hit compound. Compounds_21 and 44 ranked second and third in this assessment, respectively. Compound_60 showed good absorption; however, it was associated with increased metabolic and respiratory risk. Compound_5 showed multiple predicted toxicological liabilities. Myricetin had a distinct profile characterized by efflux-transporter interaction and selective toxicity-related activities; it therefore needs further optimization and experimental validation.

3.5. Binding Analysis of Myricetin Derivative

To further validate the docking results, the binding poses of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside in all seven M. tuberculosis targets were inspected using the Ligand Designer module of Schrödinger. Across all proteins, the ligand adopted a broadly conserved orientation in which the polyphenolic aglycone was buried deep inside the binding cavity, while the extended trisaccharide chain followed the contour of the pocket and projected partially toward the solvent. In 2KGW (Figure 6A), the flavonoid core anchored the ligand via hydrogen bonds with Lys279, Asp283, Tyr237, Ala282, Val286, and Ala287, complemented by hydrophobic contacts with Ile295 and surrounding nonpolar residues, consistent with the strong docking scores obtained for this complex. In 3G1M (Figure 6B), the ligand was further stabilized by multiple hydrogen-bond interactions with Asn176, Glu180, and Thr105 and by a dense hydrophobic environment contributed by residues such as Met142, Leu87, and Val152, creating a compact and well-packed pose. For 3LOG (Figure 6C), Ligand Designer highlighted a combination of polar and nonpolar contacts, including hydrogen bonds to Arg382/Tyr385 (and water-mediated links involving Trp415) together with hydrophobic packing by Val410, Leu384, and neighboring aliphatic residues, explaining the favorable interaction energy of this complex.
In 5JZX (Figure 6D), myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside formed key hydrogen bonds with residues lining the entrance of the pocket, supported by a ring of hydrophobic contacts from Ile and Leu side chains that surrounded the sugar moieties and limited solvent exposure. The 5LD8 complex (Figure 6E) showed one of the most consolidated interaction networks: the ligand engaged Gly115, Val168, Ser169, Glu199, Pro201, and Gln200 through multiple hydrogen bonds, while Phe404, Met277, and Val278 provided π–π and hydrophobic stabilization around the aromatic core. In 5W07 (Figure 6F), the ligand penetrated deeply into the binding groove, forming an extensive hydrogen-bonding array with Cys195, Thr161, Met164, Gln103, and Lys168 and hydrophobic contacts with Phe150, Leu221, Ile197, and other nonpolar residues; this dual polar–hydrophobic anchoring is consistent with the very low docking and Glide energy values observed for this target. Finally, in 8J57 (Figure 6G), myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside interacted through a dense set of polar contacts with Glu175, Glu176, Gly163, Asn172, and Lys179, while π–π and hydrophobic interactions with Phe164, Leu168, Ile79, and Ala78 further stabilized the complex. Overall, the Ligand Designer visualization confirmed that this natural product flavonoid glycoside establishes rich, complementary interaction networks of hydrogen bonding, ionic contacts, aromatic stacking, and hydrophobic packing in all seven proteins, supporting its proposed multitarget inhibitory potential against M. tuberculosis.

3.6. Dynamic Stability of Multitarget Binding Behavior

To assess the dynamic stability of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside within the binding pockets of seven M. tuberculosis target proteins, ligand root-mean-square deviation (RMSD) profiles were monitored over 100 ns molecular dynamics simulations. Ligand RMSD provides a direct measure of conformational stability and positional retention of the ligand relative to its initial docked pose, thereby serving as a critical indicator of binding robustness under dynamic conditions. Among the analyzed complexes, the ligand exhibited the lowest RMSD fluctuations in the 3LOG and 5LD8 systems, where RMSD values remained consistently below ~0.7 Å throughout the simulation. This behavior indicates minimal conformational drift and strong retention of the initial binding pose, suggesting highly stable ligand–protein interactions. Similarly, the 5JZX, 5W07, and 8J57 complexes displayed moderate yet stable ligand RMSD values, stabilizing between ~1.0 and 1.2 Å after the initial equilibration phase (Figure 7). These trajectories reflect controlled flexibility of the ligand while maintaining persistent binding within the active site.
In contrast, the 2KGW and 3G1M complexes showed comparatively higher ligand RMSD deviations, particularly after ~50 ns, with values approaching ~2.0–2.5 Å. Despite this increase, the RMSD trajectories did not show abrupt divergence or unbounded drift, indicating that the ligand remained associated with the binding pocket while undergoing conformational rearrangements or repositioning within a flexible cavity. Importantly, all seven complexes reached equilibrium without evidence of ligand dissociation, confirming the overall stability of the ligand across diverse protein environments. The ligand RMSD analysis provides critical insight into the multitarget binding behavior of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside beyond static docking predictions. The exceptionally low RMSD observed for the 3LOG and 5LD8 complexes highlights the presence of well-defined binding pockets that strongly constrain ligand motion. This stability is consistent with the extensive hydrogen-bonding networks, aromatic stacking, and hydrophobic contacts identified in the docking interaction analyses, indicating that these proteins may represent particularly favorable targets for this flavonoid glycoside.
The moderate and well-converged RMSD profiles observed for 5JZX, 5W07, and 8J57 suggest a balance between flexibility and stability, where the ligand undergoes limited conformational adaptation while remaining firmly anchored within the binding site. Such behavior is often desirable for natural-product-derived ligands, as controlled flexibility can enhance binding adaptability across multiple targets without compromising residence time. Notably, the strong docking scores and interaction density observed for 5W07 are supported by its stable ligand RMSD profile, reinforcing this protein as a high-confidence target. The relatively higher RMSD fluctuations in the 2KGW and 3G1M complexes likely reflect increased pocket flexibility or larger solvent-exposed regions accommodating the bulky glycosyl moiety of the ligand. However, the absence of ligand dissociation and the maintenance of bounded RMSD values indicate that these deviations arise from internal conformational rearrangements rather than loss of binding. This dynamic adaptability underscores the structural versatility of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside and its ability to remain engaged within diverse protein architectures.
Collectively, the ligand RMSD results strongly support the multitarget inhibitory potential of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside against M. tuberculosis. The observed stability across all seven complexes, ranging from highly rigid to moderately flexible binding modes, aligns well with a polypharmacological mechanism of action, which is particularly advantageous for combating drug resistance in tuberculosis. These findings are consistent with the docking predictions and highlight this flavonoid glycoside as a robust and dynamically stable lead candidate, warranting further investigation through interaction persistence analysis, binding free-energy calculations, and experimental validation.
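
For readers less familiar with the metric, the ligand RMSD discussed above can be illustrated with a minimal pure-Python sketch. It assumes that every frame has already been superimposed on the protein, so that ligand displacement relative to the docked pose is measured without further fitting; the function names and the toy coordinates are hypothetical and do not come from the simulations reported here.

```python
import math


def rmsd(coords, reference):
    """RMSD between two conformations given as lists of (x, y, z) tuples.

    Assumes both frames are already superimposed on the protein, so no
    additional fitting is performed here.
    """
    if len(coords) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(coords, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsd_series(trajectory, reference):
    """Ligand RMSD of every frame relative to the reference (docked) pose."""
    return [rmsd(frame, reference) for frame in trajectory]


if __name__ == "__main__":
    # Toy two-atom ligand; coordinates are placeholders in arbitrary length units.
    docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    trajectory = [
        docked,
        [(0.3, 0.0, 0.0), (1.3, 0.0, 0.0)],  # rigid shift of 0.3 along x
        [(0.0, 0.4, 0.0), (1.0, 0.0, 0.0)],  # only the first atom moves
    ]
    print(rmsd_series(trajectory, docked))
```

A flat, low-valued series corresponds to the pose retention described for 3LOG and 5LD8, whereas a series that rises and then plateaus corresponds to the bounded repositioning described for 2KGW and 3G1M.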

3.7. RMSD-Based Free Energy Landscape Analysis

RMSD-based Free Energy Landscape (FEL) analyses were performed to characterize the dynamic binding behavior of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside across seven M. tuberculosis protein targets. The FEL plots revealed well-defined low-energy basins for all complexes, indicating the presence of stable and frequently populated binding conformations throughout the molecular dynamics simulations (Figure 8), and provided further insight into the conformational stability and dynamic behavior of the protein–ligand complexes. In these plots, low-energy regions (dark purple basins) represent energetically favorable and structurally stable conformational states, whereas high-energy regions (yellow/orange regions) indicate less favorable or transient conformations.
The 3LOG and 5LD8 systems exhibited the most compact and deeply confined minima, characterized by low protein backbone RMSD (<0.15 nm) and low ligand RMSD (<0.5 nm), reflecting highly stable binding modes. Moderate yet well-converged energy basins were observed for 5JZX, 5W07, and 8J57, suggesting controlled ligand flexibility within stable binding pockets. In contrast, 2KGW and 3G1M displayed broader free-energy distributions, indicative of increased conformational adaptability, while still maintaining a dominant low-energy bound state. Importantly, none of the systems showed features consistent with ligand dissociation, confirming stable binding across all targets. The FEL analyses support the multitarget inhibitory potential of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside by demonstrating stable, low-energy binding conformations across diverse M. tuberculosis proteins.
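
As a rough illustration of how such a landscape can be built, the sketch below bins paired protein and ligand RMSD samples on a two-dimensional grid and converts the bin populations to relative free energies with the Boltzmann relation ΔG = −RT ln P. It is a simplified stand-in, not the tool used for Figure 8: the temperature of 300 K, the bin width, the sample values, and the function name are all assumptions made for the example.

```python
import math
from collections import Counter

R_KJ_PER_MOL_K = 0.0083144626  # gas constant in kJ/(mol K)


def free_energy_landscape(x_values, y_values, bin_width, temperature=300.0):
    """Map binned (x, y) samples to relative free energies in kJ/mol.

    Applies dG = -RT ln P to a 2D histogram and shifts the result so the
    most populated bin sits at 0 kJ/mol. Unvisited bins are simply absent.
    """
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(x_values, y_values)
    )
    total = sum(counts.values())
    rt = R_KJ_PER_MOL_K * temperature
    energies = {b: -rt * math.log(n / total) for b, n in counts.items()}
    minimum = min(energies.values())
    return {b: e - minimum for b, e in energies.items()}


if __name__ == "__main__":
    # Placeholder RMSD samples in nm (assumed 300 K; not data from this study).
    protein_rmsd = [0.12, 0.13, 0.14, 0.15, 0.35]
    ligand_rmsd = [0.42, 0.43, 0.44, 0.45, 0.95]
    for bin_index, energy in sorted(
        free_energy_landscape(protein_rmsd, ligand_rmsd, bin_width=0.1).items()
    ):
        print(bin_index, round(energy, 3))
```

A compact, deep basin arises when most frames fall into a few neighboring bins, while a broad distribution spreads population over many bins and therefore flattens the energy differences between them.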

3.8. Trajectories-Based Free Energy Surface Analysis

To further explore the dominant collective motions of the complexes, principal component analysis (PCA) was performed on the Molecular Dynamics (MD) trajectories. The free energy surface (FES) constructed along the first two principal components (PC1 and PC2) revealed distinct low-energy basins, indicating preferred conformational states that complement the RMSD-based stability analysis.
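ΔG(x, y) = −RT ln P(x, y)
where R is the gas constant, T is the absolute temperature, and P(x, y) is the probability of finding the system at position (x, y) along PC1 and PC2.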
This mapping was employed to characterize the dominant collective motions and conformational stability of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside–bound complexes across seven M. tuberculosis targets. The three-dimensional free energy surfaces projected along PC1 and PC2 revealed well-defined low-energy basins for all protein–ligand systems, indicating the presence of thermodynamically favorable conformational states during the simulations (Figure 9). For the 2KGW complex (Figure 9A), the energy landscape exhibited a moderately broad minimum, suggesting that ligand binding is stable but accompanied by appreciable protein flexibility along the principal motions. The 3G1M system (Figure 9B) similarly displayed an extended basin with multiple shallow minima, indicative of conformational adaptability while maintaining a preferred bound state. In contrast, the 3LOG complex (Figure 9C) showed a sharply confined and deep free energy minimum, reflecting highly restricted collective motions and a rigid, well-stabilized binding mode.
The 5JZX free energy surface (Figure 9D) revealed a single dominant basin with smooth energy gradients, consistent with controlled protein motion and sustained ligand engagement. The 5LD8 system (Figure 9E) exhibited one of the most pronounced and compact minima, indicating strong conformational confinement and minimal energetic penalty for ligand retention. For 5W07 (Figure 9F), the energy surface was broader with multiple accessible low-energy regions, suggesting enhanced conformational plasticity of the protein while preserving ligand binding. Finally, the 8J57 complex (Figure 9G) displayed a well-defined minimum with moderate basin width, supporting stable binding coupled with limited collective motion. Across all systems, the absence of fragmented or diffuse energy landscapes indicates that myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside consistently stabilizes energetically favorable conformations in structurally diverse targets.
The PCA-based free energy surfaces confirm that myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside maintains stable, low-energy conformational states across multiple M. tuberculosis proteins. The coexistence of rigid binding in some targets such as 3LOG and 5LD8 and adaptive flexibility in others like 2KGW and 5W07 highlights a favorable multitarget and polypharmacological binding profile. These findings, together with RMSD and interaction analyses, strengthen the candidacy of this flavonoid glycoside as a multitarget lead compound for tuberculosis drug discovery.
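
The projection step that precedes such a free energy surface can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: each frame is a flattened, pre-aligned coordinate vector, the two leading principal components are obtained by power iteration with deflation (a simple substitute for a full eigendecomposition, and not necessarily the method used by the authors), and all function names and toy values are hypothetical. The resulting PC1/PC2 scores would then be binned and converted with ΔG = −RT ln P(x, y).

```python
import math


def center(frames):
    """Subtract the per-coordinate mean from every frame."""
    n, dim = len(frames), len(frames[0])
    means = [sum(frame[i] for frame in frames) / n for i in range(dim)]
    return [[frame[i] - means[i] for i in range(dim)] for frame in frames]


def covariance(centered):
    """Sample covariance matrix of mean-centered frames."""
    n, dim = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and eigenvector of a symmetric matrix."""
    dim = len(matrix)
    # Asymmetric start vector lowers the chance of starting orthogonal to the answer.
    start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(dim)))
    vec = [(i + 1) / start_norm for i in range(dim)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    value = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)
    )
    return value, vec


def project_on_pc1_pc2(frames):
    """Return the PC1 and PC2 scores of every frame."""
    centered = center(frames)
    cov = covariance(centered)
    value1, pc1 = dominant_eigenpair(cov)
    dim = len(pc1)
    # Deflation removes the PC1 contribution so the next dominant vector is PC2.
    deflated = [
        [cov[i][j] - value1 * pc1[i] * pc1[j] for j in range(dim)]
        for i in range(dim)
    ]
    _, pc2 = dominant_eigenpair(deflated)
    scores1 = [sum(c * a for c, a in zip(row, pc1)) for row in centered]
    scores2 = [sum(c * a for c, a in zip(row, pc2)) for row in centered]
    return scores1, scores2


if __name__ == "__main__":
    # Toy "trajectory": five frames of one atom (x, y, z); placeholder values.
    toy = [[-2.0, -1.0, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.0, 0.0],
           [1.0, -1.0, 0.0], [2.0, 1.0, 0.0]]
    print(project_on_pc1_pc2(toy))
```

Power iteration is chosen here only to keep the example dependency-free; for real trajectories with thousands of coordinates, a dedicated linear-algebra or MD-analysis library would be the more practical choice.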

4. Discussion

Natural product-derived flavonoids are widely recognized for their antimycobacterial potential due to their ability to interact with multiple essential bacterial targets. Previous studies have shown that flavonoids can inhibit DNA gyrase, interfere with mycolic acid biosynthesis, alter membrane integrity, and enhance susceptibility to existing anti-TB drugs [37,38,39,40,41]. Lechner et al. [42] demonstrated that the flavonoids epicatechin, isorhamnetin, kaempferol, luteolin, myricetin, quercetin, rutin, and taxifolin modulate isoniazid susceptibility in fast-growing mycobacteria. These reports support the relevance of exploring flavonoid-rich natural products from the COCONUT database against key M. tuberculosis proteins. Recent advances in artificial intelligence–assisted virtual screening have further improved the efficiency of antimicrobial lead identification, particularly for structurally complex natural products.
Almatroudi et al. [43] demonstrated machine learning–guided screening of chemical databases to identify compounds such as ergotamine, withanolide E, DOPPA, ergost-2-en-26-oic acid, and hydroxy-1-isomangostin as potential inhibitors of the bacterial membrane protein BacA. Similarly, Zheng et al. [44] reported aldoxorubicin and quarfloxin as promising antitubercular candidates through virtual screening. Innovative approaches such as word embedding–based virtual screening (WEBVS) and graph-based deep learning models have also successfully prioritized antimicrobial leads from diverse chemical libraries [45,46].
In the present study, molecular docking and MM-GBSA analyses revealed a clear preference for polyhydroxylated flavonoid glycosides, particularly myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside, across all seven selected M. tuberculosis targets. The consistently high predicted binding affinity of this compound suggests that its structural architecture is compatible with diverse protein binding pockets. The multiple hydroxyl groups and sugar moieties likely enhanced binding through extensive hydrogen-bonding networks, while the flavonoid core contributed to hydrophobic and aromatic stabilization. This dual interaction profile appears to be a key factor underlying its broad predicted multitarget activity. Among the screened targets, the most favorable docking score was observed for 5W07, where myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside scored −9.312 kcal/mol, with a Glide energy of −74.094 kcal/mol and an MM-GBSA binding free energy of −79.39 kcal/mol. These values indicate strong thermodynamic stability and suggest effective occupation of the active site. Highly favorable binding was also observed with 5LD8, which gave an even more negative MM-GBSA value (−81.92 kcal/mol), and with 8J57 (MM-GBSA: −51.17 kcal/mol), indicating that this scaffold may effectively interact with multiple proteins involved in essential mycobacterial survival pathways. Such multitarget behavior is particularly valuable in tuberculosis, where resistance often arises through target-specific mutations.
In addition to flavonoid glycosides, several non-glycosidic metabolites from the COCONUT database also demonstrated promising target-specific interactions. Compound_26 exhibited strong affinity toward 3G1M, with a highly favorable MM-GBSA value of −104.85 kcal/mol, suggesting a particularly stable complex. Likewise, compound_21 showed strong binding with 3LOG, supported by electrostatic and water-mediated interactions that may enhance ligand retention within the binding cavity. Compound_44 also demonstrated favorable receptor complementarity. These findings indicate that, beyond the dominant flavonoid glycosides, structurally distinct COCONUT metabolites may serve as complementary lead scaffolds for future optimization. The recurring identification of the same lead compounds across multiple targets strengthens the reliability of the screening results and suggests genuine structure-based affinity rather than target-specific bias. Importantly, the broad agreement between docking scores, Glide energies, and MM-GBSA values provides additional confidence in the predicted binding stability of the lead compounds. Collectively, these findings demonstrate that the integration of deep learning–based screening with structure-based refinement offers a robust and cost-effective strategy for accelerating natural product–based antitubercular drug discovery, while also supporting the further experimental validation of COCONUT-derived metabolites as promising multitarget anti-TB candidates.
Molecular dynamics (MD) simulations consistently confirmed the stability of the protein–ligand complexes, as evidenced by low RMSD and RMSF fluctuations throughout the simulation trajectories. Stable binding of the 1-Hydroxy-D-788-7–DNA gyrase B and hecogenin–CYP51 complexes was reported previously, with RMSD values maintained within ~0.25 nm, supporting the robustness of docking predictions under dynamic conditions [47]. For the PknB (PDB ID: 5U94) system, the 5U94–CCL complex exhibited greater conformational flexibility, whereas the 5U94–C-2 complex reached early stabilization, indicating enhanced structural stability [48]. Similarly, MD validation of RpfB–secondary metabolite complexes demonstrated sustained interactions and compact conformations, reinforcing their potential role in disrupting mechanisms associated with latent tuberculosis [49]. Dynamic equilibration and steady interaction profiles were also observed for mycobacterial DNA gyrase B complexes, further supporting their structural reliability [50]. In addition, compound_21–GyrB and compound_26–GyrB complexes maintained structural stability over 100 ns simulations, confirming persistent engagement within the ATP-binding pocket [51]. Likewise, desulfohaplosamate–PimA complexes exhibited compact trajectories and sustained hydrogen bonding, indicating reliable inhibition of phosphatidylinositol mannoside biosynthesis [52]. Stable MD profiles were also reported for PanK complexes with Morkotin A and Rutin, supporting their suitability as inhibitors of CoA biosynthesis in M. tuberculosis [53].

5. Limitations of the Study

Although the present study provides strong computational evidence supporting the multitarget antitubercular potential of COCONUT-derived metabolites, the findings remain predictive and require further validation through in vitro and in vivo studies. In addition, the lead compound, myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside, is not readily commercially available, which currently limits immediate biological evaluation. Efforts to isolate and purify this compound from plant extracts are ongoing to support future experimental validation.

6. Conclusions

This study presents a comprehensive in silico framework for the identification of multitarget antitubercular agents from natural products in the COCONUT database. By integrating machine learning–based QSAR screening, graph neural network affinity prediction, molecular docking, MM-GBSA binding free energy analysis, molecular dynamics simulations, ADMET profiling, and density functional theory analysis, structurally diverse and pharmacologically relevant candidates were efficiently prioritized from a large natural product dataset. Among the screened compounds, myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside emerged as the most promising multitarget lead, exhibiting consistent binding affinity, favorable interaction stability, and robust dynamic behavior across seven essential M. tuberculosis proteins. In addition, compound_26 showed a favorable balance between binding efficiency and predicted ADMET safety, supporting its potential as an alternative lead scaffold. The strong agreement among multiple computational approaches enhances confidence in the identified candidates and highlights the therapeutic promise of COCONUT-derived polypharmacological metabolites against drug-resistant tuberculosis. Overall, these findings provide a strong computational basis for future experimental validation and further lead optimization while also demonstrating the utility of integrated AI-assisted in silico pipelines in natural product-based anti-infective drug discovery.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/scipharm94020039/s1, Table S1: QSAR Predict.xlsx; Table S2: Bind_affinity.csv; Table S3: Top_100_compounds.csv.

Author Contributions

Conceptualization, S.P., R.R., R.C. and A.S.; methodology, S.P.; software, S.P.; validation, S.P., R.R., R.C. and A.S.; formal analysis, S.P.; investigation, S.P. and A.S.; resources, S.P.; data curation, S.P.; writing—original draft preparation, S.P.; writing—review and editing, A.S.; visualization, S.P., R.R., R.C. and A.S.; supervision, A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Acknowledgments

We gratefully acknowledge the Bioinformatics Resources and Applications Facility (BRAF), C-DAC, Pune, for providing high-performance computing (HPC) resources for conducting machine learning algorithms and molecular dynamics simulations.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. World TB Day 2025. Available online: https://www.who.int/campaigns/world-tb-day/2025 (accessed on 9 January 2026).
  2. Global Tuberculosis Report 2025. Available online: https://www.who.int/teams/global-programme-on-tuberculosis-and-lung-health/tb-reports/global-tuberculosis-report-2025 (accessed on 9 January 2026).
  3. Boniface, P.K.; Ferreira, E.I. Opportunities and Challenges for Flavonoids as Potential Leads for the Treatment of Tuberculosis. In Studies in Natural Products Chemistry; Elsevier: Amsterdam, The Netherlands, 2020; Volume 65, pp. 85–124. ISBN 978-0-12-817905-5.
  4. Sulis, G.; Centis, R.; Sotgiu, G.; D’Ambrosio, L.; Pontali, E.; Spanevello, A.; Matteelli, A.; Zumla, A.; Migliori, G.B. Recent Developments in the Diagnosis and Management of Tuberculosis. npj Prim. Care Respir. Med. 2016, 26, 16078.
  5. Campaniço, A.; Moreira, R.; Lopes, F. Drug Discovery in Tuberculosis. New Drug Targets and Antimycobacterial Agents. Eur. J. Med. Chem. 2018, 150, 525–545.
  6. Cole, S.T.; Brosch, R.; Parkhill, J.; Garnier, T.; Churcher, C.; Harris, D.; Gordon, S.V.; Eiglmeier, K.; Gas, S.; Barry, C.E.; et al. Deciphering the Biology of Mycobacterium tuberculosis from the Complete Genome Sequence. Nature 1998, 393, 537–544.
  7. Yang, T.; Zhong, J.; Zhang, J.; Li, C.; Yu, X.; Xiao, J.; Jia, X.; Ding, N.; Ma, G.; Wang, G.; et al. Pan-Genomic Study of Mycobacterium tuberculosis Reflecting the Primary/Secondary Genes, Generality/Individuality, and the Interconversion Through Copy Number Variations. Front. Microbiol. 2018, 9, 1886.
  8. Chiliza, T.E.; Pillay, M.; Pillay, B. Identification of Unique Essential Proteins from a Mycobacterium Tuberculosis F15/LAM4/KZN Phage Secretome Library. Pathog. Dis. 2017, 75, ftx001.
  9. Cole, S.T. Mycobacterium Tuberculosis: Drug-Resistance Mechanisms. Trends Microbiol. 1994, 2, 411–415.
  10. Mdluli, K.; Slayden, R.A.; Zhu, Y.; Ramaswamy, S.; Pan, X.; Mead, D.; Crane, D.D.; Musser, J.M.; Barry, C.E. Inhibition of a Mycobacterium Tuberculosis β-Ketoacyl ACP Synthase by Isoniazid. Science 1998, 280, 1607–1610.
  11. Rozwarski, D.A.; Grant, G.A.; Barton, D.H.R.; Jacobs, W.R.; Sacchettini, J.C. Modification of the NADH of the Isoniazid Target (InhA) from Mycobacterium tuberculosis. Science 1998, 279, 98–102.
  12. Harikishore, A.; Grüber, G. Mycobacterium tuberculosis F-ATP Synthase Inhibitors and Targets. Antibiotics 2024, 13, 1169.
  13. Atanasov, A.G.; Zotchev, S.B.; Dirsch, V.M.; the International Natural Product Sciences Taskforce; Orhan, I.E.; Banach, M.; Rollinger, J.M.; Barreca, D.; Weckwerth, W.; Bauer, R.; et al. Natural Products in Drug Discovery: Advances and Opportunities. Nat. Rev. Drug Discov. 2021, 20, 200–216.
  14. Balunas, M.J.; Kinghorn, A.D. Drug Discovery from Medicinal Plants. Life Sci. 2005, 78, 431–441.
  15. d’Avigdor, E.; Wohlmuth, H.; Asfaw, Z.; Awas, T. The Current Status of Knowledge of Herbal Medicine and Medicinal Plants in Fiche, Ethiopia. J. Ethnobiol. Ethnomedicine 2014, 10, 38.
  16. Yuan, H.; Ma, Q.; Ye, L.; Piao, G. The Traditional Medicine and Modern Medicine from Natural Products. Molecules 2016, 21, 559.
  17. Chaachouay, N.; Zidane, L. Plant-Derived Natural Products: A Source for Drug Discovery and Development. Drugs Drug Candidates 2024, 3, 184–207, Correction in Drugs Drug Candidates 2026, 5, 14. https://doi.org/10.3390/ddc5010014.
  18. Lopes, G.; Pinto, E.; Salgueiro, L. Natural Products: An Alternative to Conventional Therapy for Dermatophytosis? Mycopathologia 2017, 182, 143–167.
  19. Swain, P.; Kumari, R.; Biswas, S.; Misra, N.; Kushwaha, G.S.; Suar, M. Plant Natural Products for Antibacterial Drug Development against ESKAPE Pathogens. Microbe 2025, 9, 100610.
  20. Vijaya, K.; Ananthan, S. Therapeutic Efficacy of Medicinal Plants against Experimentally Induced Shigellosis in Guinea Pigs. Indian J. Pharm. Sci. 1996, 58, 191–193.
  21. Panche, A.N.; Diwan, A.D.; Chandra, S.R. Flavonoids: An Overview. J. Nutr. Sci. 2016, 5, e47.
  22. Mutha, R.E.; Tatiya, A.U.; Surana, S.J. Flavonoids as Natural Phenolic Compounds and Their Role in Therapeutics: An Overview. Futur. J. Pharm. Sci. 2021, 7, 25.
  23. Chang, Y.; Hawkins, B.A.; Du, J.J.; Groundwater, P.W.; Hibbs, D.E.; Lai, F. A Guide to In Silico Drug Design. Pharmaceutics 2022, 15, 49.
  24. Gan, C.; Jia, X.; Fan, S.; Wang, S.; Jing, W.; Wei, X. Virtual Screening and Molecular Dynamics Simulation to Identify Potential SARS-CoV-2 3CLpro Inhibitors from a Natural Product Compounds Library. Acta Virol. 2023, 67, 12464.
  25. Zeng, T.; Li, J.; Wu, R. Natural Product Databases for Drug Discovery: Features and Applications. Pharm. Sci. Adv. 2024, 2, 100050, Correction in Pharm. Sci. Adv. 2024, 2, 100054. https://doi.org/10.1016/j.pscia.2024.100054.
  26. Chuong, P.-H.; Nguyen, L.A.; He, H. Chiral Drugs. An Overview. Int. J. Biomed. Sci. 2006, 2, 85–100. [Google Scholar] [CrossRef]
  27. Yendapally, R.; Lee, R.E. Design, Synthesis, and Evaluation of Novel Ethambutol Analogues. Bioorganic Med. Chem. Lett. 2008, 18, 1607–1611. [Google Scholar] [CrossRef]
  28. Smith, S.W. Chiral Toxicology: It’s the Same Thing…Only Different. Toxicol. Sci. 2009, 110, 4–30. [Google Scholar] [CrossRef]
  29. Chandrasekhar, V.; Rajan, K.; Kanakam, S.R.S.; Sharma, N.; Weißenborn, V.; Schaub, J.; Steinbeck, C. COCONUT 2.0: A Comprehensive Overhaul and Curation of the Collection of Open Natural Products Database. Nucleic Acids Res. 2025, 53, D634–D643. [Google Scholar] [CrossRef]
  30. Odugbemi, A.I.; Nyirenda, C.; Christoffels, A.; Egieyeh, S.A. Machine Learning Prediction of Intestinal α-Glucosidase Inhibitors Using a Diverse Set of Ligands: A Drug Repurposing Effort with drugBank Database Screening. Silico Pharmacol. 2025, 13, 95. [Google Scholar] [CrossRef]
  31. He, H.; Chen, G.; Chen, C.Y.-C. NHGNN-DTA: A Node-Adaptive Hybrid Graph Neural Network for Interpretable Drug–Target Binding Affinity Prediction. Bioinformatics 2023, 39, btad355. [Google Scholar] [CrossRef] [PubMed]
  32. Friesner, R.A.; Banks, J.L.; Murphy, R.B.; Halgren, T.A.; Klicic, J.J.; Mainz, D.T.; Repasky, M.P.; Knoll, E.H.; Shelley, M.; Perry, J.K.; et al. Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy. J. Med. Chem. 2004, 47, 1739–1749. [Google Scholar] [CrossRef] [PubMed]
  33. Friesner, R.A.; Murphy, R.B.; Repasky, M.P.; Frye, L.L.; Greenwood, J.R.; Halgren, T.A.; Sanschagrin, P.C.; Mainz, D.T. Extra Precision Glide:  Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein−Ligand Complexes. J. Med. Chem. 2006, 49, 6177–6196. [Google Scholar] [CrossRef]
  34. Fu, L.; Shi, S.; Yi, J.; Wang, N.; He, Y.; Wu, Z.; Peng, J.; Deng, Y.; Wang, W.; Wu, C.; et al. ADMETlab 3.0: An Updated Comprehensive Online ADMET Prediction Platform Enhanced with Broader Coverage, Improved Performance, API Functionality and Decision Support. Nucleic Acids Res. 2024, 52, W422–W431. [Google Scholar] [CrossRef]
  35. Banerjee, P.; Kemmler, E.; Dunkel, M.; Preissner, R. ProTox 3.0: A Webserver for the Prediction of Toxicity of Chemicals. Nucleic Acids Res. 2024, 52, W513–W520. [Google Scholar] [CrossRef]
  36. Abraham, M.J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J.C.; Hess, B.; Lindahl, E. GROMACS: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1–2, 19–25. [Google Scholar] [CrossRef]
  37. Ahmad, A.; Kaleem, M.; Ahmed, Z.; Shafiq, H. Therapeutic Potential of Flavonoids and Their Mechanism of Action against Microbial and Viral Infections—A Review. Food Res. Int. 2015, 77, 221–235. [Google Scholar] [CrossRef]
  38. Ohemeng, K.A.; Schwender, C.F.; Fu, K.P.; Barrett, J.F. DNA Gyrase Inhibitory and Antibacterial Activity of Some Flavones(1). Bioorganic Med. Chem. Lett. 1993, 3, 225–230. [Google Scholar] [CrossRef]
  39. Ikigai, H.; Nakae, T.; Hara, Y.; Shimamura, T. Bactericidal Catechins Damage the Lipid Bilayer. Biochim. Biophys. Acta Biomembr. 1993, 1147, 132–136. [Google Scholar] [CrossRef]
  40. Plaper, A.; Golob, M.; Hafner, I.; Oblak, M.; Šolmajer, T.; Jerala, R. Characterization of Quercetin Binding Site on DNA Gyrase. Biochem. Biophys. Res. Commun. 2003, 306, 530–536. [Google Scholar] [CrossRef] [PubMed]
  41. Chen, L.-W.; Cheng, M.-J.; Peng, C.-F.; Chen, I.-S. Secondary Metabolites and Antimycobacterial Activities from the Roots of Ficus Nervosa. Chem. Biodivers. 2010, 7, 1814–1821. [Google Scholar] [CrossRef]
  42. Lechner, D.; Gibbons, S.; Bucar, F. Modulation of Isoniazid Susceptibility by Flavonoids in Mycobacterium. Phytochem. Lett. 2008, 1, 71–75. [Google Scholar] [CrossRef]
  43. Almatroudi, A. Integrative Machine Learning, Virtual Screening, and Molecular Modeling for BacA-Targeted Anti-Biofilm Drug Discovery Against Staphylococcal Infections. Crystals 2024, 14, 1057. [Google Scholar] [CrossRef]
  44. Zheng, S.; Gu, Y.; Gu, Y.; Zhao, Y.; Li, L.; Wang, M.; Jiang, R.; Yu, X.; Chen, T.; Li, J. Machine Learning–Enabled Virtual Screening Indicates the Anti-Tuberculosis Activity of Aldoxorubicin and Quarfloxin with Verification by Molecular Docking, Molecular Dynamics Simulations, and Biological Evaluations. Brief. Bioinform. 2025, 26, bbae696. [Google Scholar] [CrossRef] [PubMed]
  45. Yabuuchi, H.; Hayashi, K.; Shigemoto, A.; Fujiwara, M.; Nomura, Y.; Nakashima, M.; Ogusu, T.; Mori, M.; Tokumoto, S.; Miyai, K. Virtual Screening of Antimicrobial Plant Extracts by Machine-Learning Classification of Chemical Compounds in Semantic Space. PLoS ONE 2023, 18, e0285716. [Google Scholar] [CrossRef] [PubMed]
  46. Lin, B.; Yan, S.; Zhen, B. A Machine Learning Method for Predicting Molecular Antimicrobial Activity. Sci. Rep. 2025, 15, 6559. [Google Scholar] [CrossRef]
  47. Ibrahim, M.; Detroja, A.; Bhimani, A.; Bhatt, T.C.; Koradiya, J.; Sanghvi, G.; Bishoyi, A.K. In Silico Discovery of Potential Novel Anti-Tuberculosis Drug Candidates from Phytoconstituents of Chlorophytum borivilianum and Asparagus racemosus. Heliyon 2025, 11, e42859. [Google Scholar] [CrossRef]
  48. Khan, F.; Li, D.; Ahmad, I.; Abbasi, S.W.; Nishan, U.; Sheheryar, S.; Moura, A.A.; Ullah, R.; Ibrahim, M.A.; Shah, M.; et al. Exploring the Genomic Potential of Kytococcus schroeteri for Antibacterial Metabolites against Multi-Drug Resistant Mycobacterium tuberculosis. J. Infect. Public Health 2025, 18, 102598. [Google Scholar] [CrossRef]
  49. Khan, S.A.; Rather, M.A.; Jia, Z.; Qadir, S.M.; Khan, M.U.; Ejaz, H.; Alruwaili, M.; Baughn, A.D.; Thomas Shier, W.; Ahmad, M.S. Discovery of Antitubercular Potential of Trans-3-Indoleacrylic Acid and Its Derivatives Targeting Mycobacterium tuberculosis: A Combined in Vitro and in Silico Investigation. Bioorganic Chem. 2025, 163, 108668. [Google Scholar] [CrossRef] [PubMed]
  50. Elsaman, T.; Mohamed, M.A.; Mohamed, M.S.; Eltayib, E.M.; Abdalla, A.E. Microbial-Based Natural Products as Potential Inhibitors Targeting DNA Gyrase B of Mycobacterium tuberculosis: An In Silico Study. Front. Chem. 2025, 13, 1524607. [Google Scholar] [CrossRef]
  51. Nyijime, T.A.; Shallangwa, G.A.; Uzairu, A.; Umar, A.B.; Ibrahim, M.T.; Kavalapure, R.S.; Ramu, R. In Silico Exploration of 6-Sulfonyl-8-Nitrobenzothiazinone Derivatives as Mycobacterium tuberculosis GyrB Inhibitors: Molecular Docking, Md Simulation, DFT, and Pharmacokinetic Studies. Silico Res. Biomed. 2025, 1, 100018. [Google Scholar] [CrossRef]
  52. Pitaloka, D.A.E.; Arfan, A.; Khairunnisa, S.F.; Megantara, S. In Silico Identification of a Phosphate Marine Steroid from Indonesian Marine Compounds as a Potential Inhibitor of Phosphatidylinositol Mannosyltransferase (PimA) in Mycobacterium tuberculosis. Comput. Biol. Med. 2025, 186, 109677. [Google Scholar] [CrossRef]
  53. Singh, S.; Verma, P.; Gaur, M.; Bhati, L.; Madan, R.; Sharma, P.P.; Rawat, A.; Rathi, B.; Singh, M. In-Silico Development of a Novel TLR2-Mediating Multi-Epitope Vaccine against Mycobacterium tuberculosis. Silico Pharmacol. 2025, 13, 34. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Machine learning-based QSAR analysis of the COCONUT natural product dataset. (A) Donut chart summarizing the curated dataset after preprocessing of the COCONUT database: valid, unique entries; inactive or duplicate compounds; and invalid structures. (B) Distribution of predicted pIC50 values for the curated compounds, showing their classification into high-, moderate-, and low-potency groups and the cutoff used for compound prioritization.
Figure 2. Chemical diversity analysis of screened compounds based on fingerprint similarity. Fingerprint-based pairwise Tanimoto similarity heatmap generated using Morgan fingerprints illustrates the structural diversity of the screened compounds. The predominance of low similarity values (blue regions) indicates a high degree of chemical diversity within the dataset, suggesting broad structural variation among the selected molecules. The diagonal line (red) corresponds to self-comparisons of individual compounds, resulting in a Tanimoto similarity score of 1.0.
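For readers unfamiliar with the metric in Figure 2, the following minimal sketch shows how a pairwise Tanimoto similarity matrix can be computed from binary fingerprints. It is an illustration only, not the authors' code: the fingerprints are hypothetical toy sets of on-bit indices (Morgan fingerprints would in practice be generated with a cheminformatics toolkit), and the function names `tanimoto` and `similarity_matrix` are assumptions introduced here.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty fingerprints count as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_matrix(fingerprints):
    """Symmetric pairwise similarity matrix; the diagonal (self-comparison) is 1.0."""
    n = len(fingerprints)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = tanimoto(fingerprints[i], fingerprints[j])
    return matrix


# Hypothetical toy fingerprints, not taken from the screened dataset.
fps = [frozenset({1, 5, 9, 12}), frozenset({1, 5, 20}), frozenset({30, 31})]
matrix = similarity_matrix(fps)
for row in matrix:
    print(["%.2f" % value for value in row])
```

Off-diagonal values near 0 correspond to the blue regions of the heatmap, while the diagonal is 1.0 by construction.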
Figure 3. Predicted binding of top-ranked ligands in seven M. tuberculosis protein targets. 2D interaction views of ligand–protein complexes are shown for (A) myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside–2KGW, (B) compound_4–3G1M, (C) compound_21–3LOG, (D) compound_60–5JZX, (E) myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside–5LD8, (F) myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside–5W07, and (G) myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside–8J57.
Figure 4. Representative chemical structures of selected compounds from the screened dataset illustrating structural diversity. (a) compound_4–3-methoxy-N-[2-(4-methoxyphenyl)-4-oxo-4H-chromen-6-yl]benzamide; (b) compound_5–2-methylpyrazin-1-ium-1-olate; (c) compound_7–itaconic anhydride; (d) compound_8–2-(4-oxo-3-phenylspiro[6H-benzo[h]quinazoline-5,1′-cyclohexane]-2-yl)sulfanylacetamide; (e) compound_15–(3R,5R)-5-amino-4,5-dihydro-3H-1,2,4-triazole-3-carboxylic acid; (f) compound_16–(3S)-3-amino-5-hydroxy-3,4-dihydropyrrol-2-one; (g) compound_21–5-hydroxy-3-methyl-4H-imidazol-2-one; (h) compound_22–(5R)-5-methylimidazolidine-2,4-dione; (i) compound_26–2,5-dimethylpyrazine 1-oxide; (j) compound_44–pyrazolo[5,1-b][1,3]oxazin-5-imine; (k) compound_59–(5R)-5-methylcyclohex-2-ene-1,4-dione; (l) compound_60–pyrrolo[3,4-c]oxazin-3-amine; (m) compound_72–4-pentenal; (n) compound_76–2-acetylthiophene; (o) compound_56–moracetin; (p) compound_19–myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside; (q) compound_50–quercetin 3,4′-diglucoside.
Figure 5. ADMET and toxicity profiles of selected compounds. (A) Property-wise clustered heatmap of predicted ADMET parameters for the selected compounds and the myricetin derivative, showing relative trends in absorption, distribution, metabolism, excretion, and safety-related properties. (B) Property-wise clustered heatmap of predicted toxicological endpoints, illustrating comparative toxicity risks across the compounds based on standardized scores. Yellow to orange colors indicate higher predicted toxicity, while purple to black colors indicate lower predicted toxicity.
Figure 6. Predicted binding modes of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside in seven M. tuberculosis targets visualized using Ligand Designer. (A) 2KGW; (B) 3G1M; (C) 3LOG; (D) 5JZX; (E) 5LD8; (F) 5W07 and (G) 8J57. The ligand is shown in stick representation, and interacting protein residues are labeled. Red-shaded residues indicate hydrogen-bond acceptors, blue-shaded residues indicate hydrogen-bond donors, and green-shaded residues represent hydrophobic contacts. Positively and negatively charged (ionic) contacts are highlighted in blue and orange, respectively, while aromatic ring interactions are marked where present.
Figure 7. Molecular dynamics simulation analysis of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside bound to M. tuberculosis protein targets. (A) Ligand RMSD profiles over the 100 ns simulation, illustrating the structural stability of the ligand in complex with each of the seven M. tuberculosis protein targets. (B) Residue-wise RMSF profiles of the corresponding complexes, highlighting flexibility patterns of protein residues during the simulation.
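The two quantities plotted in Figure 7 can be illustrated with a short sketch. It assumes that every frame has already been superimposed on the reference structure, so no fitting step is performed, and it uses a hypothetical two-atom, two-frame toy trajectory in arbitrary length units. The function names are introduced here for illustration and do not reflect the analysis tools used by the authors.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between two equally sized sets of (x, y, z) coordinates."""
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about each atom's time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        mean_sq_dev = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(mean_sq_dev))
    return fluctuations


# Hypothetical toy trajectory: atom 0 is static, atom 1 moves along x.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsd(trajectory[1], trajectory[0]))  # deviation of frame 1 from frame 0
print(rmsf(trajectory))                    # one fluctuation value per atom
```

RMSD yields one value per frame (a time series, as in panel A), whereas RMSF yields one value per atom or residue (a profile along the sequence, as in panel B).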
Figure 8. Free energy landscape (FEL) analysis of myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside bound to selected M. tuberculosis protein targets. (A–G) Two-dimensional FEL plots generated from molecular dynamics simulation trajectories using ligand RMSD and protein backbone RMSD as reaction coordinates. (A) 2KGW; (B) 3G1M; (C) 3LOG; (D) 5JZX; (E) 5LD8; (F) 5W07; and (G) 8J57. The color gradients represent relative free energy (ΔG), where dark purple regions indicate low-energy, thermodynamically favorable conformational basins associated with stable ligand binding, while yellow/orange regions correspond to higher-energy, less favorable or transient conformational states. The distribution and depth of the energy minima provide insight into the conformational stability and dynamic behavior of the protein–ligand complexes throughout the simulation.
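A free energy landscape of this kind is commonly obtained by Boltzmann inversion of a two-dimensional histogram of the reaction coordinates, ΔG = −RT ln(P/P_max). The sketch below illustrates that relation under stated assumptions: the caption does not specify the tool, bin width, or temperature, so the 300 K default, the 0.1 bin width, the function name, and the toy RMSD values are all hypothetical.

```python
import math
from collections import Counter

GAS_CONSTANT_KJ = 0.008314462618  # molar gas constant R in kJ/(mol K)


def free_energy_landscape(x, y, bin_width, temperature=300.0):
    """Map each visited 2D bin to a relative free energy in kJ/mol.

    The most populated bin is the zero of the scale. Bins that were never
    visited are simply absent from the result (formally infinite free energy).
    """
    counts = Counter(
        (math.floor(a / bin_width), math.floor(b / bin_width)) for a, b in zip(x, y)
    )
    max_count = max(counts.values())
    rt = GAS_CONSTANT_KJ * temperature
    return {bin_index: -rt * math.log(c / max_count) for bin_index, c in counts.items()}


# Hypothetical per-frame values: ligand RMSD and protein backbone RMSD.
ligand_rmsd = [0.12, 0.14, 0.15, 0.16, 0.35]
backbone_rmsd = [0.22, 0.24, 0.25, 0.26, 0.45]
fel = free_energy_landscape(ligand_rmsd, backbone_rmsd, bin_width=0.1)
for bin_index, delta_g in sorted(fel.items()):
    print(bin_index, round(delta_g, 3))
```

In this toy example four of the five frames fall into one bin, which therefore forms the basin at ΔG = 0, while the single outlying frame sits in a higher-energy bin. A deep, narrow basin in Figure 8 reflects the same situation: most frames share similar ligand and backbone RMSD values.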
Figure 9. Three-dimensional free energy landscapes derived from principal component analysis of molecular dynamics trajectories. (A–G) 3D free energy surfaces plotted along the first two principal components (PC1 and PC2) for myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside bound to seven M. tuberculosis protein targets. (A) 2KGW; (B) 3G1M; (C) 3LOG; (D) 5JZX; (E) 5LD8; (F) 5W07 and (G) 8J57. The color scale represents relative free energy (ΔG), with low-energy basins indicating stable conformational states sampled during the simulations.
Table 1. Docking results of top-ranked natural products against M. tuberculosis proteins.
| Ligand Name | Protein ID | Docking Score (kcal/mol) | Glide Energy (kcal/mol) | MM-GBSA ΔG (kcal/mol) | Prime Energy (kcal/mol) |
|---|---|---|---|---|---|
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 2KGW | −5.445 | −48.737 | −30.92 | −4999.5 |
| compound_44 | 2KGW | −5.303 | −20.476 | −21.29 | −4864.8 |
| compound_59 | 2KGW | −5.172 | −16.736 | −19.10 | −4888.6 |
| compound_4 | 3G1M | −8.353 | −32.181 | −104.85 | −8376.2 |
| compound_26 | 3G1M | −7.534 | −30.77 | −31.79 | −8208.7 |
| compound_76 | 3G1M | −7.515 | −26.331 | −53.17 | −8180.1 |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 3G1M | −7.514 | −48.737 | −49.68 | −8334.9 |
| compound_21 | 3LOG | −7.062 | −23.815 | −58.65 | −18,763.17 |
| compound_22 | 3LOG | −6.437 | −25.123 | −9.21 | −18,714.51 |
| compound_7 | 3LOG | −6.406 | −19.197 | −27.91 | −18,706.29 |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 3LOG | −6.225 | −48.737 | −31.25 | −18,659.81 |
| compound_60 | 5JZX | −5.82 | −22.49 | −19.24 | −12,170.14 |
| compound_44 | 5JZX | −5.55 | −21.02 | −19.03 | −12,160.16 |
| compound_72 | 5JZX | −5.38 | −19.57 | −17.66 | −12,214.68 |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 5JZX | −5.1 | −17.22 | −19.72 | −12,295.42 |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 5LD8 | −8.18 | −88.56 | −81.92 | −16,842.4 |
| compound_8 | 5LD8 | −7.522 | −55.938 | −33.64 | −16,815.8 |
| Moracetin | 5LD8 | −6.812 | −78.124 | −46.30 | −16,816.9 |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 5W07 | −9.312 | −74.094 | −79.39 | −10,710.53 |
| Quercetin 3,4′-diglucoside | 5W07 | −9.154 | −66.274 | −67.53 | −10,699.35 |
| compound_5 | 5W07 | −8.507 | −30.636 | −21.63 | −10,531.34 |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | 8J57 | −6.706 | −71.327 | −51.17 | −8799.18 |
| compound_15 | 8J57 | −6.46 | −24.852 | −14.64 | −8677.73 |
| compound_16 | 8J57 | −5.956 | −27.06 | −26.47 | −8698.9 |
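As a reading aid for Table 1, the sketch below ranks the ligands within a target by two of the reported criteria. The numbers are copied from the 3G1M and 5JZX rows of the table; the data layout, the shortened label `myricetin_glycoside`, and the helper `best_by` are hypothetical and not part of the authors' pipeline.

```python
# (ligand, protein ID, docking score, MM-GBSA dG), energies in kcal/mol, copied from Table 1.
# "myricetin_glycoside" abbreviates the myricetin 3-rhamnosyl-glucosyl-glucoside ligand.
ROWS = [
    ("compound_4", "3G1M", -8.353, -104.85),
    ("compound_26", "3G1M", -7.534, -31.79),
    ("compound_76", "3G1M", -7.515, -53.17),
    ("myricetin_glycoside", "3G1M", -7.514, -49.68),
    ("compound_60", "5JZX", -5.82, -19.24),
    ("compound_44", "5JZX", -5.55, -19.03),
    ("compound_72", "5JZX", -5.38, -17.66),
    ("myricetin_glycoside", "5JZX", -5.1, -19.72),
]

DOCKING_SCORE, MMGBSA_DG = 2, 3  # column positions within each row


def best_by(rows, protein, column):
    """Return the ligand with the most negative value in `column` for one target."""
    candidates = [row for row in rows if row[1] == protein]
    return min(candidates, key=lambda row: row[column])[0]


for protein in ("3G1M", "5JZX"):
    print(
        protein,
        "| best docking score:", best_by(ROWS, protein, DOCKING_SCORE),
        "| best MM-GBSA dG:", best_by(ROWS, protein, MMGBSA_DG),
    )
```

For 3G1M both criteria select compound_4. For 5JZX they disagree: compound_60 has the best docking score, whereas the myricetin glycoside has the most negative MM-GBSA ΔG, so it is worth stating which criterion a given ranking uses.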
Table 2. Comparative ADMET summary of selected natural products.
| ADMET Property | Compound_5 | Compound_21 | Compound_26 | Compound_44 | Compound_60 | Myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside |
|---|---|---|---|---|---|---|
| Human Intestinal Absorption (HIA) | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |
| BBB Permeability | Inactive | Active | Inactive | Inactive | Active | Inactive |
| CYP1A2 Inhibitor | Inactive | Inactive | Inactive | Active | Active | Inactive |
| CYP3A4 Inhibitor | Active | Inactive | Inactive | Inactive | Active | Inactive |
| P-gp Substrate | Inactive | Inactive | Inactive | Inactive | Inactive | Active |
| Half-life (T1/2) | Active | Active | Active | Active | Active | Active |
| Ames Toxicity | Active | Active | Inactive | Active | Active | Active |
| hERG Inhibition | Active | Inactive | Inactive | Inactive | Inactive | Inactive |
| Oral Bioavailability | Inactive | Active | Active | Active | Active | Inactive |
| Respiratory Toxicity | Active | Active | Active | Active | Active | Inactive |
Active—predicted toxic effect present; Inactive—absence of toxicity.
Table 3. Summary of ProTox 3.0 toxicity predictions of selected natural products.
| Compound | Nephro-Toxicity | Cardio-Toxicity | Immuno-Toxicity | Mutagenicity | Nutritional Toxicity | AhR Active | TTR Active | CYP2C9 Active |
|---|---|---|---|---|---|---|---|---|
| Compound_5 | Active | Inactive | Inactive | Active | Inactive | Active | Inactive | Active |
| Compound_21 | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |
| Compound_26 | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |
| Compound_44 | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |
| Compound_60 | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |
| myricetin 3-rhamnosyl-(1→3)-glucosyl-(1→6)-glucoside | Inactive | Inactive | Active | Inactive | Inactive | Inactive | Active | Inactive |
Active—predicted toxic effect present; Inactive—absence of toxicity.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
