TransQSAR-pf: A Bio-Informed QSAR Framework Using Plasmodium falciparum Stress Signatures for Enhanced Antiplasmodial Activity Prediction
Abstract
1. Introduction
2. Materials and Methods
2.1. Transcriptomic Data Acquisition and Processing
2.2. Differential Expression Analysis
2.3. Gene Set Enrichment Analysis (GSEA) and Gene Set Curation
2.4. Transcriptomic Feature Engineering
2.5. QSAR Dataset and Molecular Descriptors
2.6. Feature Integration Strategy
2.7. Boruta Feature Selection
2.8. Machine Learning Modeling
2.9. Feature Importance Aggregation and Reporting
3. Results
3.1. Transcriptomic Landscape of Chloroquine Stress
3.2. Boruta Feature Selection Identifies 13 Critical Predictors
3.3. TransQSAR-pf Framework Generates an Interpretable Bio-Informed Model
3.4. Biological Feature Importance Distribution
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations



| Rank | Feature | Importance | Category | Context |
|---|---|---|---|---|
| 1 | CQ_WT_DE_40 | 9.43 | CUF | CQ response in wild-type |
| 2 | CQ_WT_DE_128 | 5.31 | DR | CQ response (logFC = −0.304) |
| 3 | Variability_Pf.12.198.0 | 4.61 | CUF | Strain variability |
| 4 | CQ_WT_DE_169 | 4.54 | CUF | CQ response |
| 5 | Genotype_76I_DE_88 | 4.17 | U | Genotype difference (logFC = 2.179) |
| 6 | Variability_Pf.13_1.443.0 | 4.11 | CUF | Strain variability |
| 7 | CQ_WT_DE_79 | 3.29 | CUF | CQ response |
| 8 | Variability_Pf.2.13.0 | 3.23 | CUF | Strain variability |
| 9 | Genotype_76I_DE_92 | 2.33 | U | Genotype difference (logFC = 0.748) |
| 10 | Genotype_76I_DE_184 | 1.97 | CUF | Genotype difference |
| 11 | Genotype_76I_DE_98 | 1.97 | U | Genotype difference (logFC = 1.096) |
| 12 | Variability_Pf.5.281.0 | 1.83 | CUF | Strain variability |
| 13 | CQ_WT_DE_135 | 1.00 | CUF | CQ response |
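
The per-feature importances in the table above can be rolled up by category code or by feature family. The following is a minimal sketch, not the authors' pipeline: the values are transcribed from the table, the category codes (CUF, DR, U) are kept exactly as printed, and the helper names `by_category`, `by_family`, and `aggregate` are hypothetical. Grouping by feature-name prefix is an assumption about the naming scheme.

```python
from collections import defaultdict

# (feature, importance, category) transcribed from the table above.
ROWS = [
    ("CQ_WT_DE_40", 9.43, "CUF"),
    ("CQ_WT_DE_128", 5.31, "DR"),
    ("Variability_Pf.12.198.0", 4.61, "CUF"),
    ("CQ_WT_DE_169", 4.54, "CUF"),
    ("Genotype_76I_DE_88", 4.17, "U"),
    ("Variability_Pf.13_1.443.0", 4.11, "CUF"),
    ("CQ_WT_DE_79", 3.29, "CUF"),
    ("Variability_Pf.2.13.0", 3.23, "CUF"),
    ("Genotype_76I_DE_92", 2.33, "U"),
    ("Genotype_76I_DE_184", 1.97, "CUF"),
    ("Genotype_76I_DE_98", 1.97, "U"),
    ("Variability_Pf.5.281.0", 1.83, "CUF"),
    ("CQ_WT_DE_135", 1.00, "CUF"),
]

# Assumed naming scheme: every feature name starts with one of these prefixes.
FAMILIES = ("CQ_WT_DE", "Genotype_76I_DE", "Variability")


def by_category(feature, category):
    """Group key: the category code as printed in the table."""
    return category


def by_family(feature, category):
    """Group key: the feature-name prefix (hypothetical family label)."""
    for prefix in FAMILIES:
        if feature.startswith(prefix):
            return prefix
    raise ValueError(f"unrecognized feature name: {feature}")


def aggregate(rows, key):
    """Sum importance per group; return {group: (total, share of listed total)}."""
    totals = defaultdict(float)
    for feature, importance, category in rows:
        totals[key(feature, category)] += importance
    grand_total = sum(totals.values())
    return {g: (round(t, 2), round(t / grand_total, 3)) for g, t in totals.items()}


if __name__ == "__main__":
    for label, key in (("category", by_category), ("family", by_family)):
        print(label)
        ranked = sorted(aggregate(ROWS, key).items(), key=lambda kv: -kv[1][0])
        for group, (total, share) in ranked:
            print(f"  {group:<16} {total:6.2f}  {share:.1%}")
```

The shares are relative to the 13 listed features only, so they describe how importance is distributed within this table rather than across the full model.
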

| Model Stage | R² Train | R² Test | RMSE | Features | Notes |
|---|---|---|---|---|---|
| QSAR-only RF | 0.812 | 0.719 | 0.529 | 15 | Baseline (modest fit, good generalization) |
| RF Tuned | 0.974 | 0.602 | 0.653 | 779 | Naive Integration (poor generalization) |
| RF Boruta-Selected | 0.962 | 0.762 | 0.470 | 28 | TransQSAR-pf (best model) |
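
The "Notes" column above describes generalization qualitatively; the arithmetic behind it can be made explicit. This is a small sketch using only the values printed in the table. The function names `generalization_gap` and `summarize` are hypothetical, and the gap is defined here simply as train R² minus test R², which is an illustrative choice rather than a metric reported by the authors.

```python
# (model stage, R2 train, R2 test, RMSE, number of features) transcribed from the table above.
MODELS = [
    ("QSAR-only RF", 0.812, 0.719, 0.529, 15),
    ("RF Tuned", 0.974, 0.602, 0.653, 779),
    ("RF Boruta-Selected", 0.962, 0.762, 0.470, 28),
]


def generalization_gap(r2_train, r2_test):
    """Train-minus-test R2; larger values suggest more overfitting."""
    return round(r2_train - r2_test, 3)


def summarize(models, baseline="QSAR-only RF"):
    """Return the gap and relative RMSE change versus the baseline for each model."""
    base_rmse = next(m[3] for m in models if m[0] == baseline)
    summary = {}
    for name, r2_train, r2_test, rmse, n_features in models:
        summary[name] = {
            "gap": generalization_gap(r2_train, r2_test),
            "rmse_change_vs_baseline": round((rmse - base_rmse) / base_rmse, 3),
            "n_features": n_features,
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(MODELS).items():
        print(f"{name:<20} gap={stats['gap']:.3f} "
              f"rmse_change={stats['rmse_change_vs_baseline']:+.1%} "
              f"features={stats['n_features']}")
```

On these numbers the Boruta-selected model has the highest test R² and the lowest RMSE, while its train-test gap (0.200) is larger than the baseline's (0.093) and smaller than that of the naive integration (0.372).
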
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Igwezeke, F.O.; Nnadi, C.O. TransQSAR-pf: A Bio-Informed QSAR Framework Using Plasmodium falciparum Stress Signatures for Enhanced Antiplasmodial Activity Prediction. Eng. Proc. 2026, 124, 37. https://doi.org/10.3390/engproc2026124037

