Mutational Signatures and Machine Learning for Risk Stratification of Acute Myeloid Leukaemia Based on Targeted Sequencing Data
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
Molecular Annotation
3. Results
3.1. TS of the AML-NCRI Patients Overlapped with the TS of the AML-SG and the WGS/WES of TCGA
3.2. Risk Stratification of AML Patients Using MS Analysis of WGS/WES and TS Data
3.3. The Changes in the Risk Assessment from ELN 2017 to ELN 2022 of AML-NCRI Were Recapitulated by the NNMF Analysis
3.4. RSs Can Differentiate AML Patients According to Their Response to Induction CTX
3.5. The ML Algorithm RF Uses Encoded Genomic Information to Make Patient-Level Therapy Predictions
3.6. Germline Mutations and Balanced Data Increased the Prediction Efficiency of RF
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Elhaddad, H.; Chiriches, C.; Nandi, S.P.; van Eijk, P.; Gilkes, A.; Watts, K.; Houseman, A.; Wilhelm-Benartzi, C.S.; Ottmann, O.G.; Reed, S.H.; Ruthardt, M. Mutational Signatures and Machine Learning for Risk Stratification of Acute Myeloid Leukaemia Based on Targeted Sequencing Data. Cancers 2026, 18, 1925. https://doi.org/10.3390/cancers18121925

