Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration
Abstract
1. Introduction
2. Materials and Methods
2.1. Sparse Data Generation
2.2. Baseline/Original Model and Additional Implementation
2.3. Accuracy Calculation
2.4. Data Augmentation
2.5. Data Integration
2.6. Feature Importance Assessment
3. Results
3.1. Assessment of State-of-the-Art Deep Learning Model on Sparse Somatic Mutation Data
3.2. Data Augmentation
3.3. Data Integration
3.4. Assessment of Top 1, 2 and 3 Ranked Predictions
3.5. Feature Importance Assessment
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Danyi, A.; Jager, M.; de Ridder, J. Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration. Life 2022, 12, 1. https://doi.org/10.3390/life12010001