Pathogenicity Prediction of Gene Fusion in Structural Variations: A Knowledge Graph-Infused Explainable Artificial Intelligence (XAI) Framework
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Knowledge Graph
2.2. Explainable AI (XAI)
2.3. Our Explainable Learning Model
2.3.1. Correct Answer Set: Dataset 1
- The target text describes approved drugs in Japan for $DISEASENAME.
- The target text describes FDA-approved drugs for $DISEASENAME.
- The target text is referenced by guidelines about $DISEASENAME.
- The target text describes highly statistically reliable clinical trials/meta-analyses and consensus among experts on $DISEASENAME.
- The target text describes FDA-approved drugs for other cancer types.
- The target text describes highly statistically reliable clinical trials/meta-analyses and consensus among experts regarding other cancer types.
- The target text describes small-scale clinical trials that have shown usefulness regardless of cancer type.
- The target text describes the usefulness shown in case reports regardless of cancer type.
- The usefulness of the target text has been reported in preclinical studies (in vitro and in vivo).
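The criteria above are written as templates with a `$DISEASENAME` placeholder. A minimal sketch of how such a placeholder could be filled is shown below, using Python's standard `string.Template`. The two criteria are copied from the list above; the function name `fill_criteria` and the example disease name are illustrative assumptions, and the way the placeholder was actually substituted is not stated in this section.

```python
from string import Template

# Two criteria copied verbatim from the list above; $DISEASENAME is the placeholder.
CRITERIA = [
    "The target text describes FDA-approved drugs for $DISEASENAME.",
    "The target text is referenced by guidelines about $DISEASENAME.",
]


def fill_criteria(disease_name: str) -> list[str]:
    """Substitute a concrete disease name into each criterion template.

    Hypothetical helper for illustration only.
    """
    return [Template(c).substitute(DISEASENAME=disease_name) for c in CRITERIA]


# The disease name here is an arbitrary example value.
filled = fill_criteria("lung adenocarcinoma")
for line in filled:
    print(line)
```

`Template.substitute` raises `KeyError` if a placeholder is left unfilled, which makes a missing disease name fail loudly rather than pass through silently.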
2.3.2. Features
- The amount of literature regarding the same fusion genes in Mitelman;
- The number of entries about the same fusion genes in COSMIC;
- The sequence of domains registered with Pfam on each gene;
- The lengths of both UTRs;
- TargetScan-registered miRNAs that affect the UTRs;
- Families of miRNAs;
- Whether each domain is within the breakpoint.
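The per-gene features listed above could be held in a simple record, sketched below. All class, field, and function names are hypothetical, and the gene and domain values are placeholders rather than real annotations. The check for "whether each domain is within the breakpoint" is read here as "the breakpoint coordinate falls inside the domain interval", which is only one possible interpretation of the feature.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str   # domain identifier (placeholder values only)
    start: int  # first position of the domain
    end: int    # last position of the domain, inclusive


@dataclass
class GeneFeatures:
    symbol: str
    domains: list[Domain] = field(default_factory=list)  # ordered domain sequence
    utr5_length: int = 0
    utr3_length: int = 0
    mirna_families: list[str] = field(default_factory=list)


def domains_spanning_breakpoint(gene: GeneFeatures, breakpoint_pos: int) -> dict[str, bool]:
    """Flag each domain whose interval contains the breakpoint position.

    Assumed reading of the feature; the source does not define it here.
    """
    return {d.name: d.start <= breakpoint_pos <= d.end for d in gene.domains}


# Placeholder gene with made-up coordinates.
gene = GeneFeatures(
    symbol="GENE_A",
    domains=[Domain("domain_x", 10, 120), Domain("domain_y", 200, 350)],
    utr5_length=150,
    utr3_length=900,
    mirna_families=["family_1"],
)
flags = domains_spanning_breakpoint(gene, breakpoint_pos=250)
print(flags)
```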
2.4. Benchmark
2.4.1. Existing Methods
2.4.2. Learning Set: Dataset 2 and Dataset 3
2.4.3. Method of Comparing the Performance of the Models
- Cross-Validation
- Holdout Validation
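The difference between the two validation schemes named above can be sketched in plain Python. This is a generic illustration, not the authors' procedure: the number of folds, the test fraction, and the random seed are assumptions, since this section does not state them.

```python
import random


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle once and set aside a single fixed test portion."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def k_fold_splits(items, k=5, seed=0):
    """Yield k (train, test) pairs so that every item is tested exactly once."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


samples = list(range(10))  # stand-in for labeled fusion records
train, test = holdout_split(samples)
splits = list(k_fold_splits(samples, k=5))
print(len(train), len(test), len(splits))
```

Holdout evaluates on one untouched subset, while cross-validation rotates the test role across all folds and averages the scores.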
2.4.4. Our Benchmark Learning Model
- Sequences of domains registered with Pfam for each gene;
- Lengths of both UTRs;
- TargetScan-registered miRNAs that affect the UTRs;
- Families of miRNAs;
- Whether each domain was within the breakpoint.
3. Results
3.1. Evaluation of Prediction Accuracy
3.1.1. Evaluation Using Cross-Validation
3.1.2. Evaluation Using Holdout Validation
3.2. Evaluation of the Explanation of Prediction
3.2.1. Case 1: KIF5B::RET Fusion
3.2.2. Case 2: BCR::JAK2 Fusion
3.2.3. Case 3: KIAA1549::BRAF Fusion
3.2.4. Case 4: IKZF1::LRBA Fusion
3.2.5. Case 5: APOE::ALB Fusion
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Murakami, K.; Tago, S.-i.; Takishita, S.; Morikawa, H.; Kojima, R.; Yokoyama, K.; Ogawa, M.; Fukushima, H.; Takamori, H.; Nannya, Y.; et al. Pathogenicity Prediction of Gene Fusion in Structural Variations: A Knowledge Graph-Infused Explainable Artificial Intelligence (XAI) Framework. Cancers 2024, 16, 1915. https://doi.org/10.3390/cancers16101915