Deep Learning for Elucidating Modifications to RNA—Status and Challenges Ahead
Abstract
1. Introduction
2. Deep Learning for RNA Modifications
2.1. Features and Model Architecture
2.2. Incorporating RNA Secondary Structure
2.3. Perspective on Current Models
2.4. Model Performance and Choice of Background Set: The Hunt for Biologically Relevant Results
2.5. Further Considerations for Modelling Approach
3. Some Major Future Perspectives
3.1. Generalisability across Cell Types and Species
3.2. Focus on Model Interpretation
3.3. Extensions to Predictions on Non-Coding RNAs
3.4. Cooperative Contexts and Interplay with Other Modifications
4. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Delaunay, S.; Helm, M.; Frye, M. RNA modifications in physiology and disease: Towards clinical applications. Nat. Rev. Genet. 2024, 25, 104–122.
2. Barbieri, I.; Kouzarides, T. Role of RNA modifications in cancer. Nat. Rev. Cancer 2020, 20, 303–322.
3. Gerstberger, S.; Hafner, M.; Tuschl, T. A census of human RNA-binding proteins. Nat. Rev. Genet. 2014, 15, 829–845.
4. Hentze, M.W.; Castello, A.; Schwarzl, T.; Preiss, T. A brave new world of RNA-binding proteins. Nat. Rev. Mol. Cell Biol. 2018, 19, 327–341.
5. Dominguez, D.; Freese, P.; Alexis, M.S.; Su, A.; Hochman, M.; Palden, T.; Bazile, C.; Lambert, N.J.; Van Nostrand, E.L.; Pratt, G.A.; et al. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell 2018, 70, 854–867.
6. Ke, S.; Alemu, E.A.; Mertens, C.; Gantman, E.C.; Fak, J.J.; Mele, A.; Haripal, B.; Zucker-Scharff, I.; Moore, M.J.; Park, C.Y.; et al. A majority of m6A residues are in the last exons, allowing the potential for 3′ UTR regulation. Genes Dev. 2015, 29, 2037–2053.
7. Patil, D.P.; Pickering, B.F.; Jaffrey, S.R. Reading m6A in the transcriptome: m6A-binding proteins. Trends Cell Biol. 2018, 28, 113–127.
8. Eisenberg, E.; Levanon, E.Y. A-to-I RNA editing—immune protector and transcriptome diversifier. Nat. Rev. Genet. 2018, 19, 473–490.
9. Ule, J.; Jensen, K.B.; Ruggiu, M.; Mele, A.; Ule, A.; Darnell, R.B. CLIP identifies Nova-regulated RNA networks in the brain. Science 2003, 302, 1212–1215.
10. Dominissini, D.; Moshitch-Moshkovitz, S.; Schwartz, S.; Salmon-Divon, M.; Ungar, L.; Osenberg, S.; Cesarkas, K.; Jacob-Hirsch, J.; Amariglio, N.; Kupiec, M.; et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 2012, 485, 201–206.
11. Linder, B.; Grozhik, A.V.; Olarerin-George, A.O.; Meydan, C.; Mason, C.E.; Jaffrey, S.R. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat. Methods 2015, 12, 767–772.
12. König, J.; Zarnack, K.; Rot, G.; Curk, T.; Kayikci, M.; Zupan, B.; Turner, D.J.; Luscombe, N.M.; Ule, J. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat. Struct. Mol. Biol. 2010, 17, 909–915.
13. Van Nostrand, E.L.; Pratt, G.A.; Shishkin, A.A.; Gelboin-Burkhart, C.; Fang, M.Y.; Sundararaman, B.; Blue, S.M.; Nguyen, T.B.; Surka, C.; Elkins, K.; et al. Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP). Nat. Methods 2016, 13, 508–514.
14. Van Nostrand, E.L.; Freese, P.; Pratt, G.A.; Wang, X.; Wei, X.; Xiao, R.; Blue, S.M.; Chen, J.Y.; Cody, N.A.; Dominguez, D.; et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 2020, 583, 711–719.
15. Wheeler, E.C.; Van Nostrand, E.L.; Yeo, G.W. Advances and challenges in the detection of transcriptome-wide protein–RNA interactions. Wiley Interdiscip. Rev. RNA 2018, 9, e1436.
16. Rahman, R.; Xu, W.; Jin, H.; Rosbash, M. Identification of RNA-binding protein targets with HyperTRIBE. Nat. Protoc. 2018, 13, 1829–1849.
17. Meyer, K.D. DART-seq: An antibody-free method for global m6A detection. Nat. Methods 2019, 16, 1275–1280.
18. Ray, D.; Kazan, H.; Cook, K.B.; Weirauch, M.T.; Najafabadi, H.S.; Li, X.; Gueroussov, S.; Albu, M.; Zheng, H.; Yang, A.; et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature 2013, 499, 172–177.
19. Lambert, N.; Robertson, A.; Jangi, M.; McGeary, S.; Sharp, P.A.; Burge, C.B. RNA Bind-n-Seq: Quantitative assessment of the sequence and structural binding specificity of RNA binding proteins. Mol. Cell 2014, 54, 887–900.
20. Dai, Q.; Zhang, L.S.; Sun, H.L.; Pajdzik, K.; Yang, L.; Ye, C.; Ju, C.W.; Liu, S.; Wang, Y.; Zheng, Z.; et al. Quantitative sequencing using BID-seq uncovers abundant pseudouridines in mammalian mRNA at base resolution. Nat. Biotechnol. 2023, 41, 344–354.
21. Liu, C.; Sun, H.; Yi, Y.; Shen, W.; Li, K.; Xiao, Y.; Li, F.; Li, Y.; Hou, Y.; Lu, B.; et al. Absolute quantification of single-base m6A methylation in the mammalian transcriptome using GLORI. Nat. Biotechnol. 2023, 41, 355–366.
22. Garalde, D.R.; Snell, E.A.; Jachimowicz, D.; Sipos, B.; Lloyd, J.H.; Bruce, M.; Pantic, N.; Admassu, T.; James, P.; Warland, A.; et al. Highly parallel direct RNA sequencing on an array of nanopores. Nat. Methods 2018, 15, 201–206.
23. Hendra, C.; Pratanwanich, P.N.; Wan, Y.K.; Goh, W.S.; Thiery, A.; Göke, J. Detection of m6A from direct RNA sequencing using a multiple instance learning framework. Nat. Methods 2022, 19, 1590–1598.
24. Mateos, P.A.; Sethi, A.; Ravindran, A.; Guarnacci, M.; Srivastava, A.; Xu, J.; Woodward, K.; Yuen, Z.; Mahmud, S.; Kanchi, M.; et al. Simultaneous identification of m6A and m5C reveals coordinated RNA modification at single-molecule resolution. bioRxiv 2022.
25. Angermueller, C.; Pärnamaa, T.; Parts, L.; Stegle, O. Deep learning for computational biology. Mol. Syst. Biol. 2016, 12, 878.
26. Zou, J.; Huss, M.; Abid, A.; Mohammadi, P.; Torkamani, A.; Telenti, A. A primer on deep learning in genomics. Nat. Genet. 2019, 51, 12–18.
27. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems. 2019. Available online: https://dl.acm.org/doi/10.5555/3454287.3455008 (accessed on 10 May 2024).
28. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
29. Horlacher, M.; Wagner, N.; Moyon, L.; Kuret, K.; Goedert, N.; Salvatore, M.; Ule, J.; Gagneur, J.; Winther, O.; Marsico, A. Towards In-Silico CLIP-seq: Predicting Protein-RNA Interaction via Sequence-to-Signal Learning. Genome Biol. 2023, 24, 180.
30. Xu, Y.; Zhu, J.; Huang, W.; Xu, K.; Yang, R.; Zhang, Q.C.; Sun, L. PrismNet: Predicting protein–RNA interaction using in vivo RNA structural information. Nucleic Acids Res. 2023, 51, W468–W477.
31. Zhang, S.; Zhou, J.; Hu, H.; Gong, H.; Chen, L.; Cheng, C.; Zeng, J. A deep learning framework for modeling structural features of RNA-binding protein targets. Nucleic Acids Res. 2016, 44, e32.
32. Laverty, K.U.; Jolma, A.; Pour, S.E.; Zheng, H.; Ray, D.; Morris, Q.; Hughes, T.R. PRIESSTESS: Interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins. Nucleic Acids Res. 2022, 50, e111.
33. Luo, Z.; Zhang, J.; Fei, J.; Ke, S. Deep learning modeling m6A deposition reveals the importance of downstream cis-element sequences. Nat. Commun. 2022, 13, 2720.
34. Grønning, A.G.B.; Doktor, T.K.; Larsen, S.J.; Petersen, U.S.S.; Holm, L.L.; Bruun, G.H.; Hansen, M.B.; Hartung, A.M.; Baumbach, J.; Andresen, B.S. DeepCLIP: Predicting the effect of mutations on protein–RNA binding with deep learning. Nucleic Acids Res. 2020, 48, 7099–7118.
35. Mukherjee, N.; Wessels, H.H.; Lebedeva, S.; Sajek, M.; Ghanbari, M.; Garzia, A.; Munteanu, A.; Yusuf, D.; Farazi, T.; Hoell, J.I.; et al. Deciphering human ribonucleoprotein regulatory networks. Nucleic Acids Res. 2019, 47, 570–581.
36. Stražar, M.; Žitnik, M.; Zupan, B.; Ule, J.; Curk, T. Orthogonal matrix factorization enables integrative analysis of multiple RNA binding proteins. Bioinformatics 2016, 32, 1527–1535.
37. Zhao, W.; Zhang, S.; Zhu, Y.; Xi, X.; Bao, P.; Ma, Z.; Kapral, T.H.; Chen, S.; Zagrovic, B.; Yang, Y.T.; et al. POSTAR3: An updated platform for exploring post-transcriptional regulation coordinated by RNA-binding proteins. Nucleic Acids Res. 2022, 50, D287–D294.
38. Tang, Y.; Chen, K.; Song, B.; Ma, J.; Wu, X.; Xu, Q.; Wei, Z.; Su, J.; Liu, G.; Rong, R.; et al. m6A-Atlas: A comprehensive knowledgebase for unraveling the N6-methyladenosine (m6A) epitranscriptome. Nucleic Acids Res. 2021, 49, D134–D143.
39. Liang, Z.; Ye, H.; Ma, J.; Wei, Z.; Wang, Y.; Zhang, Y.; Huang, D.; Song, B.; Meng, J.; Rigden, D.J.; et al. m6A-Atlas v2.0: Updated resources for unraveling the N6-methyladenosine (m6A) epitranscriptome among multiple species. Nucleic Acids Res. 2024, 52, D194–D202.
40. Krakau, S.; Richard, H.; Marsico, A. PureCLIP: Capturing target-specific protein–RNA interaction footprints from single-nucleotide CLIP-seq data. Genome Biol. 2017, 18, 240.
41. Uren, P.J.; Bahrami-Samani, E.; Burns, S.C.; Qiao, M.; Karginov, F.V.; Hodges, E.; Hannon, G.J.; Sanford, J.R.; Penalva, L.O.; Smith, A.D. Site identification in high-throughput RNA–protein interaction data. Bioinformatics 2012, 28, 3013–3020.
42. Ji, Y.; Zhou, Z.; Liu, H.; Davuluri, R.V. DNABERT: Pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 2021, 37, 2112–2120.
43. Sun, L.; Xu, K.; Huang, W.; Yang, Y.T.; Li, P.; Tang, L.; Xiong, T.; Zhang, Q.C. Predicting dynamic cellular protein–RNA interactions by deep learning using in vivo RNA structures. Cell Res. 2021, 31, 495–516.
44. Zhu, H.; Yang, Y.; Wang, Y.; Wang, F.; Huang, Y.; Chang, Y.; Wong, K.C.; Li, X. Dynamic characterization and interpretation for protein-RNA interactions across diverse cellular conditions using HDRNet. Nat. Commun. 2023, 14, 6824.
45. Pan, X.; Fang, Y.; Li, X.; Yang, Y.; Shen, H.B. RBPsuite: RNA-protein binding sites prediction suite based on deep learning. BMC Genom. 2020, 21, 884.
46. Yamada, K.; Hamada, M. Prediction of RNA–protein interactions using a nucleotide language model. Bioinform. Adv. 2022, 2, vbac023.
47. Zhang, J.; Liu, B.; Wang, Z.; Lehnert, K.; Gahegan, M. DeepPN: A deep parallel neural network based on convolutional neural network and graph convolutional network for predicting RNA-protein binding sites. BMC Bioinform. 2022, 23, 257.
48. Uhl, M.; Tran, V.D.; Heyl, F.; Backofen, R. RNAProt: An efficient and feature-rich RNA binding protein binding site predictor. GigaScience 2021, 10, giab054.
49. Ghanbari, M.; Ohler, U. Deep neural networks for interpreting RNA-binding protein target preferences. Genome Res. 2020, 30, 214–226.
50. Picardi, E.; D’Erchia, A.M.; Lo Giudice, C.; Pesole, G. REDIportal: A comprehensive database of A-to-I RNA editing events in humans. Nucleic Acids Res. 2017, 45, D750–D757.
51. Wang, J.; Ness, S.; Brown, R.; Yu, H.; Oyebamiji, O.; Jiang, L.; Sheng, Q.; Samuels, D.C.; Zhao, Y.Y.; Tang, J.; et al. EditPredict: Prediction of RNA editable sites with convolutional neural network. Genomics 2021, 113, 3864–3871.
52. Horlacher, M.; Cantini, G.; Hesse, J.; Schinke, P.; Goedert, N.; Londhe, S.; Moyon, L.; Marsico, A. A Systematic Benchmark of Machine Learning Methods for Protein-RNA Interaction Prediction. Brief. Bioinform. 2023, 24, bbad307.
53. Jaganathan, K.; Panagiotopoulou, S.K.; McRae, J.F.; Darbandi, S.F.; Knowles, D.; Li, Y.I.; Kosmicki, J.A.; Arbelaez, J.; Cui, W.; Schwartz, G.B.; et al. Predicting splicing from primary sequence with deep learning. Cell 2019, 176, 535–548.
54. Li, X.; Wang, K.; Lyu, Y.; Pan, H.; Zhang, J.; Stambolian, D.; Susztak, K.; Reilly, M.P.; Hu, G.; Li, M. Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis. Nat. Commun. 2020, 11, 2338.
55. Han, K.; Sheng, V.S.; Song, Y.; Liu, Y.; Qiu, C.; Ma, S.; Liu, Z. Deep semi-supervised learning for medical image segmentation: A review. Expert Syst. Appl. 2024, 245, 123052.
56. Han, H.; Talpur, B.A.; Liu, W.; Wang, L.; Ahmed, B.; Sarhan, N.; Awwad, E.M. RNA-RBP interactions recognition using multi-label learning and feature attention allocation. J. Cloud Comput. 2024, 13, 54.
57. Pan, X.; Rijnbeek, P.; Yan, J.; Shen, H.B. Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks. BMC Genom. 2018, 19, 511.
58. Trabelsi, A.; Chaabane, M.; Ben-Hur, A. Comprehensive evaluation of deep learning architectures for prediction of DNA/RNA sequence binding specificities. Bioinformatics 2019, 35, i269–i277.
59. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
60. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
61. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
62. Wang, X.; Zhang, M.; Long, C.; Yao, L.; Zhu, M. Self-attention based neural network for predicting RNA-protein binding sites. IEEE/ACM Trans. Comput. Biol. Bioinform. 2022, 20, 1469–1479.
63. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
64. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
65. Maticzka, D.; Lange, S.J.; Costa, F.; Backofen, R. GraphProt: Modeling binding preferences of RNA-binding proteins. Genome Biol. 2014, 15, R17.
66. Uhl, M.; Tran, V.; Heyl, F.; Backofen, R. GraphProt2: A novel deep learning-based method for predicting binding sites of RNA-binding proteins. bioRxiv 2019.
67. Zhao, X.; Chang, F.; Lv, H.; Zou, G.; Zhang, B. A Novel Deep Learning Method for Predicting RNA-Protein Binding Sites. Appl. Sci. 2023, 13, 3247.
68. Gruber, A.R.; Lorenz, R.; Bernhart, S.H.; Neuböck, R.; Hofacker, I.L. The Vienna RNA websuite. Nucleic Acids Res. 2008, 36, W70–W74.
69. Steffen, P.; Voß, B.; Rehmsmeier, M.; Reeder, J.; Giegerich, R. RNAshapes: An integrated RNA analysis package based on abstract shapes. Bioinformatics 2006, 22, 500–503.
70. Yan, Z.; Hamilton, W.L.; Blanchette, M. Graph neural representational learning of RNA secondary structures for predicting RNA-protein interactions. Bioinformatics 2020, 36, i276–i284.
71. Spitale, R.C.; Flynn, R.A.; Zhang, Q.C.; Crisalli, P.; Lee, B.; Jung, J.W.; Kuchelmeister, H.Y.; Batista, P.J.; Torre, E.A.; Kool, E.T.; et al. Structural imprints in vivo decode RNA regulatory mechanisms. Nature 2015, 519, 486–490.
72. Sun, L.; Fazal, F.M.; Li, P.; Broughton, J.P.; Lee, B.; Tang, L.; Huang, W.; Kool, E.T.; Chang, H.Y.; Zhang, Q.C. RNA structure maps across mammalian cellular compartments. Nat. Struct. Mol. Biol. 2019, 26, 322–330.
73. Chan, D.; Feng, C.; Spitale, R.C. Measuring RNA structure transcriptome-wide with icSHAPE. Methods 2017, 120, 85–90.
74. Hutvagner, G.; Zamore, P.D. A microRNA in a multiple-turnover RNAi enzyme complex. Science 2002, 297, 2056–2060.
75. Vaculík, O.; Chalupová, E.; Grešová, K.; Majtner, T.; Alexiou, P. Transfer Learning Allows Accurate RBP Target Site Prediction with Limited Sample Sizes. Biology 2023, 12, 1276.
76. Dalla-Torre, H.; Gonzalez, L.; Mendoza-Revilla, J.; Carranza, N.L.; Grzywaczewski, A.H.; Oteri, F.; Dallago, C.; Trop, E.; de Almeida, B.P.; Sirelkhatim, H.; et al. The nucleotide transformer: Building and evaluating robust foundation models for human genomics. bioRxiv 2023.
77. Arribas-Hernández, L.; Rennie, S.; Köster, T.; Porcelli, C.; Lewinski, M.; Staiger, D.; Andersson, R.; Brodersen, P. Principles of mRNA targeting via the Arabidopsis m6A-binding protein ECT2. eLife 2021, 10, e72375.
78. Uhl, M.; Houwaart, T.; Corrado, G.; Wright, P.R.; Backofen, R. Computational analysis of CLIP-seq data. Methods 2017, 118, 60–72.
79. Hanan, M.; Soreq, H.; Kadener, S. CircRNAs in the brain. RNA Biol. 2017, 14, 1028–1034.
80. Mateos, J.L.; Staiger, D. Toward a systems view on RNA-binding proteins and associated RNAs in plants: Guilt by association. Plant Cell 2023, 35, 1708–1726.
81. Lewinski, M.; Brüggemann, M.; Köster, T.; Reichel, M.; Bergelt, T.; Meyer, K.; König, J.; Zarnack, K.; Staiger, D. Mapping protein–RNA binding in plants with individual-nucleotide-resolution UV cross-linking and immunoprecipitation (plant iCLIP2). Nat. Protoc. 2024, 19, 1183–1234.
82. Peng, X.; Wang, X.; Guo, Y.; Ge, Z.; Li, F.; Gao, X.; Song, J. RBP-TSTL is a two-stage transfer learning framework for genome-scale prediction of RNA-binding proteins. Brief. Bioinform. 2022, 23, bbac215.
83. Zhang, J.; Yan, K.; Chen, Q.; Liu, B. PreRBP-TL: Prediction of species-specific RNA-binding proteins based on transfer learning. Bioinformatics 2022, 38, 2135–2143.
84. Arican, O.C.; Gumus, O. PredDRBP-MLP: Prediction of DNA-binding proteins and RNA-binding proteins by multilayer perceptron. Comput. Biol. Med. 2023, 164, 107317.
85. Jin, W.; Brannan, K.W.; Kapeli, K.; Park, S.S.; Tan, H.Q.; Gosztyla, M.L.; Mujumdar, M.; Ahdout, J.; Henroid, B.; Rothamel, K.; et al. HydRA: Deep-learning models for predicting RNA-binding capacity from protein interaction association context and protein sequence. Mol. Cell 2023, 83, 2595–2611.
86. Wang, J.; Horlacher, M.; Cheng, L.; Winther, O. DeepLocRNA: An interpretable deep learning model for predicting RNA subcellular localisation with domain-specific transfer-learning. Bioinformatics 2024, 40, btae065.
87. Ching, T.; Himmelstein, D.S.; Beaulieu-Jones, B.K.; Kalinin, A.A.; Do, B.T.; Way, G.P.; Ferrero, E.; Agapow, P.M.; Zietz, M.; Hoffman, M.M.; et al. Opportunities and obstacles for deep learning in biology and medicine. J. R. Soc. Interface 2018, 15, 20170387.
88. Shrikumar, A.; Greenside, P.; Kundaje, A. Learning important features through propagating activation differences. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 3145–3153.
89. Shrikumar, A.; Tian, K.; Avsec, Ž.; Shcherbina, A.; Banerjee, A.; Sharmin, M.; Nair, S.; Kundaje, A. Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5.6.5. arXiv 2018, arXiv:1811.00416.
90. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems; 2017; pp. 4768–4777. Available online: https://dl.acm.org/doi/10.5555/3295222.3295230 (accessed on 10 May 2024).
91. Nair, S.; Shrikumar, A.; Schreiber, J.; Kundaje, A. fastISM: Performant in silico saturation mutagenesis for convolutional neural networks. Bioinformatics 2022, 38, 2397–2403.
92. Marchese, F.P.; Raimondi, I.; Huarte, M. The multidimensional mechanisms of long noncoding RNA function. Genome Biol. 2017, 18, 206.
93. Ferre, F.; Colantoni, A.; Helmer-Citterich, M. Revealing protein–lncRNA interaction. Brief. Bioinform. 2016, 17, 106–116.
94. Fatica, A.; Bozzoni, I. Long non-coding RNAs: New players in cell differentiation and development. Nat. Rev. Genet. 2014, 15, 7–21.
95. Akhtar, J.; Lugoboni, M.; Junion, G. m6A RNA modification in transcription regulation. Transcription 2021, 12, 266–276.
96. Zaccara, S.; Jaffrey, S.R. A unified model for the function of YTHDF proteins in regulating m6A-modified mRNA. Cell 2020, 181, 1582–1595.
97. Arribas-Hernández, L.; Rennie, S.; Schon, M.; Porcelli, C.; Enugutti, B.; Andersson, R.; Nodine, M.D.; Brodersen, P. The YTHDF proteins ECT2 and ECT3 bind largely overlapping target sets and influence target mRNA abundance, not alternative polyadenylation. eLife 2021, 10, e72377.
98. Lal, A.; Mazan-Mamczarz, K.; Kawai, T.; Yang, X.; Martindale, J.L.; Gorospe, M. Concurrent versus individual binding of HuR and AUF1 to common labile target mRNAs. EMBO J. 2004, 23, 3092–3102.
99. Hu, X.; Zou, Q.; Yao, L.; Yang, X. Survey of the binding preferences of RNA-binding proteins to RNA editing events. Genome Biol. 2022, 23, 169.
100. Weirick, T.; Militello, G.; Hosen, M.R.; John, D.; Moore IV, J.B.; Uchida, S. Investigation of RNA Editing Sites within Bound Regions of RNA-Binding Proteins. High-Throughput 2019, 8, 19.
Name | Data | Description | Model | Ref |
---|---|---|---|---|
HDRNet | RBP | Sequence + in vivo RNASS (icSHAPE) + DNABERT [42], data shared with PrismNet [43], 101 bp regions, random assignment of positions in test/training. | Attention | [44] |
BERT-RBP | RBP | Sequence + RNASS + DNABERT [42], 101 bp regions, data as per RBPsuite [45], random assignment of positions in test/training. | Attention | [46] |
RBPnet | RBP | Sequence to signal mixture approach for bias correction, 300 bp windows, chromosome-wise splits to test/training. | CNN | [29] |
DeepPN | RBP | Sequence + RNASS, bound-genes sourced negatives, 501 bp regions, random assignment of positions in test/training. | CNN/GCN | [47] |
PrismNet | RBP | Sequence + in vivo RNASS, 101 bp regions, negatives with >40% icSHAPE coverage sampled from transcriptome, random assignment of positions in test/training. | CNN/attention | [43] |
RNAProt | RBP | Multiple variable features, inc. sequence, RNASS, conservation, etc., 81 bp regions, random assignment of positions in test/training. | RNN | [48] |
DeepCLIP | RBP | Sequence, matched-gene negatives for training, up to 75 bp regions, random assignment of positions in test/training. | CNN/RNN | [34] |
DeepRiPe | RBP | Multitask models covering 59 RBPs, Sequence (150 bp regions derived from 50 bp bins) and genomic feature information (250 bp regions), random assignment of bins in test/training. | multi-output CNN | [49] |
iM6A | m6A | Sequence-based m6A site prediction, surrounding unmethylated sites as negatives, human + mouse. | CNN | [33] |
m6Anet | m6A | Trained on nanopore signal for molecule-resolution m6A prediction. | NN/MIL | [23] |
EditPredict | A-to-I | Sequence, predict A-to-I editing sites sourced from REDIportal [50], non-edited sites as negatives, up to 200 bp regions, multi-species. | CNN | [51] |
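
The table distinguishes models evaluated with random assignment of positions to test/training from those using chromosome-wise splits. The sketch below contrasts the two schemes on toy data. It is illustrative only and does not reproduce the pipeline of any tool listed above: the `Site` record and the helpers `random_split` and `chromosome_split` are hypothetical names, and the coordinates are invented. The assumption behind preferring the chromosome-wise scheme is that nearby or overlapping windows can end up on both sides of a random split, which may inflate test performance.

```python
import random
from collections import namedtuple

# Hypothetical record: one labelled window, e.g. a 101 bp region centred on a site.
Site = namedtuple("Site", ["chrom", "pos", "label"])


def random_split(sites, test_fraction=0.2, seed=0):
    """Assign individual sites to train/test at random, ignoring genomic position."""
    rng = random.Random(seed)
    shuffled = list(sites)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def chromosome_split(sites, test_chroms):
    """Hold out whole chromosomes, so train and test never share a chromosome."""
    held_out = set(test_chroms)
    train = [s for s in sites if s.chrom not in held_out]
    test = [s for s in sites if s.chrom in held_out]
    return train, test


if __name__ == "__main__":
    # Toy data, invented for illustration: 3 chromosomes with 4 sites each.
    sites = [Site(f"chr{c}", 1000 + 50 * i, i % 2) for c in (1, 2, 3) for i in range(4)]
    train, test = chromosome_split(sites, test_chroms=["chr3"])
    print(len(train), "training sites;", len(test), "held-out sites")
```

With a random split, two windows 50 bp apart on the same chromosome can fall on opposite sides of the partition; holding out whole chromosomes rules this out by construction, at the cost of a test set whose size depends on how sites are distributed across chromosomes.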
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rennie, S. Deep Learning for Elucidating Modifications to RNA—Status and Challenges Ahead. Genes 2024, 15, 629. https://doi.org/10.3390/genes15050629