AI-Based Prediction of Gene Expression in Single-Cell and Multiscale Genomics and Transcriptomics
Abstract
1. Introduction
2. Single-Cell Sequencing Principles and Protocols
3. Machine Learning and Deep Learning Models Used for Gene Expression Levels Prediction
3.1. Machine Learning and Deep Learning Architectures in Omics Research
3.2. Deep Learning Frameworks for Sequence-to-Expression Prediction
| Deep Learning Model | Target Analysis | Prediction Output | Advantages | Disadvantages |
|---|---|---|---|---|
| DeepSEA [56,60] | Chromatin features such as TF binding and histone marks from large-scale chromatin profiling data | DNase hypersensitivity, TF-binding sites, and histone modifications | Simple CNN architecture that learns directly from sequence (automated feature learning); scalability; predictive power; good performance on large datasets | High computational cost and resource demands; requires high-quality training data; limited interpretability (black-box limitation shared by all DL models) |
| DeepBind [26] | Protein binding to DNA/RNA (TFs, RBPs; works on both microarray and sequencing data) | Probability of TF/RBP binding to a sequence; prediction of the sequence specificity of protein binding | Simple CNN architecture with automatic feature learning; efficiency; performs well in predicting DNA/RNA protein-binding sites; efficient at analyzing complex patterns | High cost and resource demands; black-box configuration; the simple CNN architecture has trouble capturing long-range genomic interactions |
| Basenji [58] | Epigenetic and transcriptional analysis | Quantitative prediction of gene expression profiles and epigenetic marks (CAGE, TFBS, and histone modifications) | CNN architecture capable of analyzing longer sequences, providing information on distant enhancers; noise reduction; multitasking | Struggles with very long-range interactions; lower resolution than transformer-based models; analyzes regulatory processes but does not interpret the effect of mutations in coding regions |
| Basset [62] | Chromatin accessibility, such as DNase hypersensitivity | Binary classification of chromatin accessibility in 164 cell types using sequencing data (DNase-seq) | CNN capable of multitasking; variant effect prediction; highly accurate predictions compared with traditional ML algorithms; automated feature extraction; pretraining accelerates learning on new data | Designed only for processing short sequences; more limited than Basenji because of the binary prediction characteristic of the original CNN architecture; data dependency |
| Enformer [55] | Epigenomic profiles and gene expression | Predicts transcriptional activity and chromatin features across large genomic regions | Transformer-based model capable of analyzing large genomic sequences; captures enhancer–promoter relationships; variant effect prediction; infers cross-gene correlation | Often fails to correctly attribute the direction of a mutation’s effect (failure to assess cross-individual correlation); high computational resource demands; trained on a fixed dataset, hard to generalize; limited resolution |
| DanQ [63] | TFBS prediction, mapping chromatin accessibility and histone marks | Predicts chromatin features and TFBS at base level | Hybrid architecture combining a CNN and a BiLSTM; more precise than DeepSEA in predicting certain regulatory markers; predictions directly from DNA sequence; captures local and long-range dependencies | High computational cost; needs training on large labeled datasets to obtain accurate predictions; black-box limitations |
| GraphReg [68] | Gene expression via 3D chromatin structure | Makes use of GATs to incorporate chromatin interaction graphs | GAT that uses 3D chromatin data to improve predictions of enhancer–promoter interactions; captures massive long-range interactions; higher accuracy in the prediction of TFBS | Data dependency; limited resolution; sensitive to noisy data |
| ExPecto [56] | Cell-type-specific gene expression levels inferred directly from DNA sequence, based on histone marks, TF binding, and chromatin accessibility mapping | Epigenomic effect prediction using DNA sequences; prediction of cell-type-specific gene expression and of the effects of genomic variants | Predictions directly from sequence; tissue-type-specific predictions; variant effect prediction; generalization across any human population variant | Limited to short DNA sequences; lower accuracy than benchmark models; fails to explain cross-individual variability; fails to attribute the correct direction of a variant’s effect; complex architecture |
| Xpresso [57] | Sequence elements located within ±1500 bp of a TSS (TFBS, TSS, and chromatin accessibility), which can reveal the expected mRNA abundance for the target gene | Predicts transcriptional activity (mRNA abundance) using TSS annotations and CAGE | DL model that predicts mRNA abundance directly from DNA sequence; can make significant predictions based only on promoter sequences and mRNA stability; simple structure; efficient; can quantify non-promoter contributions to gene expression | Narrow genomic window; incorrect attribution of the direction of a variant’s effect on gene expression (increase or decrease); limited interpretability (due to both the fixed training dataset and the restricted analysis of the model) |
| DeFine [65] | Prediction of cell-type-specific DNA binding of TFs based on TF ChIP-seq data | Classification of TF–DNA binding or non-binding in the context of a genomic variant and prediction of the functional effect of the altered sequence | Multi-modal integration of DNA sequence, chromatin accessibility, and histone marks; high accuracy of TFBS specificity prediction; captures long-range interactions | Data dependency; limited resolution; high computational resource demands; limited interpretability |
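
Several of the CNN-based models in the table above (for example DeepBind and DeepSEA) are described as learning directly from sequence. As a rough illustration of what that involves, the sketch below one-hot encodes a DNA string, scans it with a single convolutional filter, applies global max pooling, and passes the result through a logistic output. It is a minimal sketch only: the function names, the hand-set `TATA` filter, and the output weight and bias are hypothetical values chosen for illustration, not learned parameters or code from any of the cited models, which train many such filters from data.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of [A, C, G, T] indicators; other symbols (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv_scan(encoded, kernel, bias=0.0):
    """Slide one filter (width x 4 weights) along the sequence, applying ReLU at each position."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            total += sum(x * w for x, w in zip(encoded[start + offset], kernel[offset]))
        activations.append(max(0.0, total))
    return activations

def predict_binding(seq, kernel, out_weight=1.0, out_bias=-3.0):
    """Global max pooling over positions, followed by a logistic output unit."""
    activations = conv_scan(one_hot(seq), kernel)
    pooled = max(activations) if activations else 0.0
    return 1.0 / (1.0 + math.exp(-(out_weight * pooled + out_bias)))

def motif_kernel(motif):
    """Toy filter (assumption): +1 for the motif base at each position, -1 for any other base."""
    return [[1.0 if b == base else -1.0 for b in BASES] for base in motif]

# Hypothetical hand-set filter; a trained model would learn these weights.
TATA_FILTER = motif_kernel("TATA")
with_motif = predict_binding("GGTATAGG", TATA_FILTER)
without_motif = predict_binding("GGGGGGGG", TATA_FILTER)
print(with_motif > without_motif)
```

Because max pooling keeps only the strongest filter response, the score does not depend on where the motif sits in the window. This is also one way to see why a single convolutional layer of this kind captures local patterns but not long-range interactions, a limitation the table lists for the simpler CNN models.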
3.3. Models Focusing on Epigenetic Analysis
3.4. Models for Transcriptomic Analysis of Gene Expression
3.5. Models Specifically Used for Single-Cell Studies
3.6. The Potential of Quantum Computing Approaches for Gene Expression Prediction
4. Insight into the Role of Spatial Transcriptomics in Phenotype Prediction
5. Translational Implications
6. Benefits and Limitations
| Methods Used | Benefits | Limitations |
|---|---|---|
| scRNA-seq | | |
| ST | | |
| DL platforms [173] | | |
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
References
- Jovic, D.; Liang, X.; Zeng, H.; Lin, L.; Xu, F.; Luo, Y. Single-cell RNA Sequencing Technologies and Applications: A Brief Overview. Clin. Transl. Med. 2022, 12, e694.
- Kashima, Y.; Sakamoto, Y.; Kaneko, K.; Seki, M.; Suzuki, Y.; Suzuki, A. Single-Cell Sequencing Techniques from Individual to Multiomics Analyses. Exp. Mol. Med. 2020, 52, 1419–1427.
- Li, X.; Wang, C.Y. From Bulk, Single-Cell to Spatial RNA Sequencing. Int. J. Oral Sci. 2021, 13, 36.
- Zhu, C.; Preissl, S.; Ren, B. Single-Cell Multimodal Omics: The Power of Many. Nat. Methods 2020, 17, 11–14.
- Cuomo, A.S.E.; Nathan, A.; Raychaudhuri, S.; MacArthur, D.G.; Powell, J.E. Single-Cell Genomics Meets Human Genetics. Nat. Rev. Genet. 2023, 24, 535–549.
- Wasney, M.; Pott, S. Simultaneous Measurement of DNA Methylation and Nucleosome Occupancy in Single Cells Using ScNOMe-Seq. In Chromatin Accessibility; Humana: New York, NY, USA, 2023; Volume 2611.
- Liu, L.; Liu, C.; Quintero, A.; Wu, L.; Yuan, Y.; Wang, M.; Cheng, M.; Leng, L.; Xu, L.; Dong, G.; et al. Deconvolution of Single-Cell Multi-Omics Layers Reveals Regulatory Heterogeneity. Nat. Commun. 2019, 10, 470.
- Bai, Y.; Deng, X.; Chen, D.; Han, S.; Lin, Z.; Li, Z.; Tong, W.; Li, J.; Wang, T.; Liu, X.; et al. Integrative Analysis Based on ATAC-Seq and RNA-Seq Reveals a Novel Oncogene PRPF3 in Hepatocellular Carcinoma. Clin. Epigenet. 2024, 16, 154.
- Williams, C.G.; Lee, H.J.; Asatsuma, T.; Vento-Tormo, R.; Haque, A. An Introduction to Spatial Transcriptomics for Biomedical Research. Genome Med. 2022, 14, 68.
- Trapnell, C.; Cacchiarelli, D.; Grimsby, J.; Pokharel, P.; Li, S.; Morse, M.; Lennon, N.J.; Livak, K.J.; Mikkelsen, T.S.; Rinn, J.L. The Dynamics and Regulators of Cell Fate Decisions Are Revealed by Pseudotemporal Ordering of Single Cells. Nat. Biotechnol. 2014, 32, 381–386.
- Schiebinger, G.; Shu, J.; Tabaka, M.; Cleary, B.; Subramanian, V.; Solomon, A.; Gould, J.; Liu, S.; Lin, S.; Berube, P.; et al. Optimal-Transport Analysis of Single-Cell Gene Expression Identifies Developmental Trajectories in Reprogramming. Cell 2019, 176, 928–943.e22, Erratum in Cell 2019, 176, 1517.
- Hou, W.; Ji, Z.; Chen, Z.; Wherry, E.J.; Hicks, S.C.; Ji, H. A Statistical Framework for Differential Pseudotime Analysis with Multiple Single-Cell RNA-Seq Samples. Nat. Commun. 2023, 14, 7286.
- Kobayashi-Kirschvink, K.J.; Comiter, C.S.; Gaddam, S.; Joren, T.; Grody, E.I.; Ounadjela, J.R.; Zhang, K.; Ge, B.; Kang, J.W.; Xavier, R.J.; et al. Prediction of Single-Cell RNA Expression Profiles in Live Cells by Raman Microscopy with Raman2RNA. Nat. Biotechnol. 2024, 42, 1726–1734.
- Chen, W.; Guillaume-Gentil, O.; Rainer, P.Y.; Gäbelein, C.G.; Saelens, W.; Gardeux, V.; Klaeger, A.; Dainese, R.; Zachara, M.; Zambelli, T.; et al. Live-Seq Enables Temporal Transcriptomic Recording of Single Cells. Nature 2022, 608, 733–740.
- Keyl, P.; Bischoff, P.; Dernbach, G.; Bockmayr, M.; Fritz, R.; Horst, D.; Blüthgen, N.; Montavon, G.; Müller, K.R.; Klauschen, F. Single-Cell Gene Regulatory Network Prediction by Explainable AI. Nucleic Acids Res. 2023, 51, e20.
- Elshewey, A.M. Enhancing Crop Yield Prediction Based on Dove Optimization Algorithm and Gradient Boosting Model. Signal Image Video Process. 2025, 19, 951.
- El-Rashidy, N.; Tarek, Z.; Elshewey, A.M.; Shams, M.Y. Multitask Multilayer-Prediction Model for Predicting Mechanical Ventilation and the Associated Mortality Rate. Neural Comput. Appl. 2025, 37, 1321–1343.
- Tarek, Z.; Shams, M.Y.; Towfek, S.K.; Alkahtani, H.K.; Ibrahim, A.; Abdelhamid, A.A.; Eid, M.M.; Khodadadi, N.; Abualigah, L.; Khafaga, D.S.; et al. An Optimized Model Based on Deep Learning and Gated Recurrent Unit for COVID-19 Death Prediction. Biomimetics 2023, 8, 552.
- Li, Z.; Gao, E.; Zhou, J.; Han, W.; Xu, X.; Gao, X. Applications of Deep Learning in Understanding Gene Regulation. Cell Rep. Methods 2023, 3, 100384.
- Nosrati, H.; Nosrati, M. Artificial Intelligence in Regenerative Medicine: Applications and Implications. Biomimetics 2023, 8, 442.
- Schwessinger, R.; Deasy, J.; Woodruff, R.T.; Young, S.; Branson, K.M. Single-Cell Gene Expression Prediction from DNA Sequence at Large Contexts. bioRxiv 2023.
- Buccitelli, C.; Selbach, M. mRNAs, Proteins and the Emerging Principles of Gene Expression Control. Nat. Rev. Genet. 2020, 21, 630–644.
- Wang, D.; Eraslan, B.; Wieland, T.; Hallström, B.; Hopf, T.; Zolg, D.P.; Zecha, J.; Asplund, A.; Li, L.; Meng, C.; et al. A Deep Proteome and Transcriptome Abundance Atlas of 29 Healthy Human Tissues. Mol. Syst. Biol. 2019, 15, e8503.
- Chick, J.M.; Munger, S.C.; Simecek, P.; Huttlin, E.L.; Choi, K.; Gatti, D.M.; Raghupathy, N.; Svenson, K.L.; Churchill, G.A.; Gygi, S.P. Defining the Consequences of Genetic Variation on a Proteome-Wide Scale. Nature 2016, 534, 500–505, Correction in Nature 2022, 606, E16.
- Avsec, Ž.; Agarwal, V.; Visentin, D.; Ledsam, J.R.; Grabska-Barwinska, A.; Taylor, K.R.; Assael, Y.; Jumper, J.; Kohli, P.; Kelley, D.R. Effective Gene Expression Prediction from Sequence by Integrating Long-Range Interactions. Nat. Methods 2021, 18, 1196–1203.
- Alipanahi, B.; Delong, A.; Weirauch, M.T.; Frey, B.J. Predicting the Sequence Specificities of DNA- and RNA-Binding Proteins by Deep Learning. Nat. Biotechnol. 2015, 33, 831–838.
- Auerbach, B.J.; Hu, J.; Reilly, M.P.; Li, M. Applications of Single-Cell Genomics and Computational Strategies to Study Common Disease and Population-Level Variation. Genome Res. 2021, 31, 1728–1741.
- Chen, G.; Ning, B.; Shi, T. Single-Cell RNA-Seq Technologies and Related Computational Data Analysis. Front. Genet. 2019, 10, 317.
- Van de Sande, B.; Lee, J.S.; Mutasa-Gottgens, E.; Naughton, B.; Bacon, W.; Manning, J.; Wang, Y.; Pollard, J.; Mendez, M.; Hill, J.; et al. Applications of Single-Cell RNA Sequencing in Drug Discovery and Development. Nat. Rev. Drug Discov. 2023, 22, 496–520.
- He, J.; Lin, L.; Chen, J. Practical Bioinformatics Pipelines for Single-Cell RNA-Seq Data Analysis. Biophys. Rep. 2022, 8, 158–169.
- van den Brink, S.; Sage, F.; Vertesy, A.; Spanjaard, B.; Peterson-Maduro, J.; Baron, C.; Robin, C.; van Oudenaarden, A. Single-Cell Sequencing Reveals Dissociation-Induced Gene Expression in Tissue Subpopulations. Nat. Methods 2017, 14, 935–936.
- Ding, J.; Adiconis, X.; Simmons, S.K.; Kowalczyk, M.S.; Hession, C.C.; Marjanovic, N.D.; Hughes, T.K.; Wadsworth, M.H.; Burks, T.; Nguyen, L.T.; et al. Systematic Comparison of Single-Cell and Single-Nucleus RNA-Sequencing Methods. Nat. Biotechnol. 2020, 38, 737–746, Correction in Nat. Biotechnol. 2020, 38, 756.
- Picelli, S.; Faridani, O.R.; Björklund, Å.K.; Winberg, G.; Sagasser, S.; Sandberg, R. Full-Length RNA-Seq from Single Cells Using Smart-Seq2. Nat. Protoc. 2014, 9, 171–181.
- Bageritz, J.; Raddi, G. Single-Cell RNA Sequencing with Drop-Seq. In Single Cell Methods. Methods in Molecular Biology; Humana: New York, NY, USA, 2019; Volume 1979, pp. 73–85.
- DeLaughter, D. The Use of the Fluidigm C1 for RNA Expression Analyses of Single Cells. Curr. Protoc. Mol. Biol. 2018, 122, e55.
- Danielski, K. Guidance on Processing the 10× Genomics Single Cell Gene Expression Assay. In Methods in Molecular Biology; Humana: New York, NY, USA, 2022; Volume 2584.
- Sheng, K.; Cao, W.; Niu, Y.; Deng, Q.; Zong, C. Effective Detection of Variation in Single-Cell Transcriptomes Using MATQ-Seq. Nat. Methods 2017, 14, 267–270.
- Gierahn, T.M.; Wadsworth, M.H.; Hughes, T.K.; Bryson, B.D.; Butler, A.; Satija, R.; Fortune, S.; Love, J.C.; Shalek, A.K. Seq-Well: Portable, Low-Cost RNA Sequencing of Single Cells at High Throughput. Nat. Methods 2017, 14, 395–398, Erratum in Nat. Methods 2017, 14, 752.
- Liu, C.; Wu, T.; Fan, F.; Liu, Y.; Wu, L.; Junkin, M.; Wang, Z.; Yu, Y.; Wang, W.; Wei, W.; et al. A Portable and Cost-Effective Microfluidic System for Massively Parallel Single-Cell Transcriptome Profiling. bioRxiv 2019.
- Kolodziejczyk, A.A.; Kim, J.K.; Svensson, V.; Marioni, J.C.; Teichmann, S.A. The Technology and Biology of Single-Cell RNA Sequencing. Mol. Cell 2015, 58, 610–620.
- Keren-Shaul, H.; Kenigsberg, E.; Jaitin, D.A.; David, E.; Paul, F.; Tanay, A.; Amit, I. MARS-Seq2.0: An Experimental and Analytical Pipeline for Indexed Sorting Combined with Single-Cell RNA Sequencing. Nat. Protoc. 2019, 14, 1841–1862.
- Hashimshony, T.; Wagner, F.; Sher, N.; Yanai, I. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Rep. 2012, 2, 666–673.
- Juzenas, S.; Goda, K.; Kiseliovas, V.; Zvirblyte, J.; Quintinal-Villalonga, A.; Siurkus, J.; Nainys, J.; Mazutis, L. InDrops-2: A Flexible, Versatile and Cost-Efficient Droplet Microfluidic Approach for High-Throughput ScRNA-Seq of Fresh and Preserved Clinical Samples. Nucleic Acids Res. 2025, 53, gkae1312.
- Heumos, L.; Schaar, A.C.; Lance, C.; Litinetskaya, A.; Drost, F.; Zappia, L.; Lücken, M.D.; Strobl, D.C.; Henao, J.; Curion, F.; et al. Best Practices for Single-Cell Analysis across Modalities. Nat. Rev. Genet. 2023, 24, 550–572.
- Zheng, G.X.Y.; Terry, J.M.; Belgrader, P.; Ryvkin, P.; Bent, Z.W.; Wilson, R.; Ziraldo, S.B.; Wheeler, T.D.; McDermott, G.P.; Zhu, J.; et al. Massively Parallel Digital Transcriptional Profiling of Single Cells. Nat. Commun. 2017, 8, 14049.
- Xu, X.; Zhang, Q.; Li, M.; Lin, S.; Liang, S.; Cai, L.; Zhu, H.; Su, R.; Yang, C. Microfluidic Single-Cell Multiomics Analysis. View 2023, 4, 20220034.
- De Jonghe, J.; Kaminski, T.S.; Morse, D.B.; Tabaka, M.; Ellermann, A.L.; Kohler, T.N.; Amadei, G.; Handford, C.E.; Findlay, G.M.; Zernicka-Goetz, M.; et al. SpinDrop: A Droplet Microfluidic Platform to Maximise Single-Cell Sequencing Information Content. Nat. Commun. 2023, 14, 4788.
- Hong, R.; Koga, Y.; Bandyadka, S.; Leshchyk, A.; Wang, Y.; Akavoor, V.; Cao, X.; Sarfraz, I.; Wang, Z.; Alabdullatif, S.; et al. Comprehensive Generation, Visualization, and Reporting of Quality Control Metrics for Single-Cell RNA Sequencing Data. Nat. Commun. 2022, 13, 1688.
- Zhao, X.; Du, A.; Qiu, P. ScMODD: A Model-Driven Algorithm for Doublet Identification in Single-Cell RNA-Sequencing Data. Front. Syst. Biol. 2022, 2, 1082309.
- Wolock, S.L.; Lopez, R.; Klein, A.M. Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Syst. 2019, 8, 281–291.e9.
- Bais, A.S.; Kostka, D. Scds: Computational Annotation of Doublets in Single-Cell RNA Sequencing Data. Bioinformatics 2020, 36, 1150–1158.
- Erfanian, N.; Heydari, A.A.; Feriz, A.M.; Iañez, P.; Derakhshani, A.; Ghasemigol, M.; Farahpour, M.; Razavi, S.M.; Nasseri, S.; Safarpour, H.; et al. Deep Learning Applications in Single-Cell Genomics and Transcriptomics Data Analysis. Biomed. Pharmacother. 2023, 165, 115077.
- Shen, X.; Jiang, C.; Wen, Y.; Li, C.; Lu, Q. A Brief Review on Deep Learning Applications in Genomic Studies. Front. Syst. Biol. 2022, 2, 877717.
- Yue, T.; Wang, Y.; Zhang, L.; Gu, C.; Xue, H.; Wang, W.; Lyu, Q.; Dun, Y. Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models. Int. J. Mol. Sci. 2023, 24, 15858.
- Ramprasad, P.; Pai, N.; Pan, W. Enhancing Personalized Gene Expression Prediction from DNA Sequences Using Genomic Foundation Models. Hum. Genet. Genom. Adv. 2024, 5, 100347.
- Zhou, J.; Theesfeld, C.L.; Yao, K.; Chen, K.M.; Wong, A.K.; Troyanskaya, O.G. Deep Learning Sequence-Based Ab Initio Prediction of Variant Effects on Expression and Disease Risk. Nat. Genet. 2018, 50, 1171–1179.
- Agarwal, V.; Shendure, J. Predicting mRNA Abundance Directly from Genomic Sequence Using Deep Convolutional Neural Networks. Cell Rep. 2020, 31, 107663.
- Kelley, D.R. Cross-Species Regulatory Sequence Activity Prediction. PLoS Comput. Biol. 2020, 16, e1008050.
- Zeng, H.; Edwards, M.D.; Liu, G.; Gifford, D.K. Convolutional Neural Network Architectures for Predicting DNA-Protein Binding. Bioinformatics 2016, 32, i121–i127.
- Zhou, J.; Troyanskaya, O.G. Predicting Effects of Noncoding Variants with Deep Learning-Based Sequence Model. Nat. Methods 2015, 12, 931–934.
- Hu, X.; Fernie, A.R.; Yan, J. Deep Learning in Regulatory Genomics: From Identification to Design. Curr. Opin. Biotechnol. 2023, 79, 102887.
- Kelley, D.R.; Snoek, J.; Rinn, J.L. Basset: Learning the Regulatory Code of the Accessible Genome with Deep Convolutional Neural Networks. Genome Res. 2016, 26, 990–999.
- Quang, D.; Xie, X. DanQ: A Hybrid Convolutional and Recurrent Deep Neural Network for Quantifying the Function of DNA Sequences. Nucleic Acids Res. 2016, 44, e107.
- Zeng, H.; Gifford, D.K. Predicting the Impact of Non-Coding Variants on DNA Methylation. Nucleic Acids Res. 2017, 45, e99.
- Wang, M.; Tai, C.; E, W.; Wei, L. DeFine: Deep Convolutional Neural Networks Accurately Quantify Intensities of Transcription Factor-DNA Binding and Facilitate Evaluation of Functional Non-Coding Variants. Nucleic Acids Res. 2018, 46, e69.
- Fudenberg, G.; Kelley, D.R.; Pollard, K.S. Predicting 3D Genome Folding from DNA Sequence with Akita. Nat. Methods 2020, 17, 1111–1117.
- Zhou, J. Sequence-Based Modeling of Three-Dimensional Genome Architecture from Kilobase to Chromosome Scale. Nat. Genet. 2022, 54, 725–734.
- Karbalayghareh, A.; Sahin, M.; Leslie, C.S. Chromatin Interaction-Aware Gene Regulatory Modeling with Graph Attention Networks. Genome Res. 2022, 32, 930–944.
- Dalla-Torre, H.; Gonzalez, L.; Mendoza-Revilla, J.; Lopez Carranza, N.; Grzywaczewski, A.H.; Oteri, F.; Dallago, C.; Trop, E.; de Almeida, B.P.; Sirelkhatim, H.; et al. Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. Nat. Methods 2025, 22, 287–297.
- Sasse, A.; Ng, B.; Spiro, A.E.; Tasaki, S.; Bennett, D.A.; Gaiteri, C.; De Jager, P.L.; Chikina, M.; Mostafavi, S. Benchmarking of Deep Neural Networks for Predicting Personal Gene Expression from DNA Sequence Highlights Shortcomings. Nat. Genet. 2023, 55, 2060–2064.
- Mai, J.; Lu, M.; Gao, Q.; Zeng, J.; Xiao, J. Transcriptome-Wide Association Studies: Recent Advances in Methods, Applications and Available Databases. Commun. Biol. 2023, 6, 899.
- Xie, R.; Quitadamo, A.; Cheng, J.; Shi, X. A Predictive Model of Gene Expression Using a Deep Learning Framework. In Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shenzhen, China, 15–18 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 676–681.
- Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Statist. Soc. B 1996, 58, 267–288.
- Yang, Y.; Pe’er, D. REUNION: Transcription Factor Binding Prediction and Regulatory Association Inference from Single-Cell Multi-Omics Data. Bioinformatics 2024, 40, i567–i575.
- Lin, X.; Jiang, S.; Gao, L.; Wei, Z.; Wang, J. MultiSC: A Deep Learning Pipeline for Analyzing Multiomics Single-Cell Data. Brief. Bioinform. 2024, 25, bbae492.
- Levy, J.J.; Titus, A.J.; Petersen, C.L.; Chen, Y.; Salas, L.A.; Christensen, B.C. MethylNet: An Automated and Modular Deep Learning Approach for DNA Methylation Analysis. BMC Bioinform. 2020, 21, 108.
- Wang, Y.; Liu, T.; Xu, D.; Shi, H.; Zhang, C.; Mo, Y.Y.; Wang, Z. Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks. Sci. Rep. 2016, 6, 19598.
- Ni, P.; Huang, N.; Zhang, Z.; Wang, D.P.; Liang, F.; Miao, Y.; Xiao, C.L.; Luo, F.; Wang, J. DeepSignal: Detecting DNA Methylation State from Nanopore Sequencing Reads Using Deep-Learning. Bioinformatics 2019, 35, 4586–4595.
- Minnoye, L.; Taskiran, I.I.; Mauduit, D.; Fazio, M.; van Aerschot, L.; Hulselmans, G.; Christiaens, V.; Makhzami, S.; Seltenhammer, M.; Karras, P.; et al. Cross-Species Analysis of Enhancer Logic Using Deep Learning. Genome Res. 2020, 30, 1815–1834.
- Zhou, J.; Zhang, B.; Li, H.; Zhou, L.; Li, Z.; Long, Y.; Han, W.; Wang, M.; Cui, H.; Li, J.; et al. Annotating TSSs in Multiple Cell Types Based on DNA Sequence and RNA-Seq Data via DeeReCT-TSS. Genom. Proteom. Bioinform. 2022, 20, 959–973.
- Umarov, R.; Kuwahara, H.; Li, Y.; Gao, X.; Solovyev, V. Promoter Analysis and Prediction in the Human Genome Using Sequence-Based Deep Learning Models. Bioinformatics 2019, 35, 2730–2737.
- Zhang, Z.; Pan, Z.; Ying, Y.; Xie, Z.; Adhikari, S.; Phillips, J.; Carstens, R.P.; Black, D.L.; Wu, Y.; Xing, Y. Deep-Learning Augmented RNA-Seq Analysis of Transcript Splicing. Nat. Methods 2019, 16, 307–310.
- Jaganathan, K.; Kyriazopoulou Panagiotopoulou, S.; McRae, J.F.; Darbandi, S.F.; Knowles, D.; Li, Y.I.; Kosmicki, J.A.; Arbelaez, J.; Cui, W.; Schwartz, G.B.; et al. Predicting Splicing from Primary Sequence with Deep Learning. Cell 2019, 176, 535–548.e24.
- Zeng, T.; Li, Y.I. Predicting RNA Splicing from DNA Sequence Using Pangolin. Genome Biol. 2022, 23, 103.
- Cheng, S.; Guo, M.; Wang, C.; Liu, X.; Liu, Y.; Wu, X. miRTDL: A Deep Learning Approach for miRNA Target Prediction. IEEE ACM Trans. Comput. Biol. Bioinform. 2015, 13, 1161–1169.
- Eisele, A.S.; Tarbier, M.; Dormann, A.A.; Pelechano, V.; Suter, D.M. Gene-Expression Memory-Based Prediction of Cell Lineages from ScRNA-Seq Datasets. Nat. Commun. 2024, 15, 2744, Correction in Nat. Commun. 2024, 15, 4752.
- Malekpour, S.A.; Haghverdi, L.; Sadeghi, M. Single-Cell Multi-Omics Analysis Identifies Context-Specific Gene Regulatory Gates and Mechanisms. Brief. Bioinform. 2024, 25, bbae180.
- Ardakani, F.B.; Kattler, K.; Heinen, T.; Schmidt, F.; Feuerborn, D.; Gasparoni, G.; Lepikhov, K.; Nell, P.; Hengstler, J.; Walter, J.; et al. Prediction of Single-Cell Gene Expression for Transcription Factor Analysis. Gigascience 2020, 9, giaa113.
- Zhang, J.; Larschan, E.; Bigness, J.; Singh, R. ScNODE: Generative Model for Temporal Single Cell Transcriptomic Data Prediction. Bioinformatics 2024, 40, ii146–ii154.
- Hossain, I.; Fanfani, V.; Fischer, J.; Quackenbush, J.; Burkholz, R. Biologically Informed NeuralODEs for Genome-Wide Regulatory Dynamics. Genome Biol. 2024, 25, 127.
- Eraslan, G.; Simon, L.M.; Mircea, M.; Mueller, N.S.; Theis, F.J. Single-Cell RNA-Seq Denoising Using a Deep Count Autoencoder. Nat. Commun. 2019, 10, 390.
- Amodio, M.; van Dijk, D.; Srinivasan, K.; Chen, W.S.; Mohsen, H.; Moon, K.R.; Campbell, A.; Zhao, Y.; Wang, X.; Venkataswamy, M.; et al. Exploring Single-Cell Data with Deep Multitasking Neural Networks. Nat. Methods 2019, 16, 1139–1145.
- Talwar, D.; Mongia, A.; Sengupta, D.; Majumdar, A. AutoImpute: Autoencoder Based Imputation of Single-Cell RNA-Seq Data. Sci. Rep. 2018, 8, 16329.
- Mongia, A.; Sengupta, D.; Majumdar, A. DeepMc: Deep Matrix Completion for Imputation of Single Cell RNA-Seq Data. J. Comput. Biol. 2018, 27, 1011–1019.
- Arisdakessian, C.; Poirion, O.; Yunits, B.; Zhu, X.; Garmire, L.X. DeepImpute: An Accurate, Fast, and Scalable Deep Neural Network Method to Impute Single-Cell RNA-Seq Data. Genome Biol. 2019, 20, 211.
- Lopez, R.; Regier, J.; Cole, M.B.; Jordan, M.I.; Yosef, N. Deep Generative Modeling for Single-Cell Transcriptomics. Nat. Methods 2018, 15, 1053–1058.
- Zou, B.; Zhang, T.; Zhou, R.; Jiang, X.; Yang, H.; Jin, X.; Bai, Y. DeepMNN: Deep Learning-Based Single-Cell RNA Sequencing Data Batch Correction Using Mutual Nearest Neighbors. Front. Genet. 2021, 12, 708981.
- Wang, T.; Johnson, T.S.; Shao, W.; Lu, Z.; Helm, B.R.; Zhang, J.; Huang, K. BERMUDA: A Novel Deep Transfer Learning Method for Single-Cell RNA Sequencing Batch Correction Reveals Hidden High-Resolution Cellular Subtypes. Genome Biol. 2019, 20, 165.
- Li, X.; Wang, K.; Lyu, Y.; Pan, H.; Zhang, J.; Stambolian, D.; Susztak, K.; Reilly, M.P.; Hu, G.; Li, M. Deep Learning Enables Accurate Clustering with Batch Effect Removal in Single-Cell RNA-Seq Analysis. Nat. Commun. 2020, 11, 2338.
- Qiu, P. Embracing the Dropouts in Single-Cell RNA-Seq Analysis. Nat. Commun. 2020, 11, 1169.
- van Dijk, D.; Sharma, R.; Nainys, J.; Yim, K.; Kathail, P.; Carr, A.J.; Burdziak, C.; Moon, K.R.; Chaffer, C.L.; Pattabiraman, D.; et al. Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. Cell 2018, 174, 716–729.e27.
- Huang, M.; Wang, J.; Torre, E.; Dueck, H.; Shaffer, S.; Bonasio, R.; Murray, J.I.; Raj, A.; Li, M.; Zhang, N.R. SAVER: Gene Expression Recovery for Single-Cell RNA Sequencing. Nat. Methods 2018, 15, 539–542.
- Ding, J.; Condon, A.; Shah, S.P. Interpretable Dimensionality Reduction of Single Cell Transcriptome Data with Deep Generative Models. Nat. Commun. 2018, 9, 2002.
- Märtens, K.; Yau, C. BasisVAE: Translation-Invariant Feature-Level Clustering with Variational Autoencoders. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Online, 26–28 August 2020.
- Wang, J.; Xia, J.; Wang, H.; Su, Y.; Zheng, C.H. ScDCCA: Deep Contrastive Clustering for Single-Cell RNA-Seq Data Based on Auto-Encoder Network. Brief. Bioinform. 2023, 24, bbac625.
- Tian, T.; Wan, J.; Song, Q.; Wei, Z. Clustering Single-Cell RNA-Seq Data with a Model-Based Deep Learning Approach. Nat. Mach. Intell. 2019, 1, 191–198.
- Jin, S.; Guerrero-Juarez, C.F.; Zhang, L.; Chang, I.; Ramos, R.; Kuan, C.H.; Myung, P.; Plikus, M.V.; Nie, Q. Inference and Analysis of Cell-Cell Communication Using CellChat. Nat. Commun. 2021, 12, 1088.
- Efremova, M.; Vento-Tormo, M.; Teichmann, S.A.; Vento-Tormo, R. CellPhoneDB: Inferring Cell–Cell Communication from Combined Expression of Multi-Subunit Ligand–Receptor Complexes. Nat. Protoc. 2020, 15, 1484–1506.
- Zuo, C.; Chen, L. Deep-Joint-Learning Analysis Model of Single Cell Transcriptome and Open Chromatin Accessibility Data. Brief. Bioinform. 2021, 22, bbaa287.
- Ma, A.; Wang, X.; Li, J.; Wang, C.; Xiao, T.; Liu, Y.; Cheng, H.; Wang, J.; Li, Y.; Chang, Y.; et al. Single-Cell Biological Network Inference Using a Heterogeneous Graph Transformer. Nat. Commun. 2023, 14, 964.
- Cui, H.; Maan, H.; Vladoiu, M.C.; Zhang, J.; Taylor, M.D.; Wang, B. DeepVelo: Deep Learning Extends RNA Velocity to Multi-Lineage Systems with Cell-Specific Kinetics. Genome Biol. 2024, 25, 27.
- Jiang, Q.; Chen, S.; Chen, X.; Jiang, R. ScPRAM Accurately Predicts Single-Cell Gene Expression Perturbation Response Based on Attention Mechanism. Bioinformatics 2024, 40, btae265.
- Kana, O.; Nault, R.; Filipovic, D.; Marri, D.; Zacharewski, T.; Bhattacharya, S. Generative Modeling of Single-Cell Gene Expression for Dose-Dependent Chemical Perturbations. Patterns 2023, 4, 100817. [Google Scholar] [CrossRef]
- Lotfollahi, M.; Wolf, F.A.; Theis, F.J. ScGen Predicts Single-Cell Perturbation Responses. Nat. Methods 2019, 16, 715–721. [Google Scholar] [CrossRef]
- Bunne, C.; Stark, S.G.; Gut, G.; del Castillo, J.S.; Levesque, M.; Lehmann, K.V.; Pelkmans, L.; Krause, A.; Rätsch, G. Learning Single-Cell Perturbation Responses Using Neural Optimal Transport. Nat. Methods 2023, 20, 1759–1768, Correction in Nat. Methods 2023, 20, 1830.
- Mao, Y.; Lin, Y.Y.; Wong, N.K.Y.; Volik, S.; Sar, F.; Collins, C.; Ester, M. Phenotype Prediction from Single-Cell RNA-Seq Data Using Attention-Based Neural Networks. Bioinformatics 2024, 40, btae067.
- Nałecz-Charkiewicz, K.; Charkiewicz, K.; Nowak, R.M. Quantum Computing in Bioinformatics: A Systematic Review Mapping. Brief. Bioinform. 2024, 25, bbae391.
- Roman-Vicharra, C.; Cai, J.J. Quantum Gene Regulatory Networks. npj Quantum Inf. 2023, 9, 67.
- Kubacki, M.; Niranjan, M. Quantum Annealing-Based Clustering of Single Cell RNA-Seq Data. Brief. Bioinform. 2023, 24, bbad377.
- Ghosh, A.; Fuad, M.M.; Bhattacharjee, S. Empirical Quantum Advantage Analysis of Quantum Kernel in Gene Expression Data. arXiv 2024, arXiv:2411.07276.
- Repetto, V.; Ceroni, E.G.; Buonaiuto, G.; D’Aurizio, R. Quantum Enhanced Stratification of Breast Cancer: Exploring Quantum Expressivity for Real Omics Data. Quantum Mach. Intell. 2025, 7, 81.
- Rossini, M.; Weidner, F.M.; Ankerhold, J.; Kestler, H.A. A Novel Quantum Algorithm for Efficient Attractor Search in Gene Regulatory Networks. Patterns 2025, 6, 101295.
- Tran, K.A.; Kondrashova, O.; Bradley, A.; Williams, E.D.; Pearson, J.V.; Waddell, N. Deep Learning in Cancer Diagnosis, Prognosis and Treatment Selection. Genome Med. 2021, 13, 152.
- Xia, C.; Babcock, H.P.; Moffitt, J.R.; Zhuang, X. Multiplexed Detection of RNA Using MERFISH and Branched DNA Amplification. Sci. Rep. 2019, 9, 7721.
- Eng, C.H.L.; Lawson, M.; Zhu, Q.; Dries, R.; Koulena, N.; Takei, Y.; Yun, J.; Cronin, C.; Karp, C.; Yuan, G.C.; et al. Transcriptome-Scale Super-Resolved Imaging in Tissues by RNA SeqFISH+. Nature 2019, 568, 235–239.
- Nguyen, H.Q.; Chattoraj, S.; Castillo, D.; Nguyen, S.C.; Nir, G.; Lioutas, A.; Hershberg, E.A.; Martins, N.M.C.; Reginato, P.L.; Hannan, M.; et al. 3D Mapping and Accelerated Super-Resolution Imaging of the Human Genome Using in Situ Sequencing. Nat. Methods 2020, 17, 822–832.
- Longo, S.K.; Guo, M.G.; Ji, A.L.; Khavari, P.A. Integrating Single-Cell and Spatial Transcriptomics to Elucidate Intercellular Tissue Dynamics. Nat. Rev. Genet. 2021, 22, 627–644.
- Vandereyken, K.; Sifrim, A.; Thienpont, B.; Voet, T. Methods and Applications for Single-Cell and Spatial Multi-Omics. Nat. Rev. Genet. 2023, 24, 494–515.
- Kleshchevnikov, V.; Shmatko, A.; Dann, E.; Aivazidis, A.; King, H.W.; Li, T.; Elmentaite, R.; Lomakin, A.; Kedlian, V.; Gayoso, A.; et al. Cell2location Maps Fine-Grained Cell Types in Spatial Transcriptomics. Nat. Biotechnol. 2022, 40, 661–671.
- Rodriques, S.G.; Stickels, R.R.; Goeva, A.; Martin, C.A.; Murray, E.; Vanderburg, C.R.; Welch, J.; Chen, L.M.; Chen, F.; Macosko, E.Z. Slide-Seq: A Scalable Technology for Measuring Genome-Wide Expression at High Spatial Resolution. Science 2019, 363, 1463–1467.
- Song, Q.; Su, J. DSTG: Deconvoluting Spatial Transcriptomics Data through Graph-Based Artificial Intelligence. Brief. Bioinform. 2021, 22, bbaa414.
- Sun, E.D.; Ma, R.; Navarro Negredo, P.; Brunet, A.; Zou, J. TISSUE: Uncertainty-Calibrated Prediction of Single-Cell Spatial Transcriptomics Improves Downstream Analyses. Nat. Methods 2024, 21, 444–454.
- Li, B.; Zhang, Y.; Wang, Q.; Zhang, C.; Li, M.; Wang, G.; Song, Q. Gene Expression Prediction from Histology Images via Hypergraph Neural Networks. Brief. Bioinform. 2024, 25, bbae500.
- Pham, D.; Tan, X.; Balderson, B.; Xu, J.; Grice, L.F.; Yoon, S.; Willis, E.F.; Tran, M.; Lam, P.Y.; Raghubar, A.; et al. Robust Mapping of Spatiotemporal Trajectories and Cell–Cell Interactions in Healthy and Diseased Tissues. Nat. Commun. 2023, 14, 7739.
- He, B.; Bergenstråhle, L.; Stenbeck, L.; Abid, A.; Andersson, A.; Borg, Å.; Maaskola, J.; Lundeberg, J.; Zou, J. Integrating Spatial Gene Expression and Breast Tumour Morphology via Deep Learning. Nat. Biomed. Eng. 2020, 4, 827–834.
- Pizurica, M.; Zheng, Y.; Carrillo-Perez, F.; Noor, H.; Yao, W.; Wohlfart, C.; Vladimirova, A.; Marchal, K.; Gevaert, O. Digital Profiling of Gene Expression from Histology Images with Linearized Attention. Nat. Commun. 2024, 15, 9886.
- Pang, M.; Su, K.; Li, M. Leveraging Information in Spatial Transcriptomics to Predict Super-Resolution Gene Expression from Histology Images in Tumors. bioRxiv 2021.
- Hoang, D.T.; Dinstag, G.; Shulman, E.D.; Hermida, L.C.; Ben-Zvi, D.S.; Elis, E.; Caley, K.; Sammut, S.J.; Sinha, S.; Sinha, N.; et al. A Deep-Learning Framework to Predict Cancer Treatment Response from Histopathology Images through Imputed Transcriptomics. Nat. Cancer 2024, 5, 1305–1317.
- Zeng, Y.; Wei, Z.; Yu, W.; Yin, R.; Yuan, Y.; Li, B.; Tang, Z.; Lu, Y.; Yang, Y. Spatial Transcriptomics Prediction from Histology Jointly through Transformer and Graph Neural Networks. Brief. Bioinform. 2022, 23, bbac297.
- Jia, Y.; Liu, J.; Chen, L.; Zhao, T.; Wang, Y. THItoGene: A Deep Learning Method for Predicting Spatial Transcriptomics from Histological Images. Brief. Bioinform. 2024, 25, bbad464.
- Qu, H.; Zhou, M.; Yan, Z.; Wang, H.; Rustgi, V.K.; Zhang, S.; Gevaert, O.; Metaxas, D.N. Genetic Mutation and Biological Pathway Prediction Based on Whole Slide Images in Breast Carcinoma Using Deep Learning. npj Precis. Oncol. 2021, 5, 87.
- Zheng, H.; Momeni, A.; Cedoz, P.L.; Vogel, H.; Gevaert, O. Whole Slide Images Reflect DNA Methylation Patterns of Human Tumors. npj Genom. Med. 2020, 5, 11.
- Carrillo-Perez, F.; Pizurica, M.; Ozawa, M.G.; Vogel, H.; West, R.B.; Kong, C.S.; Herrera, L.J.; Shen, J.; Gevaert, O. Synthetic Whole-Slide Image Tile Generation with Gene Expression Profile-Infused Deep Generative Models. Cell Rep. Methods 2023, 3, 100534.
- Zeng, Q.; Klein, C.; Caruso, S.; Maille, P.; Laleh, N.G.; Sommacale, D.; Laurent, A.; Amaddeo, G.; Gentien, D.; Rapinat, A.; et al. Artificial Intelligence Predicts Immune and Inflammatory Gene Signatures Directly from Hepatocellular Carcinoma Histology. J. Hepatol. 2022, 77, 116–127.
- Zhao, Y.; Alizadeh, E.; Taha, H.B.; Liu, Y.; Xu, M.; Mahoney, J.M.; Li, S. Inferring Single-Cell Spatial Gene Expression with Tissue Morphology via Explainable Deep Learning. bioRxiv 2024.
- Elosua-Bayes, M.; Nieto, P.; Mereu, E.; Gut, I.; Heyn, H. SPOTlight: Seeded NMF Regression to Deconvolute Spatial Transcriptomics Spots with Single-Cell Transcriptomes. Nucleic Acids Res. 2021, 49, e50.
- Hao, M.; Luo, E.; Chen, Y.; Wu, Y.; Li, C.; Chen, S.; Gao, H.; Bian, H.; Gu, J.; Wei, L.; et al. STEM Enables Mapping of Single-Cell and Spatial Transcriptomics Data with Transfer Learning. Commun. Biol. 2024, 7, 56.
- Chen, H.; Lee, Y.J.; Ovando-Ricardez, J.A.; Rosas, L.; Rojas, M.; Mora, A.L.; Bar-Joseph, Z.; Lugo-Martinez, J. Recovering Single-Cell Expression Profiles from Spatial Transcriptomics with scResolve. Cell Rep. Methods 2024, 4, 100864.
- Li, X.; Xiao, C.; Qi, J.; Xue, W.; Xu, X.; Mu, Z.; Zhang, J.; Li, C.Y.; Ding, W. STellaris: A Web Server for Accurate Spatial Mapping of Single Cells Based on Spatial Transcriptomics Data. Nucleic Acids Res. 2023, 51, W560–W568.
- Hao, M.; Hua, K.; Zhang, X. SOMDE: A Scalable Method for Identifying Spatially Variable Genes with Self-Organizing Map. Bioinformatics 2021, 37, 4392–4398.
- Zhang, K.; Feng, W.; Wang, P. Identification of Spatially Variable Genes with Graph Cuts. Nat. Commun. 2022, 13, 5488.
- Xu, H.; Fu, H.; Long, Y.; Ang, K.S.; Sethi, R.; Chong, K.; Li, M.; Uddamvathanak, R.; Lee, H.K.; Ling, J.; et al. Unsupervised Spatially Embedded Deep Representation of Spatial Transcriptomics. Genome Med. 2024, 16, 12.
- Xu, Y.; McCord, R.P. CoSTA: Unsupervised Convolutional Neural Network Learning for Spatial Transcriptomics Analysis. BMC Bioinform. 2021, 22, 397.
- Dong, K.; Zhang, S. Deciphering Spatial Domains from Spatially Resolved Transcriptomics with an Adaptive Graph Attention Auto-Encoder. Nat. Commun. 2022, 13, 1739.
- Yuan, Y.; Bar-Joseph, Z. GCNG: Graph Convolutional Networks for Inferring Gene Interaction from Spatial Transcriptomics Data. Genome Biol. 2020, 21, 300.
- Li, Y.; Stanojevic, S.; Garmire, L.X. Emerging Artificial Intelligence Applications in Spatial Transcriptomics Analysis. Comput. Struct. Biotechnol. J. 2022, 20, 2895–2908.
- Kaur, H.; Heiser, C.N.; McKinley, E.T.; Ventura-Antunes, L.; Harris, C.R.; Roland, J.T.; Farrow, M.A.; Selden, H.J.; Pingry, E.L.; Moore, J.F.; et al. Consensus Tissue Domain Detection in Spatial Omics Data Using Multiplex Image Labeling with Regional Morphology (MILWRM). Commun. Biol. 2024, 7, 1295.
- Hoseini, S.S.; Dewar, R. Empowering Healthcare Professionals with No-Code Artificial Intelligence Platforms for Model Development, a Practical Demonstration for Pathology. Discoveries 2024, 12, e182.
- Tagra, H.; Batra, P. Dentistry 4.0: A Whole New Paradigm. Discoveries 2021, 4, e19.
- Kaushik, R.; Rapaka, R. A Patient-Centered Perspectives and Future Directions in AI-Powered Teledentistry. Discoveries 2024, 12, e199.
- Edpuganti, S.; Shamim, A.; Gangolli, V.H.; Weerasekara, R.A.D.K.N.W.; Yellamilli, A. Artificial Intelligence in Cardiovascular Imaging: Current Landscape, Clinical Impact, and Future Directions. Discoveries 2025, 13, e211.
- Zhao, Y.; Bucur, O.; Irshad, H.; Chen, F.; Weins, A.; Stancu, A.L.; Oh, E.Y.; Distasio, M.; Torous, V.; Glass, B.; et al. Nanoscale Imaging of Clinical Specimens Using Pathology-Optimized Expansion Microscopy. Nat. Biotechnol. 2017, 35, 757–764.
- Chen, J.; Wang, Y.; Ko, J. Single-Cell and Spatially Resolved Omics: Advances and Limitations. J. Pharm. Anal. 2023, 13, 833–835.
- Fan, Q.; Wang, Y.; Cheng, J.; Pan, B.; Zang, X.; Liu, R.; Deng, Y. Single-Cell RNA-Seq Reveals T Cell Exhaustion and Immune Response Landscape in Osteosarcoma. Front. Immunol. 2024, 15, 1362970.
- Antcliffe, D.B.; Harte, E.; Hussain, H.; Jiménez, B.; Browning, C.; Gordon, A.C. Metabolic Septic Shock Sub-Phenotypes, Stability over Time and Association with Clinical Outcome. Intensive Care Med. 2025, 51, 529–541.
- Yang, J.; Ou, F.; Li, B.; Zeng, L.; Chen, Q.; Gan, H.; Yu, J.; Guo, Q.; Feng, J.; Zhang, J. Machine Learning Based Screening of Biomarkers Associated with Cell Death and Immunosuppression of Multiple Life Stages Sepsis Populations. Sci. Rep. 2025, 15, 30302.
- Guo, F.; Zhu, X.; Wu, Z.; Zhu, L.; Wu, J.; Zhang, F. Clinical Applications of Machine Learning in the Survival Prediction and Classification of Sepsis: Coagulation and Heparin Usage Matter. J. Transl. Med. 2022, 20, 265.
- Gui, Y.; He, X.; Yu, J.; Jing, J. Artificial Intelligence-Assisted Transcriptomic Analysis to Advance Cancer Immunotherapy. J. Clin. Med. 2023, 12, 1279.
- Ravindran, U.; Gunavathi, C. Deep Learning Assisted Cancer Disease Prediction from Gene Expression Data Using WT-GAN. BMC Med. Inform. Decis. Mak. 2024, 24, 311.
- Yan, G.; Mingyang, G.; Wei, S.; Hongping, L.; Liyuan, Q.; Ailan, L.; Xiaomei, K.; Huilan, Z.; Juanjuan, Z.; Yan, Q. Diagnosis and Typing of Leukemia Using a Single Peripheral Blood Cell through Deep Learning. Cancer Sci. 2025, 116, 533–543.
- Lähnemann, D.; Köster, J.; Szczurek, E.; McCarthy, D.J.; Hicks, S.C.; Robinson, M.D.; Vallejos, C.A.; Campbell, K.R.; Beerenwinkel, N.; Mahfouz, A.; et al. Eleven Grand Challenges in Single-Cell Data Science. Genome Biol. 2020, 21, 31.
- Choudhary, S.; Satija, R. Comparison and Evaluation of Statistical Error Models for ScRNA-Seq. Genome Biol. 2022, 23, 27.
- Hwang, H.; Jeon, H.; Yeo, N.; Baek, D. Big Data and Deep Learning for RNA Biology. Exp. Mol. Med. 2024, 56, 1293–1321.
- Bhandari, N.; Khare, S.; Walambe, R.; Kotecha, K. Comparison of Machine Learning and Deep Learning Techniques in Promoter Prediction across Diverse Species. PeerJ Comput. Sci. 2021, 7, e365.



| Model | Target Analysis | Prediction Output | Advantages | Disadvantages |
|---|---|---|---|---|
| CpGenie [64] | CpG methylation status | Infers predictions regarding CpG sites based on regulatory regions (enhancers, promoters, etc.) | CNN architecture capable of single-nucleotide resolution (high sensitivity); learns directly from DNA sequence and can produce predictions for the effect of previously unknown variants; variant prioritization | Not suitable for analyzing long-distance genomic interactions; lower performance in CpG-poor regions; data dependency; an older model with less accurate performance than modern, transformer-based frameworks |
| MethylNet [76] | Methylation status at CpG sites and non-CpG sites | Enhanced prediction accuracy of methylation sites using additional chromatin data | Structured on VAEs, it can perform both supervised and unsupervised learning; multitasking model; can capture nonlinear relationships between CpG islands; high-fidelity data | Requires advanced computational resources; limited interpretability; sensitive to batch effects |
| DeepMethyl [77] | DNA methylation status (CpG and non-CpG sites) | A hybrid model (CNN-RNN) that evaluates the methylation levels across both CpG and non-CpG regions | Integrates 3D genomic data; has integrated denoising autoencoders; performs better than benchmark ML models | Accuracy decreases in the absence of data about the methylation status of neighboring regions; limited genomic coverage; data dependency; interpretability issues |
| DeepSignal [78] | CpG methylation | Distinguishes between methylated and non-methylated CpG based on nanopore sequencing data | Hybrid CNN and BiLSTM architecture, which ensures dual-feature extraction with higher accuracy than traditional ML algorithms; very efficient for low-depth sequencing data; performs better than bisulfite sequencing | High computational demand; outperformed by newer models; requires retraining; black box configuration limitations; data dependency |

| Framework | Architecture | Application | Reference |
|---|---|---|---|
| LASSO, XGBoost, GBM, Boruta, CoxBoost, survival-SVM | ML algorithms | Predicting T cell exhaustion signature from genes in osteosarcoma | Fan et al. [164] |
| SpiRiT | ViT | Predicting spatial gene expression profiles for human breast cancer | Zhao et al. [145] |
| DSTG | Graph-based convolutional network | Deconvolution of pancreatic cancer tissue sections | Song and Su [131] |
| STEM | MLP encoder; transfer learning | Characterization of tumor microenvironment on single-cell scale based on datasets of squamous cell carcinoma; retrieving gene expression patterns from liver sections | Hao et al. [147] |
| iMAP | GANs and deep autoencoders | Removes batch effects and identifies batch-specific cells | Gui et al. [168] |
| DARTS | DNN and a Bayesian hypothesis testing statistical model | Useful for studying the development of embryos and cancer metastasis | Zhang et al. [82] |
| WT-GAN | GAN | Predicting cancer types based on MGCED datasets | Ravindran and Gunavathi [169] |
| Seurat [82] | ML algorithm | Characterization of epithelial cell lineages in lung adenocarcinoma | Zhang et al. [82] |
| PMG training framework | 5 PMGs ViT-L model, Segment Anything Model, and ResNeXt framework | Differentiation between benign and malignant cells and leukemia typing | Yan et al. [170] |
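
The DSTG row above refers to deconvolution, that is, estimating which cell types contribute to the mixed expression profile of a spatial spot. As a minimal sketch of that general idea only (it is not the graph-based DSTG method, nor the method of any other tool in the table), the plain-Python example below fits non-negative mixing weights by projected gradient descent. The function name `deconvolve_spot`, the three cell-type labels, and all numbers are hypothetical and chosen purely for illustration.

```python
def deconvolve_spot(signatures, spot, steps=2000):
    """Estimate cell-type proportions for one spot (illustrative baseline).

    signatures: list of reference profiles, one per cell type, each a list of
                mean expression values over the same genes (assumed non-zero).
    spot:       list of observed expression values for those genes.
    Minimizes 0.5 * ||mixture - spot||^2 subject to weights >= 0, then
    rescales the weights so they sum to 1.
    """
    n_types = len(signatures)
    n_genes = len(spot)
    # Conservative step size: the sum of squared entries bounds the largest
    # eigenvalue of the Gram matrix, so 1 / that sum keeps the descent stable.
    step_size = 1.0 / sum(v * v for profile in signatures for v in profile)
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Difference between the current mixture and the observed spot.
        residual = [
            sum(weights[k] * signatures[k][g] for k in range(n_types)) - spot[g]
            for g in range(n_genes)
        ]
        # Gradient of the squared error with respect to each weight.
        gradient = [
            sum(signatures[k][g] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - step_size * d) for w, d in zip(weights, gradient)]
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Hypothetical reference profiles over four genes.
reference = {
    "tumor": [10.0, 1.0, 0.0, 5.0],
    "stroma": [0.0, 8.0, 6.0, 5.0],
    "immune": [2.0, 2.0, 9.0, 0.0],
}
# Synthetic spot built as 60% tumor + 40% stroma, with no immune signal.
spot_expression = [6.0, 3.8, 2.4, 5.0]

proportions = deconvolve_spot(list(reference.values()), spot_expression)
for cell_type, share in zip(reference, proportions):
    print(f"{cell_type}: {share:.3f}")
```

On this noise-free toy input the estimate should land close to the 0.6 / 0.4 / 0 mixture used to build the spot. Real spots are noisy and sparse, which is one reason the tools listed above rely on richer models than this linear baseline.
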
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Pălăștea, E.A.; Matache, I.-M.; Radu, E.; Henegariu, O.; Bucur, O. AI-Based Prediction of Gene Expression in Single-Cell and Multiscale Genomics and Transcriptomics. Int. J. Mol. Sci. 2026, 27, 801. https://doi.org/10.3390/ijms27020801

