Integration of Bulk and Single-Cell RNA Sequencing Analyses in Biomedicine
Abstract
1. Introduction
2. Methods of Transcriptome Analysis
2.1. Bulk RNA Sequencing: Tissue-Level Profiling
2.2. Single-Cell RNA Sequencing: Diversity of Cell Types
2.3. Strengths and Limitations of Each Approach
3. Data Integration
3.1. Importance of Integrating Bulk and Single-Cell RNA Sequencing Data
3.2. Major Approaches to Integrating Single-Cell and Bulk RNA Sequencing Data: Deconvolution and the Pseudobulk Strategy
3.3. Practical Applications of Integrative Analysis
4. Bioinformatic Tools for Integration
4.1. Deconvolution Methods
Practical Considerations for Method Selection and Unresolved Challenges
4.2. scRNAseq Reference Preparation Tools for Bulk–Single-Cell Integration
4.3. Example of an Analytical Pipeline for Integrating Bulk RNAseq and scRNAseq Data
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Wang, Z.; Gerstein, M.; Snyder, M. RNA-Seq: A revolutionary tool for transcriptomics. Nat. Rev. Genet. 2009, 10, 57–63. [Google Scholar] [CrossRef] [PubMed]
- Conesa, A.; Madrigal, P.; Tarazona, S.; Gomez-Cabrero, D.; Cervera, A.; McPherson, A.; Szcześniak, M.W.; Gaffney, D.J.; Elo, L.L.; Zhang, X. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016, 17, 13, Erratum in Genome Biol. 2016, 17, 181. [Google Scholar] [CrossRef] [PubMed]
- Tang, F.; Barbacioru, C.; Wang, Y.; Nordman, E.; Lee, C.; Xu, N.; Wang, X.; Bodeau, J.; Tuch, B.B.; Siddiqui, A. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 2009, 6, 377–382. [Google Scholar] [CrossRef] [PubMed]
- Lähnemann, D.; Köster, J.; Szczurek, E.; McCarthy, D.J.; Hicks, S.C.; Robinson, M.D.; Vallejos, C.A.; Campbell, K.R.; Beerenwinkel, N.; Mahfouz, A. Eleven grand challenges in single-cell data science. Genome Biol. 2020, 21, 31. [Google Scholar] [CrossRef]
- Avila Cobos, F.; Vandesompele, J.; Mestdagh, P.; De Preter, K. Computational deconvolution of transcriptomics data from mixed cell populations. Bioinformatics 2018, 34, 1969–1979. [Google Scholar] [CrossRef]
- Newman, A.M.; Steen, C.B.; Liu, C.L.; Gentles, A.J.; Chaudhuri, A.A.; Scherer, F.; Khodadoust, M.S.; Esfahani, M.S.; Luca, B.A.; Steiner, D. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 2019, 37, 773–782. [Google Scholar] [CrossRef]
- Chu, T.; Wang, Z.; Pe’er, D.; Danko, C.G. Cell type and gene expression deconvolution with BayesPrism enables Bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology. Nat. Cancer 2022, 3, 505–517. [Google Scholar] [CrossRef]
- Merotto, L.; Zopoglou, M.; Zackl, C.; Finotello, F. Next-generation deconvolution of transcriptomic data to investigate the tumor microenvironment. Int. Rev. Cell Mol. Biol. 2024, 382, 103–143. [Google Scholar] [CrossRef]
- Zaitsev, K.; Bambouskova, M.; Swain, A.; Artyomov, M.N. Complete deconvolution of cellular mixtures based on linearity of transcriptional signatures. Nat. Commun. 2019, 10, 2209. [Google Scholar] [CrossRef]
- Plattner, C.; Finotello, F.; Rieder, D. Deconvoluting tumor-infiltrating immune cells from RNA-seq data using quanTIseq. In Methods in Enzymology; Elsevier: Amsterdam, The Netherlands, 2020; Volume 636, pp. 261–285. [Google Scholar] [CrossRef]
- Kelley, K.W.; Nakao-Inoue, H.; Molofsky, A.V.; Oldham, M.C. Variation among intact tissue samples reveals the core transcriptional features of human CNS cell classes. Nat. Neurosci. 2018, 21, 1171–1184. [Google Scholar] [CrossRef]
- Regev, A.; Teichmann, S.A.; Lander, E.S.; Amit, I.; Benoist, C.; Birney, E.; Bodenmiller, B.; Campbell, P.; Carninci, P.; Clatworthy, M. The human cell atlas. eLife 2017, 6, e27041. [Google Scholar] [CrossRef]
- Denisenko, E.; Guo, B.B.; Jones, M.; Hou, R.; De Kock, L.; Lassmann, T.; Poppe, D.; Clément, O.; Simmons, R.K.; Lister, R. Systematic assessment of tissue dissociation and storage biases in single-cell and single-nucleus RNA-seq workflows. Genome Biol. 2020, 21, 130. [Google Scholar] [CrossRef] [PubMed]
- Haque, A.; Engel, J.; Teichmann, S.A.; Lönnberg, T. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications. Genome Med. 2017, 9, 75. [Google Scholar] [CrossRef] [PubMed]
- Steen, C.B.; Liu, C.L.; Alizadeh, A.A.; Newman, A.M. Profiling cell type abundance and expression in bulk tissues with CIBERSORTx. In Stem Cell Transcriptional Networks: Methods and Protocols; Springer: Berlin/Heidelberg, Germany, 2020; pp. 135–157. [Google Scholar] [CrossRef]
- Charoentong, P.; Finotello, F.; Angelova, M.; Mayer, C.; Efremova, M.; Rieder, D.; Hackl, H.; Trajanoski, Z. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade. Cell Rep. 2017, 18, 248–262. [Google Scholar] [CrossRef]
- Petitprez, F.; Meylan, M.; de Reyniès, A.; Sautès-Fridman, C.; Fridman, W.H. The tumor microenvironment in the response to immune checkpoint blockade therapies. Front. Immunol. 2020, 11, 784. [Google Scholar] [CrossRef] [PubMed]
- Johnson, T.S.; Xiang, S.; Dong, T.; Huang, Z.; Cheng, M.; Wang, T.; Yang, K.; Ni, D.; Huang, K.; Zhang, J. Combinatorial analyses reveal cellular composition changes have different impacts on transcriptomic changes of cell type specific genes in Alzheimer’s Disease. Sci. Rep. 2021, 11, 353. [Google Scholar] [CrossRef]
- Wu, X.; Zhao, X.; Xiong, Y.; Zheng, M.; Zhong, C.; Zhou, Y. Deciphering cell-type-specific gene expression signatures of cardiac diseases through reconstruction of bulk transcriptomes. Front. Cell Dev. Biol. 2022, 10, 792774. [Google Scholar] [CrossRef]
- Chen, G.; Ning, B.; Shi, T. Single-cell RNA-seq technologies and related computational data analysis. Front. Genet. 2019, 10, 317. [Google Scholar] [CrossRef]
- Cai, B.; Zhang, J.; Li, H.; Su, C.; Zhao, H. Statistical inference of cell-type proportions estimated from bulk expression data. J. Am. Stat. Assoc. 2024, 119, 2521–2532. [Google Scholar] [CrossRef]
- Tzec-Interián, J.A.; González-Padilla, D.; Góngora-Castillo, E.B. Bioinformatics perspectives on transcriptomics: A comprehensive review of bulk and single-cell RNA sequencing analyses. Quant. Biol. 2025, 13, e78. [Google Scholar] [CrossRef]
- Li, X.; Wang, C.-Y. From bulk, single-cell to spatial RNA sequencing. Int. J. Oral Sci. 2021, 13, 36. [Google Scholar] [CrossRef] [PubMed]
- Mou, Z.; Harries, L.W. Integration of single-cell and bulk RNA-sequencing data reveals the prognostic potential of epithelial gene markers for prostate cancer. Mol. Oncol. 2025, 19, 1811–1835. [Google Scholar] [CrossRef] [PubMed]
- Mortazavi, A.; Williams, B.A.; McCue, K.; Schaeffer, L.; Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 2008, 5, 621–628. [Google Scholar] [CrossRef] [PubMed]
- Nagalakshmi, U.; Wang, Z.; Waern, K.; Shou, C.; Raha, D.; Gerstein, M.; Snyder, M. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 2008, 320, 1344–1349. [Google Scholar] [CrossRef]
- Borisov, N.; Tkachev, V.; Simonov, A.; Sorokin, M.; Kim, E.; Kuzmin, D.; Karademir-Yilmaz, B.; Buzdin, A. Uniformly shaped harmonization combines human transcriptomic data from different platforms while retaining their biological properties and differential gene expression patterns. Front. Mol. Biosci. 2023, 10, 1237129. [Google Scholar] [CrossRef]
- Modi, A.; Vai, S.; Caramelli, D.; Lari, M. The Illumina sequencing protocol and the NovaSeq 6000 system. In Bacterial Pangenomics: Methods and Protocols; Springer: Berlin/Heidelberg, Germany, 2021; pp. 15–42. [Google Scholar] [CrossRef]
- Lu, H.; Giordano, F.; Ning, Z. Oxford Nanopore MinION sequencing and genome assembly. Genom. Proteom. Bioinform. 2016, 14, 265–279. [Google Scholar] [CrossRef]
- Rhoads, A.; Au, K.F. PacBio sequencing and its applications. Genom. Proteom. Bioinform. 2015, 13, 278–289. [Google Scholar] [CrossRef]
- Khilal, N.; Suntsova, M.; Knyazev, D.; Guryanova, A.; Kovaleva, T.; Sorokin, M.; Buzdin, A.; Katkova, N. Adaptation and Experimental Validation of Clinical RNA Sequencing Protocol Oncobox for MGI DNBSEQ-G50 Platform. Biochem. (Mosc.) Suppl. Ser. B Biomed. Chem. 2023, 17, 172–182. [Google Scholar] [CrossRef]
- Dobin, A.; Davis, C.A.; Schlesinger, F.; Drenkow, J.; Zaleski, C.; Jha, S.; Batut, P.; Chaisson, M.; Gingeras, T.R. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 2013, 29, 15–21. [Google Scholar] [CrossRef]
- Stark, R.; Grzelak, M.; Hadfield, J. RNA sequencing: The teenage years. Nat. Rev. Genet. 2019, 20, 631–656. [Google Scholar] [CrossRef]
- Jeon, H.; Xie, J.; Jeon, Y.; Jung, K.J.; Gupta, A.; Chang, W.; Chung, D. Statistical power analysis for designing bulk, single-cell, and spatial transcriptomics experiments: Review, tutorial, and perspectives. Biomolecules 2023, 13, 221. [Google Scholar] [CrossRef] [PubMed]
- Choi, J.; Hyun, J.; Hyun, J.; Kim, J.-H.; Lee, J.H.; Bang, D. Cost and time-efficient construction of a 3′-end mRNA library from unpurified bulk RNA in a single tube. Exp. Mol. Med. 2024, 56, 453–460. [Google Scholar] [CrossRef] [PubMed]
- Buzdin, A.; Sorokin, M.; Garazha, A.; Glusker, A.; Aleshin, A.; Poddubskaya, E.; Sekacheva, M.; Kim, E.; Gaifullin, N.; Giese, A. RNA sequencing for research and diagnostics in clinical oncology. Semin. Cancer Biol. 2020, 60, 311–323. [Google Scholar] [CrossRef] [PubMed]
- Suntsova, M.; Gaifullin, N.; Allina, D.; Reshetun, A.; Li, X.; Mendeleeva, L.; Surin, V.; Sergeeva, A.; Spirin, P.; Prassolov, V. Atlas of RNA sequencing profiles for normal human tissues. Sci. Data 2019, 6, 36. [Google Scholar] [CrossRef] [PubMed]
- Weinstein, J.N.; Collisson, E.A.; Mills, G.B.; Shaw, K.R.; Ozenberger, B.A.; Ellrott, K.; Shmulevich, I.; Sander, C.; Stuart, J.M. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 2013, 45, 1113–1120. [Google Scholar] [CrossRef]
- Consortium, G. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020, 369, 1318–1330. [Google Scholar] [CrossRef]
- Barrett, T.; Wilhite, S.E.; Ledoux, P.; Evangelista, C.; Kim, I.F.; Tomashevsky, M.; Marshall, K.A.; Phillippy, K.H.; Sherman, P.M.; Holko, M. NCBI GEO: Archive for functional genomics data sets—Update. Nucleic Acids Res. 2012, 41, D991–D995. [Google Scholar] [CrossRef]
- Ståhl, P.L.; Salmén, F.; Vickovic, S.; Lundmark, A.; Navarro, J.F.; Magnusson, J.; Giacomello, S.; Asp, M.; Westholm, J.O.; Huss, M. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 2016, 353, 78–82. [Google Scholar] [CrossRef]
- Quail, D.F.; Joyce, J.A. Microenvironmental regulation of tumor progression and metastasis. Nat. Med. 2013, 19, 1423–1437. [Google Scholar] [CrossRef]
- Sorokin, M.; Buzdin, A.A.; Guryanova, A.; Efimov, V.; Suntsova, M.V.; Zolotovskaia, M.A.; Koroleva, E.V.; Sekacheva, M.I.; Tkachev, V.S.; Garazha, A. Large-scale assessment of pros and cons of autopsy-derived or tumor-matched tissues as the norms for gene expression analysis in cancers. Comput. Struct. Biotechnol. J. 2023, 21, 3964–3986. [Google Scholar] [CrossRef]
- Svensson, V.; Vento-Tormo, R.; Teichmann, S.A. Exponential scaling of single-cell RNA-seq in the past decade. Nat. Protoc. 2018, 13, 599–604. [Google Scholar] [CrossRef] [PubMed]
- Trapnell, C.; Cacchiarelli, D.; Grimsby, J.; Pokharel, P.; Li, S.; Morse, M.; Lennon, N.J.; Livak, K.J.; Mikkelsen, T.S.; Rinn, J.L. Pseudo-temporal ordering of individual cells reveals dynamics and regulators of cell fate decisions. Nat. Biotechnol. 2014, 32, 381. [Google Scholar] [CrossRef] [PubMed]
- Shalek, A.K.; Satija, R.; Shuga, J.; Trombetta, J.J.; Gennert, D.; Lu, D.; Chen, P.; Gertner, R.S.; Gaublomme, J.T.; Yosef, N. Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 2014, 510, 363–369. [Google Scholar] [CrossRef] [PubMed]
- Nieto, M.A.; Huang, R.Y.-J.; Jackson, R.A.; Thiery, J.P. EMT: 2016. Cell 2016, 166, 21–45. [Google Scholar] [CrossRef]
- Marusyk, A.; Almendro, V.; Polyak, K. Intra-tumour heterogeneity: A looking glass for cancer? Nat. Rev. Cancer 2012, 12, 323–334. [Google Scholar] [CrossRef]
- Kim, E.L.; Sorokin, M.; Kantelhardt, S.R.; Kalasauskas, D.; Sprang, B.; Fauss, J.; Ringel, F.; Garazha, A.; Albert, E.; Gaifullin, N. Intratumoral heterogeneity and longitudinal changes in gene expression predict differential drug sensitivity in newly diagnosed and recurrent glioblastoma. Cancers 2020, 12, 520. [Google Scholar] [CrossRef]
- Armingol, E.; Officer, A.; Harismendy, O.; Lewis, N.E. Deciphering cell–cell interactions and communication from gene expression. Nat. Rev. Genet. 2021, 22, 71–88. [Google Scholar] [CrossRef]
- Newman, A.M.; Liu, C.L.; Green, M.R.; Gentles, A.J.; Feng, W.; Xu, Y.; Hoang, C.D.; Diehn, M.; Alizadeh, A.A. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 2015, 12, 453–457. [Google Scholar] [CrossRef]
- Racle, J.; De Jonge, K.; Baumgaertner, P.; Speiser, D.E.; Gfeller, D. Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. eLife 2017, 6, e26476. [Google Scholar] [CrossRef]
- Racle, J.; Gfeller, D. EPIC: A tool to estimate the proportions of different cell types from bulk gene expression data. In Bioinformatics for Cancer Immunotherapy: Methods and Protocols; Springer: Berlin/Heidelberg, Germany, 2020; pp. 233–248. [Google Scholar] [CrossRef]
- Jew, B.; Alvarez, M.; Rahmani, E.; Miao, Z.; Ko, A.; Garske, K.M.; Sul, J.H.; Pietiläinen, K.H.; Pajukanta, P.; Halperin, E. Accurate estimation of cell composition in bulk expression through robust integration of single-cell information. Nat. Commun. 2020, 11, 1971, Correction in Nat. Commun. 2020, 11, 2891. [Google Scholar] [CrossRef]
- Islam, S.; Zeisel, A.; Joost, S.; La Manno, G.; Zajac, P.; Kasper, M.; Lönnerberg, P.; Linnarsson, S. Quantitative single-cell RNA-seq with unique molecular identifiers. Nat. Methods 2014, 11, 163–166. [Google Scholar] [CrossRef] [PubMed]
- Hagemann-Jensen, M.; Ziegenhain, C.; Chen, P.; Ramsköld, D.; Hendriks, G.-J.; Larsson, A.J.; Faridani, O.R.; Sandberg, R. Single-cell RNA counting at allele and isoform resolution using Smart-seq3. Nat. Biotechnol. 2020, 38, 708–714. [Google Scholar] [CrossRef] [PubMed]
- Picelli, S.; Faridani, O.R.; Björklund, Å.K.; Winberg, G.; Sagasser, S.; Sandberg, R. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 2014, 9, 171–181. [Google Scholar] [CrossRef] [PubMed]
- Macosko, E.Z.; Basu, A.; Satija, R.; Nemesh, J.; Shekhar, K.; Goldman, M.; Tirosh, I.; Bialas, A.R.; Kamitaki, N.; Martersteck, E.M. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 2015, 161, 1202–1214. [Google Scholar] [CrossRef]
- Wang, T.; Roach, M.J.; Harvey, K.; Morlanes, J.E.; Kiedik, B.; Al-Eryani, G.; Greenwald, A.; Kalavros, N.; Dezem, F.S.; Ma, Y. snPATHO-seq, a versatile FFPE single-nucleus RNA sequencing method to unlock pathology archives. Commun. Biol. 2024, 7, 1340. [Google Scholar] [CrossRef]
- Villani, A.-C.; Satija, R.; Reynolds, G.; Sarkizova, S.; Shekhar, K.; Fletcher, J.; Griesbeck, M.; Butler, A.; Zheng, S.; Lazo, S. Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors. Science 2017, 356, eaah4573. [Google Scholar] [CrossRef]
- Miller, B.C.; Sen, D.R.; Al Abosy, R.; Bi, K.; Virkud, Y.V.; LaFleur, M.W.; Yates, K.B.; Lako, A.; Felt, K.; Naik, G.S. Subsets of exhausted CD8+ T cells differentially mediate tumor control and respond to checkpoint blockade. Nat. Immunol. 2019, 20, 326–336, Correction in Nat. Immunol. 2019, 20, 1556. [Google Scholar] [CrossRef]
- Cuevas-Diaz Duran, R.; Wei, H.; Wu, J. Data normalization for addressing the challenges in the analysis of single-cell transcriptomic datasets. BMC Genom. 2024, 25, 444. [Google Scholar] [CrossRef]
- Kharchenko, P.V.; Silberstein, L.; Scadden, D.T. Bayesian approach to single-cell differential expression analysis. Nat. Methods 2014, 11, 740–742. [Google Scholar] [CrossRef]
- Svensson, V. Droplet scRNA-seq is not zero-inflated. Nat. Biotechnol. 2020, 38, 147–150. [Google Scholar] [CrossRef]
- van den Brink, S.C.; Sage, F.; Vértesy, Á.; Spanjaard, B.; Peterson-Maduro, J.; Baron, C.S.; Robin, C.; Van Oudenaarden, A. Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations. Nat. Methods 2017, 14, 935–936. [Google Scholar] [CrossRef]
- Habib, N.; Avraham-Davidi, I.; Basu, A.; Burks, T.; Shekhar, K.; Hofree, M.; Choudhury, S.R.; Aguet, F.; Gelfand, E.; Ardlie, K. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 2017, 14, 955–958. [Google Scholar] [CrossRef]
- Lake, B.B.; Ai, R.; Kaeser, G.E.; Salathia, N.S.; Yung, Y.C.; Liu, R.; Wildberg, A.; Gao, D.; Fung, H.-L.; Chen, S. Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain. Science 2016, 352, 1586–1590. [Google Scholar] [CrossRef] [PubMed]
- Zheng, G.X.; Terry, J.M.; Belgrader, P.; Ryvkin, P.; Bent, Z.W.; Wilson, R.; Ziraldo, S.B.; Wheeler, T.D.; McDermott, G.P.; Zhu, J. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017, 8, 14049. [Google Scholar] [CrossRef]
- Manzoor, F.; Tsurgeon, C.A.; Gupta, V. Exploring RNA-Seq data analysis through visualization techniques and tools: A systematic review of opportunities and limitations for clinical applications. Bioengineering 2025, 12, 56. [Google Scholar] [CrossRef]
- Deaton, A.M.; Webb, S.; Kerr, A.R.; Illingworth, R.S.; Guy, J.; Andrews, R.; Bird, A. Cell type–specific DNA methylation at intragenic CpG islands in the immune system. Genome Res. 2011, 21, 1074–1086. [Google Scholar] [CrossRef] [PubMed]
- Vieth, B.; Ziegenhain, C.; Parekh, S.; Enard, W.; Hellmann, I. powsimR: Power analysis for bulk and single cell RNA-seq experiments. Bioinformatics 2017, 33, 3486–3488. [Google Scholar] [CrossRef] [PubMed]
- SEQC/MAQC-III Consortium. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat. Biotechnol. 2014, 32, 903–914. [Google Scholar] [CrossRef]
- Mereu, E.; Lafzi, A.; Moutinho, C.; Ziegenhain, C.; McCarthy, D.J.; Álvarez-Varela, A.; Batlle, E.; Sagar, N.; Gruen, D.; Lau, J.K. Benchmarking single-cell RNA-sequencing protocols for cell atlas projects. Nat. Biotechnol. 2020, 38, 747–755. [Google Scholar] [CrossRef]
- Zhao, W.; He, X.; Hoadley, K.A.; Parker, J.S.; Hayes, D.N.; Perou, C.M. Comparison of RNA-Seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genom. 2014, 15, 419. [Google Scholar] [CrossRef]
- Hegenbarth, J.-C.; Lezzoche, G.; De Windt, L.J.; Stoll, M. Perspectives on bulk-tissue RNA sequencing and single-cell RNA sequencing for cardiac transcriptomics. Front. Mol. Med. 2022, 2, 839338. [Google Scholar] [CrossRef]
- Liu, J.; Lichtenberg, T.; Hoadley, K.A.; Poisson, L.M.; Lazar, A.J.; Cherniack, A.D.; Kovatich, A.J.; Benz, C.C.; Levine, D.A.; Lee, A.V. An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics. Cell 2018, 173, 400–416.e411. [Google Scholar] [CrossRef]
- Lonsdale, J.; Thomas, J.; Salvatore, M.; Phillips, R.; Lo, E.; Shad, S.; Hasz, R.; Walters, G.; Garcia, F.; Young, N. The genotype-tissue expression (GTEx) project. Nat. Genet. 2013, 45, 580–585. [Google Scholar] [CrossRef] [PubMed]
- Slyper, M.; Porter, C.B.; Ashenberg, O.; Waldman, J.; Drokhlyansky, E.; Wakiro, I.; Smillie, C.; Smith-Rosario, G.; Wu, J.; Dionne, D. A single-cell and single-nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat. Med. 2020, 26, 792–802, Correction in Nat. Med. 2020, 26, 1307. [Google Scholar] [CrossRef] [PubMed]
- Vladimirova, U.; Rumiantsev, P.; Zolotovskaia, M.; Albert, E.; Abrosimov, A.; Slashchuk, K.; Nikiforovich, P.; Chukhacheva, O.; Gaifullin, N.; Suntsova, M. DNA repair pathway activation features in follicular and papillary thyroid tumors, interrogated using 95 experimental RNA sequencing profiles. Heliyon 2021, 7, e06408. [Google Scholar] [CrossRef] [PubMed]
- Shen-Orr, S.S.; Gaujoux, R. Computational deconvolution: Extracting cell type-specific information from heterogeneous samples. Curr. Opin. Immunol. 2013, 25, 571–578. [Google Scholar] [CrossRef]
- Heumos, L.; Schaar, A.C.; Lance, C.; Litinetskaya, A.; Drost, F.; Zappia, L.; Lücken, M.D.; Strobl, D.C.; Henao, J.; Curion, F. Best practices for single-cell analysis across modalities. Nat. Rev. Genet. 2023, 24, 550–572. [Google Scholar] [CrossRef]
- Sorokin, M.; Ignatev, K.; Poddubskaya, E.; Vladimirova, U.; Gaifullin, N.; Lantsov, D.; Garazha, A.; Allina, D.; Suntsova, M.; Barbara, V. RNA sequencing in comparison to immunohistochemistry for measuring cancer biomarkers in breast cancer and lung cancer specimens. Biomedicines 2020, 8, 114. [Google Scholar] [CrossRef]
- Sorokin, M.; Garazha, A.; Suntsova, M.; Tkachev, V.; Poddubskaya, E.; Gaifullin, N.; Sushinskaya, T.; Lantsov, D.; Borisov, V.; Naskhletashvili, D. Prospective trial of the Oncobox platform RNA sequencing bioinformatic analysis for personalized prescription of targeted drugs. Comput. Biol. Med. 2025, 187, 109716. [Google Scholar] [CrossRef]
- Donovan, M.K.; D’Antonio-Chronowska, A.; D’Antonio, M.; Frazer, K.A. Cellular deconvolution of GTEx tissues powers discovery of disease and cell-type associated regulatory variants. Nat. Commun. 2020, 11, 955, Correction in Nat. Commun. 2020, 11, 4426. [Google Scholar] [CrossRef]
- Moses, L.; Pachter, L. Museum of spatial transcriptomics. Nat. Methods 2022, 19, 534–546, Correction in Nat. Methods 2022, 19, 628. [Google Scholar] [CrossRef] [PubMed]
- Lim, H.J.; Wang, Y.; Buzdin, A.; Li, X. A practical guide for choosing an optimal spatial transcriptomics technology from seven major commercially available options. BMC Genom. 2025, 26, 47. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y.; Liu, B.; Zhao, G.; Lee, Y.; Buzdin, A.; Mu, X.; Zhao, J.; Chen, H.; Li, X. Spatial transcriptomics: Technologies, applications and experimental considerations. Genomics 2023, 115, 110671. [Google Scholar] [CrossRef] [PubMed]
- Kleshchevnikov, V.; Shmatko, A.; Dann, E.; Aivazidis, A.; King, H.W.; Li, T.; Elmentaite, R.; Lomakin, A.; Kedlian, V.; Gayoso, A. Cell2location maps fine-grained cell types in spatial transcriptomics. Nat. Biotechnol. 2022, 40, 661–671. [Google Scholar] [CrossRef]
- Biancalani, T.; Scalia, G.; Buffoni, L.; Avasthi, R.; Lu, Z.; Sanger, A.; Tokcan, N.; Vanderburg, C.R.; Segerstolpe, Å.; Zhang, M. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat. Methods 2021, 18, 1352–1362. [Google Scholar] [CrossRef]
- Stoeckius, M.; Hafemeister, C.; Stephenson, W.; Houck-Loomis, B.; Chattopadhyay, P.K.; Swerdlow, H.; Satija, R.; Smibert, P. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 2017, 14, 865–868. [Google Scholar] [CrossRef]
- Swanson, E.; Lord, C.; Reading, J.; Heubeck, A.T.; Genge, P.C.; Thomson, Z.; Weiss, M.D.; Li, X.-j.; Savage, A.K.; Green, R.R. Simultaneous trimodal single-cell measurement of transcripts, epitopes, and chromatin accessibility using TEA-seq. eLife 2021, 10, e63632. [Google Scholar] [CrossRef]
- Cao, Z.-J.; Gao, G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat. Biotechnol. 2022, 40, 1458–1466. [Google Scholar] [CrossRef]
- Erhard, F.; Baptista, M.A.; Krammer, T.; Hennig, T.; Lange, M.; Arampatzi, P.; Jürges, C.S.; Theis, F.J.; Saliba, A.-E.; Dölken, L. scSLAM-seq reveals core features of transcription dynamics in single cells. Nature 2019, 571, 419–423. [Google Scholar] [CrossRef]
- Tirosh, I.; Izar, B.; Prakadan, S.M.; Wadsworth, M.H.; Treacy, D.; Trombetta, J.J.; Rotem, A.; Rodman, C.; Lian, C.; Murphy, G. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 2016, 352, 189–196. [Google Scholar] [CrossRef]
- Wang, X.; Park, J.; Susztak, K.; Zhang, N.R.; Li, M. Bulk tissue cell type deconvolution with multi-subject single-cell expression reference. Nat. Commun. 2019, 10, 380. [Google Scholar] [CrossRef]
- Zaitsev, A.; Chelushkin, M.; Dyikanov, D.; Cheremushkin, I.; Shpak, B.; Nomie, K.; Zyrin, V.; Nuzhdina, E.; Lozinsky, Y.; Zotova, A. Precise reconstruction of the TME using bulk RNA-seq and a machine learning algorithm trained on artificial transcriptomes. Cancer Cell 2022, 40, 879–894.e816. [Google Scholar] [CrossRef] [PubMed]
- Menden, K.; Marouf, M.; Oller, S.; Dalmia, A.; Magruder, D.S.; Kloiber, K.; Heutink, P.; Bonn, S. Deep learning–based cell composition analysis from tissue expression profiles. Sci. Adv. 2020, 6, eaba2619. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Wu, H. TOAST: Improving reference-free cell composition estimation by cross-cell type differential analysis. Genome Biol. 2019, 20, 190. [Google Scholar] [CrossRef] [PubMed]
- Yang, X.; Zhao, F.; Ren, T.; Chen, C.; Byrne, K.T.; Danilov, A.V.; Sears, R.C.; Nelson, P.S.; Coussens, L.M.; Mills, G.B. OmicsTweezer: A distribution-independent cell deconvolution model for multi-omics data. Cell Genom. 2025, 5, 100950. [Google Scholar] [CrossRef]
- Feng, S.; Huang, L.; Pournara, A.V.; Huang, Z.; Yang, X.; Zhang, Y.; Brazma, A.; Shi, M.; Papatheodorou, I.; Miao, Z. Alleviating batch effects in cell type deconvolution with SCCAF-D. Nat. Commun. 2024, 15, 10867. [Google Scholar] [CrossRef]
- Dietrich, A.; Sturm, G.; Merotto, L.; Marini, F.; Finotello, F.; List, M. SimBu: Bias-aware simulation of bulk RNA-seq data with variable cell-type composition. Bioinformatics 2022, 38, ii141–ii147. [Google Scholar] [CrossRef]
- Squair, J.W.; Gautier, M.; Kathe, C.; Anderson, M.A.; James, N.D.; Hutson, T.H.; Hudelle, R.; Qaiser, T.; Matson, K.J.; Barraud, Q. Confronting false discoveries in single-cell differential expression. Nat. Commun. 2021, 12, 5692. [Google Scholar] [CrossRef]
- Crowell, H.L.; Soneson, C.; Germain, P.-L.; Calini, D.; Collin, L.; Raposo, C.; Malhotra, D.; Robinson, M.D. Muscat detects subpopulation-specific state transitions from multi-sample multi-condition single-cell transcriptomics data. Nat. Commun. 2020, 11, 6077. [Google Scholar] [CrossRef]
- Cobos, F.A.; Panah, M.J.N.; Epps, J.; Long, X.; Man, T.-K.; Chiu, H.-S.; Chomsky, E.; Kiner, E.; Krueger, M.J.; di Bernardo, D. Effective methods for bulk RNA-seq deconvolution using scnRNA-seq transcriptomes. Genome Biol. 2023, 24, 177. [Google Scholar] [CrossRef]
- Avila Cobos, F.; Alquicira-Hernandez, J.; Powell, J.E.; Mestdagh, P.; De Preter, K. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat. Commun. 2020, 11, 5650, Correction in Nat. Commun. 2020, 11, 6291. [Google Scholar] [CrossRef] [PubMed]
- Ghaffari, S.; Bouchonville, K.J.; Saleh, E.; Schmidt, R.E.; Offer, S.M.; Sinha, S. BEDwARS: A robust Bayesian approach to bulk gene expression deconvolution with noisy reference signatures. Genome Biol. 2023, 24, 178. [Google Scholar] [CrossRef] [PubMed]
- Nadel, B.B.; Oliva, M.; Shou, B.L.; Mitchell, K.; Ma, F.; Montoya, D.J.; Mouton, A.; Kim-Hellmuth, S.; Stranger, B.E.; Pellegrini, M. Systematic evaluation of transcriptomics-based deconvolution methods and references using thousands of clinical samples. Brief. Bioinform. 2021, 22, bbab265. [Google Scholar] [CrossRef]
- Hu, M.; Chikina, M. InstaPrism: An R package for fast implementation of BayesPrism. Bioinformatics 2024, 40, btae440. [Google Scholar] [CrossRef]
- Petitprez, F.; De Reyniés, A.; Keung, E.Z.; Chen, T.W.-W.; Sun, C.-M.; Calderaro, J.; Jeng, Y.-M.; Hsiao, L.-P.; Lacroix, L.; Bougoüin, A. B cells are associated with survival and immunotherapy response in sarcoma. Nature 2020, 577, 556–560. [Google Scholar] [CrossRef]
- Thorsson, V.; Gibbs, D.L.; Brown, S.D.; Wolf, D.; Bortone, D.S.; Yang, T.-H.O.; Porta-Pardo, E.; Gao, G.F.; Plaisier, C.L.; Eddy, J.A. The immune landscape of cancer. Immunity 2018, 48, 812–830.e814, Erratum in Immunity 2019, 51, 411–412. [Google Scholar] [CrossRef]
- Jerby-Arnon, L.; Shah, P.; Cuoco, M.S.; Rodman, C.; Su, M.-J.; Melms, J.C.; Leeson, R.; Kanodia, A.; Mei, S.; Lin, J.-R. A cancer cell program promotes T cell exclusion and resistance to checkpoint blockade. Cell 2018, 175, 984–997.e924. [Google Scholar] [CrossRef]
- Kagohara, L.T.; Zamuner, F.; Davis-Marcisak, E.F.; Sharma, G.; Considine, M.; Allen, J.; Yegnasubramanian, S.; Gaykalova, D.A.; Fertig, E.J. Integrated single-cell and bulk gene expression and ATAC-seq reveals heterogeneity and early changes in pathways associated with resistance to cetuximab in HNSCC-sensitive cell lines. Br. J. Cancer 2020, 123, 101–113, Correction in Br. J. Cancer 2020, 123, 101–113. [Google Scholar] [CrossRef]
- Joanito, I.; Wirapati, P.; Zhao, N.; Nawaz, Z.; Yeo, G.; Lee, F.; Eng, C.L.; Macalinao, D.C.; Kahraman, M.; Srinivasan, H. Single-cell and bulk transcriptome sequencing identifies two epithelial tumor cell states and refines the consensus molecular classification of colorectal cancer. Nat. Genet. 2022, 54, 963–975. [Google Scholar] [CrossRef]
- Mathys, H.; Davila-Velderrain, J.; Peng, Z.; Gao, F.; Mohammadi, S.; Young, J.Z.; Menon, M.; He, L.; Abdurrob, F.; Jiang, X. Single-cell transcriptomic analysis of Alzheimer’s disease. Nature 2019, 570, 332–337, Correction in Nature 2019, 571, E1. [Google Scholar] [CrossRef]
- Consens, M.E.; Chen, Y.; Menon, V.; Wang, Y.; Schneider, J.A.; De Jager, P.L.; Bennett, D.A.; Tripathy, S.J.; Felsky, D. Bulk and single-nucleus transcriptomics highlight intra-telencephalic and somatostatin neurons in Alzheimer’s disease. Front. Mol. Neurosci. 2022, 15, 903175. [Google Scholar] [CrossRef] [PubMed]
- Skene, N.G.; Grant, S.G. Identification of vulnerable cell types in major brain disorders using single cell transcriptomes and expression weighted cell type enrichment. Front. Neurosci. 2016, 10, 16. [Google Scholar] [CrossRef] [PubMed]
- Antunes, A.S.; Martins-de-Souza, D. Single-cell RNA sequencing and its applications in the study of psychiatric disorders. Biol. Psychiatry Glob. Open Sci. 2023, 3, 329–339.
- Forte, E.; Skelly, D.A.; Chen, M.; Daigle, S.; Morelli, K.A.; Hon, O.; Philip, V.M.; Costa, M.W.; Rosenthal, N.A.; Furtado, M.B. Dynamic interstitial cell response during myocardial infarction predicts resilience to rupture in genetically diverse mice. Cell Rep. 2020, 30, 3149–3163.e3146.
- Lamarthée, B.; Callemeyn, J.; Van Herck, Y.; Antoranz, A.; Anglicheau, D.; Boada, P.; Becker, J.U.; Debyser, T.; De Smet, F.; De Vusser, K. Transcriptional and spatial profiling of the kidney allograft unravels a central role for FcγRIII+ innate immune cells in rejection. Nat. Commun. 2023, 14, 4359.
- Dong, M.; Thennavan, A.; Urrutia, E.; Li, Y.; Perou, C.M.; Zou, F.; Jiang, Y. SCDC: Bulk gene expression deconvolution by multiple single-cell RNA sequencing references. Brief. Bioinform. 2021, 22, 416–427.
- Qi, Z.; Liu, Y.; Mints, M.; Mullins, R.; Sample, R.; Law, T.; Barrett, T.; Mazul, A.L.; Jackson, R.S.; Kang, S.Y. Single-cell deconvolution of head and neck squamous cell carcinoma. Cancers 2021, 13, 1230.
- Wu, J.; Li, W.; Su, J.; Zheng, J.; Liang, Y.; Lin, J.; Xu, B.; Liu, Y. Integration of single-cell sequencing and bulk RNA-seq to identify and develop a prognostic signature related to colorectal cancer stem cells. Sci. Rep. 2024, 14, 12270.
- Aran, D.; Hu, Z.; Butte, A.J. xCell: Digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017, 18, 220.
- Nguyen, H.; Nguyen, H.; Tran, D.; Draghici, S.; Nguyen, T. Fourteen years of cellular deconvolution: Methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res. 2024, 52, 4761–4783.
- Jaakkola, M.K.; Elo, L.L. Computational deconvolution to estimate cell type-specific gene expression from bulk data. NAR Genom. Bioinform. 2021, 3, lqaa110.
- Jin, H.; Liu, Z. A benchmark for RNA-seq deconvolution analysis under dynamic testing environments. Genome Biol. 2021, 22, 102.
- Dietrich, A.; Merotto, L.; Pelz, K.; Eder, B.; Zackl, C.; Reinisch, K.; Edenhofer, F.; Marini, F.; Sturm, G.; List, M. Benchmarking second-generation methods for cell-type deconvolution of transcriptomic data. bioRxiv 2024.
- Fan, J.; Lyu, Y.; Zhang, Q.; Wang, X.; Li, M.; Xiao, R. MuSiC2: Cell-type deconvolution for multi-condition bulk RNA-seq data. Brief. Bioinform. 2022, 23, bbac430.
- Xu, X.; Li, R.; Mo, O.; Liu, K.; Li, J.; Hao, P. Cell-type deconvolution for bulk RNA-seq data using single-cell reference: A comparative analysis and recommendation guideline. Brief. Bioinform. 2025, 26, bbaf031.
- Huuki-Myers, L.A.; Montgomery, K.D.; Kwon, S.H.; Cinquemani, S.; Eagles, N.J.; Gonzalez-Padilla, D.; Maden, S.K.; Kleinman, J.E.; Hyde, T.M.; Hicks, S.C. Benchmark of cellular deconvolution methods using a multi-assay dataset from postmortem human prefrontal cortex. Genome Biol. 2025, 26, 88.
- Ivich, A.; Davidson, N.R.; Grieshober, L.; Li, W.; Hicks, S.C.; Doherty, J.A.; Greene, C.S. Missing cell types in single-cell references impact deconvolution of bulk data but are detectable. Genome Biol. 2025, 26, 86.
- Burkhardt, D.B.; San Juan, B.P.; Lock, J.G.; Krishnaswamy, S.; Chaffer, C.L. Mapping phenotypic plasticity upon the cancer cell state landscape using manifold learning. Cancer Discov. 2022, 12, 1847–1859.
- Song, L.; Sun, X.; Qi, T.; Yang, J. Mixed model-based deconvolution of cell-state abundances (MeDuSA) along a one-dimensional trajectory. Nat. Comput. Sci. 2023, 3, 630–643.
- Quinn, T.P.; Erb, I.; Gloor, G.; Notredame, C.; Richardson, M.F.; Crowley, T.M. A field guide for the compositional analysis of any-omics data. GigaScience 2019, 8, giz107.
- Xu, S.; Chen, D.; Wang, X.; Li, S. Robustness and resilience of computational deconvolution methods for bulk RNA sequencing data. Brief. Bioinform. 2025, 26, bbaf264.
- Wolfram-Schauerte, M.; Vogel, T.; Tuoken, H.; Fälth Savitski, M.; Simon, E.; Nieselt, K. Approaching the holistic transcriptome—Convolution and deconvolution in transcriptomics. Brief. Bioinform. 2025, 26, bbaf388.
- Vorperian, S.K.; Moufarrej, M.N.; Tabula Sapiens Consortium; Quake, S.R. Cell types of origin of the cell-free transcriptome. Nat. Biotechnol. 2022, 40, 855–861, Erratum in Nat. Biotechnol. 2022, 40, 974.
- Lotfollahi, M.; Naghipourfar, M.; Luecken, M.D.; Khajavi, M.; Büttner, M.; Wagenstetter, M.; Avsec, Ž.; Gayoso, A.; Yosef, N.; Interlandi, M. Mapping single-cell data to reference atlases by transfer learning. Nat. Biotechnol. 2022, 40, 121–130.
- Luecken, M.D.; Büttner, M.; Chaichoompu, K.; Danese, A.; Interlandi, M.; Müller, M.F.; Strobl, D.C.; Zappia, L.; Dugas, M.; Colomé-Tatché, M. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 2022, 19, 41–50.
- Satija, R.; Farrell, J.A.; Gennert, D.; Schier, A.F.; Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 2015, 33, 495–502.
- Wolf, F.A.; Angerer, P.; Theis, F.J. SCANPY: Large-scale single-cell gene expression data analysis. Genome Biol. 2018, 19, 15.
- Hao, Y.; Hao, S.; Andersen-Nissen, E.; Mauck, W.M.; Zheng, S.; Butler, A.; Lee, M.J.; Wilk, A.J.; Darby, C.; Zager, M. Integrated analysis of multimodal single-cell data. Cell 2021, 184, 3573–3587.e3529.
- Korsunsky, I.; Millard, N.; Fan, J.; Slowikowski, K.; Zhang, F.; Wei, K.; Baglaenko, Y.; Brenner, M.; Loh, P.-R.; Raychaudhuri, S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 2019, 16, 1289–1296.
- Welch, J.D.; Kozareva, V.; Ferreira, A.; Vanderburg, C.; Martin, C.; Macosko, E.Z. Single-cell multi-omic integration compares and contrasts features of brain cell identity. Cell 2019, 177, 1873–1887.e1817.
- Lopez, R.; Regier, J.; Cole, M.B.; Jordan, M.I.; Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 2018, 15, 1053–1058.
- Oh, K.; Yoo, Y.J.; Torre-Healy, L.A.; Rao, M.; Fassler, D.; Wang, P.; Caponegro, M.; Gao, M.; Kim, J.; Sasson, A. Coordinated single-cell tumor microenvironment dynamics reinforce pancreatic cancer subtype. Nat. Commun. 2023, 14, 5226.
- Aibar, S.; González-Blas, C.B.; Moerman, T.; Huynh-Thu, V.A.; Imrichova, H.; Hulselmans, G.; Rambow, F.; Marine, J.-C.; Geurts, P.; Aerts, J. SCENIC: Single-cell regulatory network inference and clustering. Nat. Methods 2017, 14, 1083–1086.
- Aran, D.; Looney, A.P.; Liu, L.; Wu, E.; Fong, V.; Hsu, A.; Chak, S.; Naikawadi, R.P.; Wolters, P.J.; Abate, A.R. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 2019, 20, 163–172.
- Domínguez Conde, C.; Xu, C.; Jarvis, L.B.; Rainbow, D.B.; Wells, S.B.; Gomes, T.; Howlett, S.; Suchanek, O.; Polanski, K.; King, H. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 2022, 376, eabl5197.
- Wolock, S.L.; Lopez, R.; Klein, A.M. Scrublet: Computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 2019, 8, 281–291.e289.
- Stuart, T.; Butler, A.; Hoffman, P.; Hafemeister, C.; Papalexi, E.; Mauck, W.M.; Hao, Y.; Stoeckius, M.; Smibert, P.; Satija, R. Comprehensive integration of single-cell data. Cell 2019, 177, 1888–1902.e1821.
- Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550.
- McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv 2018, arXiv:1802.03426.
- Zhang, Q.; Liu, Y.; Wang, X.; Zhang, C.; Hou, M.; Liu, Y. Integration of single-cell RNA sequencing and bulk RNA transcriptome sequencing reveals a heterogeneous immune landscape and pivotal cell subpopulations associated with colorectal cancer prognosis. Front. Immunol. 2023, 14, 1184167.




| Parameter | Bulk RNAseq | scRNAseq |
|---|---|---|
| Level of resolution | Tissue/cell population | Single cell |
| Sequencing depth | 10–30 million reads per sample [33] | 10,000–100,000 reads per cell [58,68] |
| Dynamic range of detection | Wide dynamic range; enables detection of low- and high-abundance transcripts [69] | Lower effective dynamic range at the single-cell level; limited by low mRNA capture efficiency and dropout events; no broadly accepted quantitative estimate [62,63] |
| Proportion of zero values in count matrix | Moderate (~10–40%) [70,71] | High (~80%) [64] |
| Resolution of cellular heterogeneity | Not resolved (signal averaging) [41,44] | Fully resolved [3,60] |
| Detection of rare cell populations | Limited [6] | Feasible given sufficient cell numbers [3,60] |
| Inter-laboratory reproducibility | High (correlation >0.9) [72] | Variable depending on protocol [73] |
| Sample requirements | Fresh, frozen or formalin-fixed paraffin-embedded (FFPE) tissue [74] | Primarily fresh or cryopreserved samples [13,14] |
| Relative cost | Moderate [75] | High [75] |
| Scalability for cohort studies | High (thousands of samples) [38,39,40,76,77] | Limited [75] |
| Major technical artifacts | Minimal | Amplification bias [62], dissociation-induced stress [65], loss of sensitive populations [13,78] |

| Method | Year | Algorithmic Approach | Main Advantages | Limitations |
|---|---|---|---|---|
| CIBERSORTx [6] | 2019 | ν-support vector regression (ν-SVR) with cross-platform normalization | Enables construction of custom signatures using scRNAseq data; reconstructs cell-type-specific expression profiles | High computational requirements; requires a predefined signature matrix |
| MuSiC [95] | 2019 | Weighted non-negative regression accounting for inter-individual and intra-cell-type variability | Accounts for sample-level and within-cell-type variability, improving accuracy, especially for closely related cell types | Requires scRNAseq reference profiles from multiple donors |
| SCDC [120] | 2021 | Integration of multiple scRNAseq reference datasets with optimized weighting | Improves accuracy through ensemble use of multiple reference datasets | Performance depends on consistency between reference datasets |
| Bisque [54] | 2020 | Gene-specific linear transformation with weighted least squares adjustment | Corrects systematic discrepancies between bulk RNAseq and scRNAseq data, improving gene-level accuracy | Optimal performance when paired bulk RNAseq and scRNAseq data are available |
| BayesPrism [7] | 2022 | Bayesian probabilistic model | Jointly estimates cell proportions and cell-type-specific expression profiles; accounts for within-cell-type variability and cross-platform differences | Computationally intensive; sensitive to incompleteness of the reference scRNAseq dataset |
| Kassandra [96] | 2022 | Gradient boosting using LightGBM trained on simulated transcriptomes | High accuracy and robustness; performs well with overlapping markers and complex tissues (e.g., tumors) | Performance depends on the predefined training cell-type panel |
| xCell [123] | 2017 | Marker gene signature-based enrichment scoring | Reduces artificial correlation between related cell types; does not require external reference datasets; provides relative enrichment estimates | Does not provide absolute cell proportions; limited to enrichment-based interpretation |
| Scaden [97] | 2020 | Deep neural network trained on synthetic bulk mixtures derived from scRNAseq | Captures nonlinear relationships; competitive performance across simulated and experimental datasets; robust to noise and technical bias in some settings | Reduced interpretability; depends on training data design and transferability across datasets |
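
Several of the regression-based methods in the table above share one core model: a bulk profile is treated as a non-negative mixture of cell-type signature profiles. The following is a minimal sketch of that idea only, written in plain Python as unweighted non-negative least squares solved by projected gradient descent. It is not the implementation of MuSiC, CIBERSORTx, or any other listed tool; the function names and the toy signature matrix are hypothetical, and in practice an established solver such as `scipy.optimize.nnls` would normally be used instead.

```python
"""Toy reference-based deconvolution: bulk ~ signature @ proportions."""


def nnls_projected_gradient(signature, bulk, n_iter=2000):
    """Minimise ||signature @ w - bulk||^2 subject to w >= 0.

    signature: list of gene rows, each holding one value per cell type.
    bulk: list with one expression value per gene (same gene order).
    """
    n_genes, n_types = len(signature), len(signature[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its reciprocal is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    w = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual per gene: predicted bulk minus observed bulk.
        resid = [
            sum(signature[g][k] * w[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        # Gradient of 0.5 * ||resid||^2 with respect to each weight.
        grad = [
            sum(signature[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, w[k] - step * grad[k]) for k in range(n_types)]
    return w


def to_proportions(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [x / total for x in weights]


# Hypothetical signature: 4 genes (rows) x 3 cell types (columns).
signature = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 0.0],
    [0.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_props = [0.5, 0.3, 0.2]
# Noise-free synthetic bulk sample built from the known proportions.
bulk = [sum(s * p for s, p in zip(row, true_props)) for row in signature]

estimated = to_proportions(nnls_projected_gradient(signature, bulk))
print([round(p, 3) for p in estimated])
```

On this noise-free toy mixture the estimate should match the simulated proportions closely. Real bulk data add cross-platform bias, donor variability, and missing cell types, which is what the weighting, transformation, and probabilistic schemes listed in the table are designed to address.
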

| Method | Typical Pearson r Range (Published Benchmarks) | Approx. Runtime (5000 Ref. Cells) | Bulk Input Format | scRNAseq Reference Requirement |
|---|---|---|---|---|
| CIBERSORTx [15] | r = 0.69–0.97 on simulated mixtures across dynamic conditions; performance declines with increasing tumor purity [126] | ~5 min [127]; Docker/web server required | Tab-delimited normalized expression matrix; Docker or web server | Predefined or custom scRNAseq signature matrix; cross-platform normalization built in |
| MuSiC [95] | Top-ranked on simulated and pseudobulk datasets [95,128]; performance variable on real bulk data (Pearson r < 0.4 in some real PBMC benchmarks [129]) | <30 s; fastest combined runtime [127]; no separate signature build step | R ExpressionSet (counts or CPM) | Multi-donor scRNAseq ExpressionSet; multi-subject reference strongly recommended |
| Bisque [54,130] | r = 0.92 on matched datasets [54]; cor = 0.48–0.68 on real brain tissue across RNAseq protocols [130]; drops substantially under strong cross-platform mismatch | <30 s [127]; no separate signature build step | R ExpressionSet | scRNAseq ExpressionSet; matched bulk+scRNAseq improves gene-specific transformation accuracy; ≥4 donors in reference recommended for stable performance |
| BayesPrism [7] | Top-ranked in heterogeneous simulation settings and for granular immune lineages in tumors [129]; correlation with ground truth >0.95 for malignant cell gene expression at >50% tumor purity | ~5 min (external benchmark [127]); scales substantially with dataset size; computationally intensive | R raw count matrix with cell-type labels | scRNAseq raw count matrix; sensitive to reference incompleteness |
| SCDC [120] | Pearson r = 0.99 on controlled cell-line mixtures [120]; improves MuSiC estimates when multiple independent references integrated via ENSEMBLE weighting | ~120 s for 5000 cells [127] | R ExpressionSet | Single or multiple scRNAseq ExpressionSets; ENSEMBLE framework requires ≥2 independent references |
| Scaden [97] | Reported as concordance correlation coefficient (CCC) rather than Pearson r: CCC = 0.88–0.98 on simulated data (average CCC = 0.88 on PBMC, CCC = 0.98 on pancreas); CCC = 0.56–0.92 on real bulk datasets (PBMC and brain) [97] | ~27 min total (training + data generation) [127]; prediction ~8 s; GPU reduces training ~3× | Python/CLI; count data in AnnData (.h5ad) format | scRNAseq data of the same target tissue required for simulation of training mixtures; tissue-specific model must be trained before each new application; no pre-trained universal model available |
| Kassandra [96] | r = 0.83–0.97 across original validation studies; superior accuracy vs. CIBERSORTx and Scaden in TME validation [96] | Not reported in independent benchmark studies | TPM-normalized bulk expression matrix; custom transcript filtering required | Pre-trained LightGBM ensemble; no user-supplied reference required; fixed cell-type panel |
| xCell [123] | Produces enrichment scores only, not comparable to proportion-based r metrics | Seconds; marker-based scoring, no regression step | FPKM/TPM-normalized bulk expression matrix; R package or web server | Not required; curated marker gene signatures for 64 immune and stromal cell types |
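
The table above mixes two agreement metrics, Pearson r and the concordance correlation coefficient (CCC), which are not interchangeable. As a hedged illustration, the sketch below computes both from population moments for a hypothetical cell type whose estimated fractions are shifted by a constant offset across four samples. The function names and the numbers are illustrative assumptions and are not taken from any benchmark cited here.

```python
from statistics import fmean


def _moments(x, y):
    """Return means, population variances and population covariance."""
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cxy = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return mx, my, vx, vy, cxy


def pearson_r(x, y):
    """Pearson correlation: insensitive to constant shifts and rescaling."""
    _, _, vx, vy, cxy = _moments(x, y)
    return cxy / (vx * vy) ** 0.5


def lin_ccc(x, y):
    """Lin's CCC: also penalises systematic offset and scale differences."""
    mx, my, vx, vy, cxy = _moments(x, y)
    return 2.0 * cxy / (vx + vy + (mx - my) ** 2)


# Hypothetical fractions of one cell type across four bulk samples.
true_fraction = [0.1, 0.2, 0.3, 0.4]
# Estimates with the correct ranking but a constant +0.2 bias.
estimated_fraction = [0.3, 0.4, 0.5, 0.6]

print(round(pearson_r(true_fraction, estimated_fraction), 3))
print(round(lin_ccc(true_fraction, estimated_fraction), 3))
```

In this constructed case Pearson r is essentially 1 while CCC is well below 1, because CCC additionally penalizes the systematic bias. A method reported by CCC can therefore look weaker than one reported by r even at a similar underlying accuracy.
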

| Tool | Algorithmic Framework | Main Relevance to Integrative Workflows | Key Considerations |
|---|---|---|---|
| Seurat [151] | PCA; k-nearest neighbor (kNN) graph; shared nearest neighbor (SNN) graph; graph-based clustering; anchor-based dataset integration; weighted nearest neighbors (WNN) for multimodal analysis | Preprocessing, clustering, batch correction, multimodal atlas construction, reference signature generation | Performance depends on parameter selection; memory-intensive for large datasets |
| Scanpy [141] | PCA; neighborhood graph construction; Louvain/Leiden clustering; scalable sparse matrix implementation | Large-scale preprocessing, dimensionality reduction, marker gene identification | Requires careful parameter tuning; multimodal support relies on additional modules |
| Harmony [143] | Iterative batch correction in low-dimensional embedding space | Removal of batch effects prior to joint dataset analysis | Risk of overcorrection if biological and batch effects are confounded |
| LIGER [144] | Integrative non-negative matrix factorization (iNMF) | Joint analysis of multiple datasets with separation of shared and dataset-specific factors | Requires optimization of factor number and regularization parameters |
| scVI [145] | Hierarchical Bayesian variational autoencoder; negative binomial modeling of UMI counts | Latent representation learning; batch correction; scalable integration of large scRNAseq datasets | Computationally demanding for very large datasets |
| SCENIC [147] | Gene regulatory network inference (GENIE3/GRNBoost) combined with regulon activity scoring (AUCell) | Biological interpretation of regulatory programs in annotated cell populations | Dependent on completeness of transcription factor annotations |
| SingleR [148] | Correlation-based iterative annotation using reference transcriptomic datasets | Automated cell-type annotation and reference mapping | Annotation accuracy depends on reference dataset quality |
| CellTypist [149] | Logistic regression classifier trained on immune reference atlas | Automated immune cell subtype annotation | Primarily optimized for immune cell populations |
| Scrublet [150] | Simulation-based doublet detection using kNN-based scoring | Identification and removal of technical doublets prior to downstream analysis | Reduced sensitivity in highly homogeneous datasets |
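
The "reference signature generation" step mentioned for the tools above can be illustrated with a minimal, library-free sketch: once cells have passed quality control and carry cell-type labels, one simple way to obtain a signature is to normalize each cell to counts per million (CPM) and average within each label. This is an assumption-laden toy version, not the procedure used by Seurat, Scanpy, or any deconvolution package; the function names, gene names, and counts are hypothetical.

```python
from collections import defaultdict


def cpm(counts):
    """Scale one cell's raw counts to counts per million.

    Assumes the cell has a non-zero total, i.e. empty barcodes
    were already removed during quality control.
    """
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def build_signature(cell_counts, cell_labels):
    """Return {cell_type: mean CPM profile} from annotated cells."""
    sums = {}
    n_cells = defaultdict(int)
    for counts, label in zip(cell_counts, cell_labels):
        profile = cpm(counts)
        if label not in sums:
            sums[label] = [0.0] * len(profile)
        # Accumulate the normalised profile for this cell type.
        sums[label] = [s + p for s, p in zip(sums[label], profile)]
        n_cells[label] += 1
    return {
        label: [s / n_cells[label] for s in total]
        for label, total in sums.items()
    }


# Hypothetical toy data: three cells, two genes.
genes = ["CD3E", "MS4A1"]
cell_counts = [[8, 2], [6, 4], [1, 9]]
cell_labels = ["T cell", "T cell", "B cell"]

signature = build_signature(cell_counts, cell_labels)
for cell_type, profile in signature.items():
    print(cell_type, dict(zip(genes, profile)))
```

Averaging per cell gives every cell equal weight regardless of sequencing depth. Summing raw counts per cell type before normalizing is an equally common alternative, and which choice suits a given deconvolution method depends on the input format that method expects.
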

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Golushko, N.; Buzdin, A. Integration of Bulk and Single-Cell RNA Sequencing Analyses in Biomedicine. Int. J. Mol. Sci. 2026, 27, 3334. https://doi.org/10.3390/ijms27073334

