Article

Discordant Genome Assemblies Drastically Alter the Interpretation of Single-Cell RNA Sequencing Data Which Can Be Mitigated by a Novel Integration Method

1 Burdon Sanderson Cardiac Science Centre, Department of Physiology, Anatomy & Genetics, University of Oxford, Oxford OX1 3PT, UK
2 Bioinfo, Plantagenet, ON K0B 1L0, Canada
3 Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO 65201, USA
4 Division of Cardiovascular Medicine, University of Oxford, Oxford OX3 9DU, UK
* Author to whom correspondence should be addressed.
Academic Editors: Tuhin Subhra Santra and Fan-Gang Tseng
Cells 2022, 11(4), 608; https://doi.org/10.3390/cells11040608
Received: 31 December 2021 / Revised: 27 January 2022 / Accepted: 7 February 2022 / Published: 10 February 2022
(This article belongs to the Special Issue Single Cell Analysis 2.0)
Advances in sequencing and assembly technology have led to the creation of genome assemblies for a wide variety of non-model organisms. The rapid production and proliferation of updated, novel assembly versions can create vexing problems when multiple genome assembly versions are available at once, requiring researchers to work with more than one reference genome. Multiple genome assemblies are especially problematic for researchers studying the genetic makeup of individual cells, as single-cell RNA sequencing (scRNAseq) requires sequenced reads to be mapped and aligned to a single reference genome. Using Astyanax mexicanus, this study highlights how the interpretation of a single-cell dataset from the same sample changes when it is aligned to each of the two available genome assemblies. We found that the numbers of cells and expressed genes detected differed drastically between the two assemblies. When the genome assemblies were used in isolation with their respective annotations, cell-type identification was confounded, as some classic cell-type markers were assembly-specific, whilst other genes showed differential patterns of expression between the two assemblies. To overcome the problems posed by multiple genome assemblies, we propose that researchers align to each available assembly and then integrate the resultant datasets to produce a final dataset in which all genome alignments can be used simultaneously. We found that this approach increased the accuracy of cell-type identification and maximised the amount of data that could be extracted from our single-cell sample by capturing all possible cells and transcripts. As scRNAseq becomes more widely available, it is imperative that the single-cell community is aware of how genome assembly alignment can alter single-cell data and their interpretation, especially when reviewing studies on non-model organisms.
Keywords: genome assembly; Astyanax mexicanus; integration; seurat; read alignment; non-model organisms; scRNAseq
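
The idea of aligning one sample to each assembly and then using both alignments together can be illustrated with a minimal Python sketch. It is not the authors' pipeline and it is not Seurat's integration procedure: it only shows the bookkeeping of comparing which cell barcodes and genes each alignment detects, and of keeping features from both alignments side by side. The function names, the `asmA`/`asmB` tags, the barcodes, and the gene labels are all hypothetical toy values.

```python
from typing import Dict, Set

# Toy count table: cell barcode -> {gene: count}. Purely illustrative.
Counts = Dict[str, Dict[str, int]]


def compare_alignments(a: Counts, b: Counts) -> Dict[str, Set[str]]:
    """Report cells and genes that are shared or specific to one alignment."""
    cells_a, cells_b = set(a), set(b)
    genes_a = {gene for row in a.values() for gene in row}
    genes_b = {gene for row in b.values() for gene in row}
    return {
        "shared_cells": cells_a & cells_b,
        "cells_only_a": cells_a - cells_b,
        "cells_only_b": cells_b - cells_a,
        "genes_only_a": genes_a - genes_b,
        "genes_only_b": genes_b - genes_a,
    }


def combine_alignments(a: Counts, b: Counts,
                       tag_a: str = "asmA", tag_b: str = "asmB") -> Counts:
    """Naive union: keep every cell, prefixing each gene with its assembly tag.

    This is a stand-in for a real integration step (assumption), chosen so
    that no cell or feature seen by either alignment is discarded.
    """
    combined: Counts = {}
    for tag, counts in ((tag_a, a), (tag_b, b)):
        for barcode, row in counts.items():
            cell = combined.setdefault(barcode, {})
            for gene, n in row.items():
                cell[f"{tag}:{gene}"] = n
    return combined


# Hypothetical data: the same sample aligned to two assemblies.
asm_a: Counts = {"AAAC": {"gene1": 5, "gene2": 1}, "AAAG": {"gene1": 2}}
asm_b: Counts = {"AAAC": {"gene1": 4, "gene3": 3}, "AAAT": {"gene3": 6}}

summary = compare_alignments(asm_a, asm_b)
combined = combine_alignments(asm_a, asm_b)
```

In this toy case each alignment recovers one cell and one gene that the other misses, and the combined table retains all three cells. A real analysis would replace the naive union with a proper integration method and would need to handle normalisation and batch effects between the two alignments.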

MDPI and ACS Style

Potts, H.G.; Lemieux, M.E.; Rice, E.S.; Warren, W.; Choudhury, R.P.; Mommersteeg, M.T.M. Discordant Genome Assemblies Drastically Alter the Interpretation of Single-Cell RNA Sequencing Data Which Can Be Mitigated by a Novel Integration Method. Cells 2022, 11, 608. https://doi.org/10.3390/cells11040608

AMA Style

Potts HG, Lemieux ME, Rice ES, Warren W, Choudhury RP, Mommersteeg MTM. Discordant Genome Assemblies Drastically Alter the Interpretation of Single-Cell RNA Sequencing Data Which Can Be Mitigated by a Novel Integration Method. Cells. 2022; 11(4):608. https://doi.org/10.3390/cells11040608

Chicago/Turabian Style

Potts, Helen G., Madeleine E. Lemieux, Edward S. Rice, Wesley Warren, Robin P. Choudhury, and Mathilda T. M. Mommersteeg. 2022. "Discordant Genome Assemblies Drastically Alter the Interpretation of Single-Cell RNA Sequencing Data Which Can Be Mitigated by a Novel Integration Method" Cells 11, no. 4: 608. https://doi.org/10.3390/cells11040608
