Open Access, Feature Paper, Article

Consensify: A Method for Generating Pseudohaploid Genome Sequences from Palaeogenomic Datasets with Reduced Error Rates

Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24–25, 14476 Potsdam, Germany
* Authors to whom correspondence should be addressed.
Present Address: School of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham NG11 8NS, UK.
Present Address: Natural History Museum of Potsdam, Breite Straße 11/13, 14467 Potsdam, Germany.
§ Present Address: Department of Genetics & Genome Biology, University of Leicester, Leicester LE1 7RH, UK.
Genes 2020, 11(1), 50; https://doi.org/10.3390/genes11010050
Received: 20 November 2019 / Revised: 24 December 2019 / Accepted: 27 December 2019 / Published: 2 January 2020
(This article belongs to the Section Technologies and Resources for Genetics)
A standard practice in palaeogenome analysis is the conversion of mapped short-read data into pseudohaploid sequences, frequently by selecting a single high-quality nucleotide at random from the stack of mapped reads. This controls for biases due to differential sequencing coverage, but it does not control for differential rates and types of sequencing error, which are frequently large and variable in datasets obtained from ancient samples. These errors have the potential to distort phylogenetic and population clustering analyses, and to mislead tests of admixture using D statistics. We introduce Consensify, a method for generating pseudohaploid sequences that controls for biases resulting from differential sequencing coverage while greatly reducing error rates. The error correction is derived directly from the data itself, without the requirement for additional genomic resources or simplifying assumptions such as contemporaneous sampling. For phylogenetic and population clustering analyses, we find that Consensify is less affected by artefacts than methods based on single-read sampling. For D statistics, Consensify is more resistant to false positives and appears to be less affected by biases resulting from different laboratory protocols than other frequently used methods. Although Consensify was developed with palaeogenomic data in mind, it is applicable to any low- to medium-coverage short-read dataset. We predict that Consensify will be a useful tool for future studies of palaeogenomes.
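For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of single-read pseudohaploid sampling: at each site, one sufficiently high-quality base is drawn at random from the stack of mapped reads. It is not the Consensify procedure, and it is not taken from the authors' software. The function names, the representation of a read stack as `(base, quality)` pairs, and the base-quality threshold of 30 are all assumptions made for illustration.

```python
import random
from typing import Optional, Sequence, Tuple

# One observation per read covering a site: (base, base quality).
# This representation is a hypothetical simplification of a pileup.
Observation = Tuple[str, int]

VALID_BASES = {"A", "C", "G", "T"}


def sample_pseudohaploid_base(
    stack: Sequence[Observation],
    min_base_quality: int = 30,  # assumed threshold, not from the article
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one high-quality base at random from a read stack, or 'N'."""
    rng = rng or random.Random()
    # Keep only unambiguous bases that pass the quality filter.
    candidates = [
        base.upper()
        for base, quality in stack
        if quality >= min_base_quality and base.upper() in VALID_BASES
    ]
    if not candidates:
        return "N"  # no usable read at this site
    # Each passing read is equally likely to be chosen, so the expected
    # call does not depend on how deeply the site was sequenced.
    return rng.choice(candidates)


def pseudohaploid_sequence(
    stacks: Sequence[Sequence[Observation]],
    min_base_quality: int = 30,
    seed: Optional[int] = None,
) -> str:
    """Build a pseudohaploid sequence, one sampled base per site."""
    rng = random.Random(seed)
    return "".join(
        sample_pseudohaploid_base(stack, min_base_quality, rng)
        for stack in stacks
    )
```

Because only one read is used per site, any sequencing error or damage-induced miscoding lesion in the sampled read passes directly into the output sequence, which is the weakness the abstract identifies in this approach.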
Keywords: palaeogenomics; ancient DNA; sequencing error; error reduction; D statistics; bioinformatics

MDPI and ACS Style

Barlow, A.; Hartmann, S.; Gonzalez, J.; Hofreiter, M.; Paijmans, J.L.A. Consensify: A Method for Generating Pseudohaploid Genome Sequences from Palaeogenomic Datasets with Reduced Error Rates. Genes 2020, 11, 50.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.