Open Access Article
Int. J. Mol. Sci. 2014, 15(5), 8491-8508; https://doi.org/10.3390/ijms15058491

DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

1 Kolling Institute of Medical Research, Royal North Shore Hospital, Pacific Hwy, St Leonards, NSW 2065, Australia
2 Sydney Medical School, the University of Sydney, NSW 2006, Australia
3 University of Cambridge Metabolic Research Laboratories, Box 289, Level 4 Wellcome Trust-MRC Institute of Metabolic Science, Addenbrooke's Hospital, Hills Road, Cambridge CB2 0QQ, UK
* Author to whom correspondence should be addressed.
Received: 22 January 2014 / Revised: 28 March 2014 / Accepted: 4 May 2014 / Published: 13 May 2014
(This article belongs to the Collection Human Single Nucleotide Polymorphisms and Disease Diagnostics)

Abstract

Improvements in the speed and cost of genome sequencing are yielding increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible, and analysis of familial co-segregation with disease is not always possible. In silico methods for classification or triage are thus utilised. Align-GVGD, a popular tool based on multiple-species sequence alignments (MSAs) and the work of Grantham, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV.
Keywords: DEFLATE; compression; Grantham; variation; sequence alignment; nsSNP
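
The abstract does not specify how DEFLATE is combined with the Grantham measure, so the following is only a minimal sketch of the general idea that compressed size can act as a proxy for variation in an alignment column. It is not the CompressGV implementation. The function names `deflate_size` and `column_complexity` are hypothetical, and the two example columns are made up for illustration rather than taken from any real alignment.

```python
import zlib


def deflate_size(text: str) -> int:
    """Return the size in bytes of `text` after raw DEFLATE compression."""
    # A negative wbits value requests a raw DEFLATE stream (no zlib header
    # or checksum), so the byte count reflects the compressed data only.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return len(compressor.compress(text.encode("ascii")) + compressor.flush())


def column_complexity(column: str) -> float:
    """Hypothetical score: compressed bytes per residue in an MSA column.

    Lower values indicate a more repetitive (more conserved) column.
    """
    if not column:
        raise ValueError("column must contain at least one residue")
    return deflate_size(column) / len(column)


# Illustrative columns of 60 residues each (one residue per species).
conserved = "A" * 60
variable = (
    "ARNDCQEGHILKMFPSTWYV"
    "VYWTSPFMKLIHGEQCDNRA"
    "AVRYNWDTCSQPEFGMHKIL"
)

if __name__ == "__main__":
    print("conserved:", column_complexity(conserved))
    print("variable: ", column_complexity(variable))
```

A fully conserved column compresses to far fewer bytes than a column containing many different residues, which is the property such a correction would rely on. How this quantity would be scaled or combined with Grantham distances is left open here, because the abstract does not describe it.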

This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).


MDPI and ACS Style

Schlosberg, A.; Lam, B.Y.H.; Yeo, G.S.H.; Clifton-Bligh, R.J. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification. Int. J. Mol. Sci. 2014, 15, 8491-8508.

Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.