Open Access Article
Int. J. Mol. Sci. 2014, 15(5), 8491-8508; doi:10.3390/ijms15058491

DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

1 Kolling Institute of Medical Research, Royal North Shore Hospital, Pacific Hwy, St Leonards, NSW 2065, Australia
2 Sydney Medical School, the University of Sydney, NSW 2006, Australia
3 University of Cambridge Metabolic Research Laboratories, Box 289, Level 4 Wellcome Trust-MRC Institute of Metabolic Science, Addenbrooke's Hospital, Hills Road, Cambridge CB2 0QQ, UK
* Author to whom correspondence should be addressed.
Received: 22 January 2014 / Revised: 28 March 2014 / Accepted: 4 May 2014 / Published: 13 May 2014
(This article belongs to the Collection Human Single Nucleotide Polymorphisms and Disease Diagnostics)
Abstract

Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV.
Keywords: DEFLATE; compression; Grantham; variation; sequence alignment; nsSNP
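
The abstract does not say how DEFLATE is applied to the alignment, so the following is only a minimal sketch of the general idea and not the authors' CompressGV implementation. It assumes that the variation in one MSA column can be approximated by how poorly that column compresses compared with a fully conserved column of the same length. The function names `deflate_size` and `column_complexity`, and the example columns, are hypothetical.

```python
import zlib


def deflate_size(data: bytes) -> int:
    """Return the byte length of the raw DEFLATE stream for `data`."""
    # wbits=-15 requests a raw DEFLATE stream with no zlib header or checksum.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return len(compressor.compress(data) + compressor.flush())


def column_complexity(column: str) -> float:
    """Compressed size of an alignment column relative to a fully conserved
    column of the same length (1.0 for a fully conserved column)."""
    if not column:
        raise ValueError("alignment column must not be empty")
    observed = deflate_size(column.encode("ascii"))
    baseline = deflate_size((column[0] * len(column)).encode("ascii"))
    return observed / baseline


# Hypothetical columns, one residue per aligned species (assumed data).
conserved = "A" * 40
variable = "ARNDCQEGHILKMFPSTWYV" * 2

print(column_complexity(conserved))
print(column_complexity(variable))
```

Dividing by the size of a conserved column of equal length is one possible way to keep the measure from growing simply because more sequences were added to the alignment; whether the published method normalises in this way is an assumption here.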
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).

MDPI and ACS Style

Schlosberg, A.; Lam, B.Y.H.; Yeo, G.S.H.; Clifton-Bligh, R.J. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification. Int. J. Mol. Sci. 2014, 15, 8491-8508.

Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.