Open Access Article
Algorithms 2013, 6(2), 352-370; doi:10.3390/a6020352

Filtering Degenerate Patterns with Application to Protein Sequence Analysis

1 Department of Information Engineering, University of Padova, Padova 35131, Italy
2 Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
* Author to whom correspondence should be addressed.
Received: 29 March 2013 / Revised: 30 April 2013 / Accepted: 3 May 2013 / Published: 22 May 2013
(This article belongs to the Special Issue Algorithms for Sequence Analysis and Storage)

Abstract

In biology, the notion of a degenerate pattern plays a central role in describing various phenomena. For example, protein active site patterns, like those contained in the PROSITE database, e.g., [FY]DPC[LIM][ASG]C[ASG], are, in general, represented by degenerate patterns with character classes. Researchers have developed several approaches over the years to discover degenerate patterns. Although these methods have been exhaustively and successfully tested on genomes and proteins, their output often far exceeds the size of the original input, making it hard to manage and to interpret in refined analyses that require manual inspection. In this paper, we discuss a characterization of degenerate patterns with character classes, without gaps, and we introduce the concept of pattern priority for comparing and ranking different patterns. We define the class of underlying patterns for filtering any set of degenerate patterns into a new set whose size is linear in the size of the input sequence. We present some preliminary results on the detection of subtle signals in protein families. The results show that our approach drastically reduces the number of patterns output by a protein analysis tool, while retaining the representative patterns.
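To make the notion of a gap-free degenerate pattern with character classes concrete, the following minimal Python sketch locates the occurrences of such a pattern in a sequence. It is an illustration only and not the filtering method of the paper: the helper names `parse_pattern` and `occurrences` and the toy sequence are assumptions introduced here, and the pattern priority and underlying patterns defined in the article are not modeled.

```python
def parse_pattern(pattern):
    """Split a gap-free pattern such as '[FY]DPC' into one set of
    allowed residues per position (hypothetical helper)."""
    classes = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            # A bracketed character class: any one of the listed residues.
            end = pattern.index("]", i)
            classes.append(set(pattern[i + 1:end]))
            i = end + 1
        else:
            # A solid character is a class containing a single residue.
            classes.append({pattern[i]})
            i += 1
    return classes


def occurrences(pattern, sequence):
    """Return the 0-based start positions where every position of the
    pattern matches the sequence (naive scan, hypothetical helper)."""
    classes = parse_pattern(pattern)
    m = len(classes)
    return [
        start
        for start in range(len(sequence) - m + 1)
        if all(sequence[start + k] in classes[k] for k in range(m))
    ]


if __name__ == "__main__":
    # Made-up toy sequence, not real protein data.
    print(occurrences("[FY]DPC[LIM][ASG]C[ASG]", "MKYDPCLACGTTFDPCIGCA"))
```

On the toy sequence, the pattern from the abstract occurs at positions 2 and 12: once starting with Y and once starting with F. The same pattern can therefore describe occurrences that differ in several residues, which is why pattern discovery tools can report far more patterns than the input has characters.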
Keywords: pattern discovery and filtering; degenerate patterns; analysis of biological data
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).

Cite This Article

MDPI and ACS Style

Comin, M.; Verzotto, D. Filtering Degenerate Patterns with Application to Protein Sequence Analysis. Algorithms 2013, 6, 352-370.

Algorithms EISSN 1999-4893 Published by MDPI AG, Basel, Switzerland