Open Access Article
Algorithms 2013, 6(4), 636-677; doi:10.3390/a6040636

Sublinear Time Motif Discovery from Multiple Sequences

Department of Computer Science, University of Texas-Pan American, 1201 W University Dr., Edinburg, TX 78539, USA
* Author to whom correspondence should be addressed.
Received: 11 June 2013 / Revised: 30 September 2013 / Accepted: 1 October 2013 / Published: 14 October 2013
(This article belongs to the Special Issue Algorithms for Sequence Analysis and Storage)

Abstract

A natural probabilistic model for motif discovery has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet, Σ. A motif G = g1g2 ... gm is a string of m characters. A probabilistically generated approximate copy of G is implanted in each background sequence. In such a copy b1b2 ... bm of G, every character bi is generated probabilistically, such that the probability that bi ≠ gi is at most α. We develop two new randomized algorithms and one new deterministic algorithm for this model. They advance on earlier work in the following respects: (1) The algorithms are much faster than previous ones and can even run in sublinear time. (2) They can handle any motif pattern. (3) The only restriction on the alphabet size is a lower bound of four, which gives the algorithms potential applications in practical problems, since gene sequences have an alphabet size of four. (4) All algorithms come with rigorous proofs of their performance. The methods developed in this paper have been used in a software implementation, and we observed encouraging results that show improved performance for motif detection compared with other software.
Keywords: motif discovery; sublinear time; randomized algorithm; deterministic algorithm
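
The probabilistic model described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' software: the function name `implant_motif`, the uniform background distribution, the uniformly chosen implant position, the choice to overwrite background characters rather than insert, and the use of a mutation probability of exactly α (the model only requires at most α) are all assumptions made for illustration.

```python
import random


def implant_motif(k, n, motif, alpha, alphabet="ACGT", seed=None):
    """Generate k sequences of length n, each with a noisy copy of `motif`.

    Hypothetical helper for illustration only. Assumptions:
    - background characters are drawn uniformly from `alphabet`;
    - each motif character is replaced by a different character
      independently with probability exactly `alpha`;
    - the copy overwrites the background at a uniformly random position.
    """
    rng = random.Random(seed)
    m = len(motif)
    if m > n:
        raise ValueError("motif must not be longer than the sequence")

    sequences, positions = [], []
    for _ in range(k):
        # Random background sequence over the alphabet.
        background = [rng.choice(alphabet) for _ in range(n)]

        # Approximate copy b1...bm of G: bi differs from gi with prob. alpha.
        copy = []
        for g in motif:
            if rng.random() < alpha:
                copy.append(rng.choice([c for c in alphabet if c != g]))
            else:
                copy.append(g)

        # Implant the copy at a random start position.
        start = rng.randrange(n - m + 1)
        background[start:start + m] = copy

        sequences.append("".join(background))
        positions.append(start)
    return sequences, positions


if __name__ == "__main__":
    seqs, starts = implant_motif(k=5, n=40, motif="ACGTTGCA", alpha=0.2, seed=0)
    for s, p in zip(seqs, starts):
        print(p, s)
```

A motif discovery program would receive only the sequences and would have to recover G without knowing the implant positions; the positions are returned here so that a recovered motif can be checked against the ground truth.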
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).


MDPI and ACS Style

Fu, B.; Fu, Y.; Xue, Y. Sublinear Time Motif Discovery from Multiple Sequences. Algorithms 2013, 6, 636-677.


Algorithms, EISSN 1999-4893, published by MDPI AG, Basel, Switzerland