Open Access Article
Algorithms 2018, 11(1), 3; https://doi.org/10.3390/a11010003

Analytic Combinatorics for Computing Seeding Probabilities

1 Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
2 University Pompeu Fabra, Dr. Aiguader 80, Barcelona 08003, Spain
Received: 12 November 2017 / Revised: 7 January 2018 / Accepted: 8 January 2018 / Published: 10 January 2018
(This article belongs to the Special Issue Bioinformatics Algorithms and Applications)

Abstract

Seeding heuristics are the most widely used strategies to speed up sequence alignment in bioinformatics. Such strategies are most successful if they are calibrated, so that the speed-versus-accuracy trade-off can be properly tuned. In the common case of read mapping, it has so far been impossible to predict the success rate of competing seeding strategies, for lack of a theoretical framework. Here, we present an approach to estimate such quantities based on the theory of analytic combinatorics. The strategy is to specify a combinatorial construction of the reads where the seeding heuristic fails, translate this specification into a generating function using formal rules, and finally extract the probabilities of interest from the singularities of the generating function. The generating function can also be used to set up a simple recurrence to compute the probabilities with greater precision. We use this approach to construct simple estimators of the success rate of the seeding heuristic under different types of sequencing errors, and we show that the estimates are accurate in practical situations. More generally, this work shows novel strategies based on analytic combinatorics to compute probabilities of interest in bioinformatics.
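The three steps named in the abstract (combinatorial construction, generating function, singularity analysis or recurrence) can be illustrated on a deliberately simplified model. The sketch below is an assumption made for illustration, not the estimator derived in the article: each nucleotide of a read of length k is an independent substitution error with probability p, and seeding fails when the read contains no error-free stretch of at least gamma nucleotides. Such a read is a sequence of blocks (fewer than gamma matches followed by one error) and a tail of fewer than gamma matches, which gives the weighted generating function F(z) = N(z) / (1 - p z N(z)), with N(z) = 1 + qz + ... + (qz)^(gamma-1) and q = 1 - p. The function names `seed_failure_exact` and `seed_failure_asymptotic` are hypothetical.

```python
def seed_failure_exact(k, gamma, p):
    """Probability that a read of length k has no error-free stretch of
    length >= gamma, assuming independent errors with probability p.
    Uses the recurrence read off from F(z) * D(z) = N(z)."""
    q = 1.0 - p
    a = []  # a[n] = failure probability for a read of length n
    for n in range(k + 1):
        # Read made of n matches only: it fails to seed only if n < gamma.
        total = q ** n if n < gamma else 0.0
        # Otherwise the first error comes after j < gamma matches, and the
        # remaining n - 1 - j positions again form a read without a seed.
        for j in range(min(gamma, n)):
            total += p * q ** j * a[n - 1 - j]
        a.append(total)
    return a[k]


def seed_failure_asymptotic(k, gamma, p):
    """First-order estimate from the dominant singularity z1 of F(z):
    a_k ~ -N(z1) / D'(z1) / z1**(k + 1), with D(z) = 1 - p*z*N(z)."""
    q = 1.0 - p

    def N(z):
        return sum((q * z) ** j for j in range(gamma))

    def D(z):
        return 1.0 - p * z * N(z)

    # D decreases on z > 0, with D(1) = q**gamma > 0 and D(1/p) <= 0,
    # so its unique positive root lies in [1, 1/p]; find it by bisection.
    lo, hi = 1.0, 1.0 / p
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if D(mid) > 0:
            lo = mid
        else:
            hi = mid
    z1 = 0.5 * (lo + hi)
    # Derivative of D at z1: D'(z) = -p * sum_j (j + 1) * (q*z)**j.
    dD = -p * sum((j + 1) * (q * z1) ** j for j in range(gamma))
    return -N(z1) / dD / z1 ** (k + 1)


if __name__ == "__main__":
    k, gamma, p = 100, 17, 0.01
    print(seed_failure_exact(k, gamma, p))
    print(seed_failure_asymptotic(k, gamma, p))
```

Under this toy model the recurrence gives the exact value for any read length, while the singularity-based estimate is a closed-form approximation whose relative error shrinks geometrically as k grows. Both require 0 < p < 1.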
Keywords: analytic combinatorics; bioinformatics; seeding sequence alignment; generating functions
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
MDPI and ACS Style

Filion, G.J. Analytic Combinatorics for Computing Seeding Probabilities. Algorithms 2018, 11, 3.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Algorithms, EISSN 1999-4893, published by MDPI AG, Basel, Switzerland.