Open Access Article
Peer-Review Record

Parameterized Algorithms in Bioinformatics: An Overview

Algorithms 2019, 12(12), 256; https://doi.org/10.3390/a12120256
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Shilpa Garg
Received: 30 September 2019 / Revised: 11 November 2019 / Accepted: 19 November 2019 / Published: 1 December 2019
(This article belongs to the Special Issue New Frontiers in Parameterized Complexity and Algorithms)

Round 1

Reviewer 1 Report

The authors present a survey on parameterized algorithms in ("combinatorial") Bioinformatics.

The list (and the paper) is well structured and the state-of-the-art results are concisely but clearly presented. The authors also highlight the open questions for each problem (or group of related problems). As such, the article can be a good reference point for computer scientists that study the parameterized complexity of problems arising in Bioinformatics.

I was only able to skim the survey due to the time constraints imposed by the editorial office, so I could not thoroughly check all the results. Among the results I already knew, I spotted a small imprecision: the Xor-Haplotyping problem is not known to be NP-hard (at least, [112] does not prove that result; it only gives the parameterized algorithm).

Author Response

see PDF

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript is a review of parameterized algorithms in Bioinformatics.

The paper consists of a list of known results on several problems in bioinformatics, therefore there is no original contribution, as is expected in a review paper.

I have a major concern: I cannot understand what the expected audience is. Usually a review targets the largest possible audience. But this survey is not suited for that, as it does not give sufficient information on fixed-parameter algorithms. In fact, I do not think that a reader with a general knowledge of algorithms would benefit from this paper, unless they have already read some introductory text on parameterized algorithms.

The paper does not describe any tool, so it is not appealing to biologists interested in computational aspects.

Finally, I doubt that it is attractive for researchers that are already active in fixed-parameter algorithms research, as the large majority of those results should be widely known in that subfield. Overall, this manuscript looks more like an encyclopedia chapter than a scholarly article.

Leaving this concern aside, I think a more detailed discussion of some aspects would improve the paper. Right now, the paper is just a list of results, without a clear description of how those results have been obtained. I give just one example, but there are many instances throughout the paper. Line 131: "It can also be solved in $k^{O(k^2)}$". Just a sketch of the argument, or the name of the technique used, would help. Too many times I have read "can be solved" or similar phrases, without any hint as to how it can be solved.

Another general observation is that the biological discussions of the problems lack citations. For example, on page 2, lines 34-52 do not have any citations. I understand that, in most cases, the citations to add are already in the bibliography, but this is the correct place to introduce such citations.

Some detailed comments follow.

In the introduction, the content of Section 3 precedes that of Sect. 2. Please swap them.

Line 16.  "[1,2]))" -> "[1,2])" 

Line 55. "most basic distance .. breakpoint". In my opinion, the most basic distance is the Hamming distance.

Line 63. "precise". Rearrangement distance is precise if we are measuring events on a certain time scale. A different time scale would call for a different distance.

Line 72. "mutli", "doucble"

Line 87. What is "optimal"? Usually optimality requires an optimization criterion, which I could not find.

Line 148. Fix leading full stop.

Line 152. "XP". Please define.

Line 187. DNA is uppercase

Line 194. "tupple"

Sect 3.2.2, def LCS. There is ",," in the last line.

Lines 244-246. I am pretty sure that most of those results have appeared in Mike Hallett's PhD thesis.

Line 253. "FPT in P". Please explain.

Line 309. "O^*". Please explain.

Line 314. Explain if the result holds for all cost-measures or only for some.

Lines 322-324. This paragraph is not related to fixed-parameter algorithms. Please remove it.

Caption Figure 7. The use of colors is not sufficient to distinguish the sets if you print the paper, or if you have a color-related vision disability. Please add another visual cue.

Line 351. "p conflict". The factor p makes sense only for the polyploid case, which has not been treated previously. I suggest introducing 2 conflict here and extending it to p conflict later in the paper.

Line 357. "is any position". In my opinion the statement is not consistent with the fact that gapless rows can contain dashes at the beginning/end of rows.

Line 372. SNIP -> SNP

Line 434. "insert size ... around 10^3". In haplotype assembly, you only care about loci where you actually expect SNPs. In fact, Illumina reads are essentially useless in this context; third-generation reads (PacBio or ONT) are used. Since almost all loci are not considered, the insert size is irrelevant.

Sect. 5. In my opinion, it is better to introduce and discuss trees before networks.

Line 490. "networks as phylogenies". Usually phylogenies are trees, not networks.

Sect. 5.2. You should also cite https://doi.org/10.1007/3-540-48194-X_23

Line 528 "sunflower-based kernel". An explanation is needed.

Line 548. O((8n)^t) and O((6n)^t) have a non-polynomial dependency on n, which defies the purpose of fixed-parameter analysis. Either remove those results, or discuss the fact that such time bounds are not usually considered fixed-parameter tractable.

Sect 5.2.3 def MAF, last line. You have a "\le" symbol which has not been defined.

Line 636. "MSOL". Please expand.

Lines 671-673. Besides [230], you should also cite https://doi.org/10.1016/j.tcs.2005.05.016 and https://doi.org/10.1137/S0097539798343362, which started the study of the reconciliation problem.

Lines 712-718. That part is unclear. Please expand the discussion and/or use an example.

I have been puzzled by the lack of discussion on perfect phylogeny reconstruction, as it has been one of the first applications of fixed-parameter analysis in bioinformatics. Especially important are https://doi.org/10.1137/S0097539793244587 and https://doi.org/10.1137/S0097539794279067

Also, a discussion on character compatibility would be welcome. See https://doi.org/10.1007/978-3-642-21260-4_41 and https://doi.org/10.1007/978-3-642-02882-3_27

Author Response

see PDF

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors present a survey on parameterized algorithms for computational biology problems such as genome assembly, annotation, comparison, phasing and phylogenetics. FPT algorithms are important to the community for finding optimal solutions to biological problems. For the phasing problem, the formulation is presented as a minimum error correction problem: essentially, flip the minimum number of entries so that the reads can be partitioned into sets without any conflict. In a similar context, genome assembly is the process of assembling a genome from sequencing reads, which is formulated as a longest common subsequence problem on the overlaps of reads. This survey can serve as a good source for the bioinformatics community to review these concepts.
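
To make the minimum error correction (MEC) formulation mentioned above concrete, here is a minimal Python sketch for the diploid case. It is an illustration only, not one of the algorithms from the survey: it assumes reads are equal-length strings over '0', '1' and '-' (no coverage), and it tries every bipartition of the reads, so it runs in time exponential in the number of reads. The function names `column_cost` and `mec_brute_force` are hypothetical.

```python
from itertools import product


def column_cost(rows, j):
    """Fewest flips that make column j of `rows` conflict-free ('-' is ignored)."""
    zeros = sum(1 for r in rows if r[j] == "0")
    ones = sum(1 for r in rows if r[j] == "1")
    return min(zeros, ones)


def mec_brute_force(reads):
    """Minimum number of entry flips so the reads split into two conflict-free sets.

    Assumption: all reads have the same length. Every bipartition is tried,
    so this is only usable on very small inputs.
    """
    n = len(reads)
    if n == 0:
        return 0
    m = len(reads[0])
    best = None
    # Fix the first read on side 0 to avoid counting mirrored partitions twice.
    for rest in product((0, 1), repeat=n - 1):
        assignment = (0,) + rest
        cost = 0
        for side in (0, 1):
            part = [r for r, a in zip(reads, assignment) if a == side]
            cost += sum(column_cost(part, j) for j in range(m))
        if best is None or cost < best:
            best = cost
    return best
```

For example, `mec_brute_force(["000", "000", "111", "110"])` evaluates to 1: grouping the first two reads and the last two reads leaves a single conflict in the third column, and no bipartition is conflict-free.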

Comments:
1. The survey presented by the authors is interesting, involved, and correct as far as I can tell.
2. The paper is well written.
3. Garg et al. 2018 and 2019 provide FPT algorithms for phasing on graphs for single individuals and trios. These might be good results to incorporate in the paper.
4. Garg et al. ESA 2018 shows PTAS status for Gapless-MEC phasing instances. This is an important recent result that might be useful to include in the paper.
5. A final summary table of the best FPT algorithms and their running times for the problems presented in this survey might be helpful for readers.

Nevertheless, I think the survey is a good fit, and it deserves to be accepted.

Author Response

see PDF

Author Response File: Author Response.pdf
