Algorithms
http://www.mdpi.com/journal/algorithms
Latest open access articles published in Algorithms at http://www.mdpi.com/journal/algorithms<![CDATA[Algorithms, Vol. 7, Pages 203-205: Editorial: Special Issue on Matching under Preferences]]>
http://www.mdpi.com/1999-4893/7/2/203
This special issue of Algorithms is devoted to the study of matching problems involving ordinal preferences from the standpoint of algorithms and complexity.Algorithms2014-04-0872Editorial10.3390/a70202032032051999-48932014-04-08doi: 10.3390/a7020203Péter BiróDavid Manlove<![CDATA[Algorithms, Vol. 7, Pages 189-202: Faster and Simpler Approximation of Stable Matchings]]>
http://www.mdpi.com/1999-4893/7/2/189
We give a 3/2-approximation algorithm for finding stable matchings that runs in O(m) time. The previous most well-known algorithm, by McDermid, has the same approximation ratio but runs in O(n^{3/2} m) time, where n denotes the number of people and m is the total length of the preference lists in a given instance. In addition, the algorithm and the analysis are much simpler. We also give the extension of the algorithm for computing stable many-to-many matchings.Algorithms2014-04-0472Article10.3390/a70201891892021999-48932014-04-04doi: 10.3390/a7020189Katarzyna Paluch<![CDATA[Algorithms, Vol. 7, Pages 188: Correction: Pareto Optimization or Cascaded Weighted Sum: A Comparison of Concepts. Algorithms 2014, 7, 166–185]]>
http://www.mdpi.com/1999-4893/7/2/188
It has come to our attention that due to an error in producing the PDF version of the paper [1], doi:10.3390/a7010166, website: http://www.mdpi.com/1999-4893/7/1/166, Figures 1 and 9 are displayed incorrectly.Algorithms2014-04-0272Correction10.3390/a70201881881881999-48932014-04-02doi: 10.3390/a7020188Kazuo Iwama<![CDATA[Algorithms, Vol. 7, Pages 186-187: Editorial: Special Issue on Algorithms for Sequence Analysis and Storage]]>
http://www.mdpi.com/1999-4893/7/1/186
This special issue of Algorithms is dedicated to approaches to biological sequence analysis that have algorithmic novelty and potential for fundamental impact in methods used for genome research.Algorithms2014-03-2571Editorial10.3390/a70101861861871999-48932014-03-25doi: 10.3390/a7010186Veli Mäkinen<![CDATA[Algorithms, Vol. 7, Pages 166-185: Pareto Optimization or Cascaded Weighted Sum: A Comparison of Concepts]]>
http://www.mdpi.com/1999-4893/7/1/166
Looking at articles or conference papers published since the turn of the century, Pareto optimization is the dominating assessment method for multi-objective nonlinear optimization problems. However, is it always the method of choice for real-world applications, where either more than four objectives have to be considered, or the same type of task is repeated again and again with only minor modifications, in an automated optimization or planning process? This paper presents a classification of application scenarios and compares the Pareto approach with an extended version of the weighted sum, called cascaded weighted sum, for the different scenarios. Its range of application within the field of multi-objective optimization is discussed as well as its strengths and weaknesses.Algorithms2014-03-2171Article10.3390/a70101661661851999-48932014-03-21doi: 10.3390/a7010166Wilfried JakobChristian Blume<![CDATA[Algorithms, Vol. 7, Pages 145-165: The Minimum Scheduling Time for Convergecast in Wireless Sensor Networks]]>
http://www.mdpi.com/1999-4893/7/1/145
We study the scheduling problem for data collection from sensor nodes to the sink node in wireless sensor networks, also referred to as the convergecast problem. The convergecast problem in general network topology has been proven to be NP-hard. In this paper, we propose our heuristic algorithm (finding the minimum scheduling time for convergecast (FMSTC)) for general network topology and evaluate the performance by simulation. The results of the simulation showed that the number of time slots to reach the sink node decreased with an increase in the power. We compared the performance of the proposed algorithm to the optimal time slots in a linear network topology. The proposed algorithm for convergecast in a general network topology has 2.27 times more time slots than that of a linear network topology. To the best of our knowledge, the proposed method is the first attempt to apply the optimal algorithm in a linear network topology to a general network topology.Algorithms2014-03-1771Article10.3390/a70101451451651999-48932014-03-17doi: 10.3390/a7010145Changyong(Andrew) JungSuk LeeVijay Bhuse<![CDATA[Algorithms, Vol. 7, Pages 62-144: Modeling Dynamic Programming Problems over Sequences and Trees with Inverse Coupled Rewrite Systems]]>
http://www.mdpi.com/1999-4893/7/1/62
Dynamic programming is a classical algorithmic paradigm, which often allows the evaluation of a search space of exponential size in polynomial time. Recursive problem decomposition, tabulation of intermediate results for re-use, and Bellman’s Principle of Optimality are its well-understood ingredients. However, algorithms often lack abstraction and are difficult to implement, tedious to debug, and delicate to modify. The present article proposes a generic framework for specifying dynamic programming problems. This framework can handle all kinds of sequential inputs, as well as tree-structured data. Biosequence analysis, document processing, molecular structure analysis, comparison of objects assembled in a hierarchic fashion, and generally, all domains come under consideration where strings and ordered, rooted trees serve as natural data representations. The new approach introduces inverse coupled rewrite systems. They describe the solutions of combinatorial optimization problems as the inverse image of a term rewrite relation that reduces problem solutions to problem inputs. This specification leads to concise yet translucent specifications of dynamic programming algorithms. Their actual implementation may be challenging, but eventually, as we hope, it can be produced automatically. The present article demonstrates the scope of this new approach by describing a diverse set of dynamic programming problems which arise in the domain of computational biology, with examples in biosequence and molecular structure analysis.Algorithms2014-03-0771Article10.3390/a7010062621441999-48932014-03-07doi: 10.3390/a7010062Robert GiegerichHélène Touzet<![CDATA[Algorithms, Vol. 7, Pages 60-61: Acknowledgement to Reviewers of Algorithms in 2013]]>
http://www.mdpi.com/1999-4893/7/1/60
The editors of Algorithms would like to express their sincere gratitude to the following reviewers for assessing manuscripts in 2013.Algorithms2014-02-2571Editorial10.3390/a701006060611999-48932014-02-25doi: 10.3390/a7010060 Algorithms Editorial Office<![CDATA[Algorithms, Vol. 7, Pages 32-59: Choice Function-Based Two-Sided Markets: Stability, Lattice Property, Path Independence and Algorithms]]>
http://www.mdpi.com/1999-4893/7/1/32
We build an abstract model, closely related to the stable marriage problem and motivated by Hungarian college admissions. We study different stability notions and show that an extension of the lattice property of stable marriages holds in these more general settings, even if the choice function on one side is not path independent. We lean on Tarski’s fixed point theorem and the substitutability property of choice functions. The main virtue of the work is that it exhibits practical, interesting examples, where non-path independent choice functions play a role, and proves various stability-related results.Algorithms2014-02-1471Article10.3390/a701003232591999-48932014-02-14doi: 10.3390/a7010032Tamás FleinerZsuzsanna Jankó<![CDATA[Algorithms, Vol. 7, Pages 15-31: Bio-Inspired Meta-Heuristics for Emergency Transportation Problems]]>
http://www.mdpi.com/1999-4893/7/1/15
Emergency transportation plays a vital role in the success of disaster rescue and relief operations, but its planning and scheduling often involve complex objectives and search spaces. In this paper, we conduct a survey of recent advances in bio-inspired meta-heuristics, including genetic algorithms (GA), particle swarm optimization (PSO), ant colony optimization (ACO), etc., for solving emergency transportation problems. We then propose a new hybrid biogeography-based optimization (BBO) algorithm, which outperforms some state-of-the-art heuristics on a typical transportation planning problem.Algorithms2014-02-1171Article10.3390/a701001515311999-48932014-02-11doi: 10.3390/a7010015Min-Xia ZhangBei ZhangYu-Jun Zheng<![CDATA[Algorithms, Vol. 7, Pages 1-14: On Stable Matchings and Flows]]>
http://www.mdpi.com/1999-4893/7/1/1
We describe a flow model related to ordinary network flows the same way as stable matchings are related to maximum matchings in bipartite graphs. We prove that there always exists a stable flow and generalize the lattice structure of stable marriages to stable flows. Our main tool is a straightforward reduction of the stable flow problem to stable allocations. For the sake of completeness, we prove the results we need on stable allocations as an application of Tarski’s fixed point theorem.Algorithms2014-01-2271Article10.3390/a70100011141999-48932014-01-22doi: 10.3390/a7010001Tamás Fleiner<![CDATA[Algorithms, Vol. 6, Pages 871-882: Sparse Signal Recovery from Fixed Low-Rank Subspace via Compressive Measurement]]>
http://www.mdpi.com/1999-4893/6/4/871
This paper designs and evaluates a variant of the CoSaMP algorithm for recovering the sparse signal s from the compressive measurement v = A(Uw+s), given a fixed low-rank subspace spanned by U. Instead of first recovering the full vector and then separating the sparse part from the structured dense part, the proposed algorithm works directly on the compressive measurement to do the separation. We investigate the performance of the algorithm on both simulated data and video compressive sensing. The results show that for a fixed low-rank subspace and a truly sparse signal, the proposed algorithm can successfully recover the signal from only a few compressive sensing (CS) measurements, and it performs better than ordinary CoSaMP when the sparse signal is corrupted by additional Gaussian noise.Algorithms2013-12-1764Article10.3390/a60408718718821999-48932013-12-17doi: 10.3390/a6040871Jun HeMing-Wei GaoLei ZhangHao Wu<![CDATA[Algorithms, Vol. 6, Pages 857-870: Solving Matrix Equations on Multi-Core and Many-Core Architectures]]>
http://www.mdpi.com/1999-4893/6/4/857
We address the numerical solution of Lyapunov, algebraic and differential Riccati equations, via the matrix sign function, on platforms equipped with general-purpose multicore processors and, optionally, one or more graphics processing units (GPUs). In particular, we review the solvers for these equations, as well as the underlying methods, analyze their concurrency and scalability and provide details on their parallel implementation. Our experimental results show that this class of hardware provides sufficient computational power to tackle large-scale problems, which only a few years ago would have required a cluster of computers.Algorithms2013-11-2564Article10.3390/a60408578578701999-48932013-11-25doi: 10.3390/a6040857Peter BennerPablo EzzattiHermann MenaEnrique Quintana-OrtíAlfredo Remón<![CDATA[Algorithms, Vol. 6, Pages 824-856: Overlays with Preferences: Distributed, Adaptive Approximation Algorithms for Matching with Preference Lists]]>
http://www.mdpi.com/1999-4893/6/4/824
A key property of overlay networks is the overlay nodes’ ability to establish connections (or be matched) to other nodes by preference, based on some suitability metric related to, e.g., the node’s distance, interests, recommendations, transaction history or available resources. When there are no preference cycles among the nodes, a stable matching exists in which nodes have maximized individual satisfaction, due to their choices; however, no such guarantees are currently given in the generic case. In this work, we employ the notion of node satisfaction to suggest a novel model for matching problems, suitable for overlay networks. We start by presenting a simple, yet powerful, distributed algorithm that solves the many-to-many matching problem with preferences. It achieves that by using local information and aggregate satisfaction as an optimization metric, while providing a guaranteed convergence and approximation ratio. Subsequently, we show how to extend the algorithm in order to support and adapt to changes in the nodes’ connectivity and preferences. In addition, we provide a detailed experimental study that focuses on the levels of achieved satisfaction, as well as convergence and reconvergence speed.Algorithms2013-11-1964Article10.3390/a60408248248561999-48932013-11-19doi: 10.3390/a6040824Giorgos GeorgiadisMarina Papatriantafilou<![CDATA[Algorithms, Vol. 6, Pages 805-823: PMS6MC: A Multicore Algorithm for Motif Discovery]]>
http://www.mdpi.com/1999-4893/6/4/805
We develop an efficient multicore algorithm, PMS6MC, for the (l, d)-motif discovery problem in which we are to find all strings of length l that appear in every string of a given set of strings with at most d mismatches. PMS6MC is based on PMS6, which is currently the fastest single-core algorithm for motif discovery in large instances. The speedup, relative to PMS6, attained by our multicore algorithm ranges from a high of 6.62 for the (17,6) challenging instances to a low of 2.75 for the (13,4) challenging instances on an Intel 6-core system. We estimate that PMS6MC is 2 to 4 times faster than other parallel algorithms for motif search on large instances.Algorithms2013-11-1864Article10.3390/a60408058058231999-48932013-11-18doi: 10.3390/a6040805Shibdas BandyopadhyaySartaj SahniSanguthevar Rajasekaran<![CDATA[Algorithms, Vol. 6, Pages 782-804: Stability, Optimality and Manipulation in Matching Problems with Weighted Preferences]]>
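As an illustration for the PMS6MC entry (doi: 10.3390/a6040805), the (l, d)-motif problem itself can be stated in a few lines of Python. This is a brute-force sketch meant only to make the problem definition concrete. It is not PMS6 or PMS6MC, it is exponential in l, and the function names and the default alphabet "acgt" are assumptions made for this example.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, sequence, d):
    """True if some window of `sequence` is within d mismatches of `candidate`."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def ld_motifs(sequences, l, d, alphabet="acgt"):
    """All length-l strings that appear in every sequence with at most d mismatches.

    Enumerates all len(alphabet)**l candidates, so it is only usable for tiny l.
    """
    motifs = set()
    for chars in product(alphabet, repeat=l):
        candidate = "".join(chars)
        if all(occurs_within(candidate, s, d) for s in sequences):
            motifs.add(candidate)
    return motifs


if __name__ == "__main__":
    seqs = ["acgtac", "ccgtaa", "acgaac"]
    print(sorted(ld_motifs(seqs, 2, 0)))
```

Practical solvers avoid enumerating every candidate; the sketch only fixes what counts as a valid motif.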
http://www.mdpi.com/1999-4893/6/4/782
The stable matching problem (also known as the stable marriage problem) is a well-known problem of matching men to women, so that no man and woman, who are not married to each other, both prefer each other. Such a problem has a wide variety of practical applications, ranging from matching resident doctors to hospitals, to matching students to schools or, more generally, to any two-sided market. In the classical stable marriage problem, both men and women express a strict preference order over the members of the other sex, in a qualitative way. Here, we consider stable marriage problems with weighted preferences: each man (resp., woman) provides a score for each woman (resp., man). Such problems are more expressive than the classical stable marriage problems. Moreover, in some real-life situations, it is more natural to express scores (to model, for example, profits or costs) rather than a qualitative preference ordering. In this context, we define new notions of stability and optimality, and we provide algorithms to find marriages that are stable and/or optimal according to these notions. While expressivity greatly increases by adopting weighted preferences, we show that, in most cases, the desired solutions can be found by adapting existing algorithms for the classical stable marriage problem. We also consider the manipulability properties of the procedures that return such stable marriages. While we know that all procedures are manipulable by modifying the preference lists or by truncating them, here, we consider if manipulation can occur also by just modifying the weights while preserving the ordering and avoiding truncation. It turns out that, by adding weights, in some cases, we may increase the possibility of manipulating, and this cannot be avoided by any reasonable restriction on the weights.Algorithms2013-11-1864Article10.3390/a60407827828041999-48932013-11-18doi: 10.3390/a6040782Maria PiniFrancesca RossiK. VenableToby Walsh<![CDATA[Algorithms, Vol. 6, Pages 762-781: Very High Resolution Satellite Image Classification Using Fuzzy Rule-Based Systems]]>
http://www.mdpi.com/1999-4893/6/4/762
The aim of this research is to present a detailed step-by-step method for classification of very high resolution urban satellite images (VHRSI) into specific classes such as road, building, vegetation, etc., using fuzzy logic. In this study, object-based image analysis is used for image classification. The main problems in high resolution image classification are the uncertainties in the position of object borders in satellite images and also multiplex resemblance of the segments to different classes. In order to solve this problem, fuzzy logic is used for image classification, since it provides the possibility of image analysis using multiple parameters without requiring inclusion of certain thresholds in the class assignment process. In this study, an inclusive semi-automatic method for image classification is offered, which presents the configuration of the related fuzzy functions as well as fuzzy rules. The produced results are compared to the results of a normal classification using the same parameters, but with crisp rules. The overall accuracies and kappa coefficients of the presented method stand higher than the check projects.Algorithms2013-11-1264Article10.3390/a60407627627811999-48932013-11-12doi: 10.3390/a6040762Shabnam JabariYun Zhang<![CDATA[Algorithms, Vol. 6, Pages 747-761: Multi-Core Parallel Gradual Pattern Mining Based on Multi-Precision Fuzzy Orderings]]>
http://www.mdpi.com/1999-4893/6/4/747
Gradual patterns aim at describing co-variations of data such as the higher the size, the higher the weight. In recent years, such patterns have been studied more and more from the data mining point of view. The extraction of such patterns relies on efficient and smart orderings that can be built among data, for instance, when ordering the data with respect to the size, then the data are also ordered with respect to the weight. However, in many application domains, it is hardly possible to consider that data values are crisply ordered. When considering gene expression, it is not true from the biological point of view that Gene 1 is more expressed than Gene 2, if the levels of expression only differ from the tenth decimal. We thus consider fuzzy orderings and fuzzy gamma rank correlation. In this paper, we address two major problems related to this framework: (i) the high memory consumption and (ii) the precision, representation and efficient storage of the fuzzy concordance degrees versus the loss or gain of computing power. For this purpose, we consider multi-precision matrices represented using sparse matrices coupled with parallel algorithms. Experimental results show the interest of our proposal.Algorithms2013-11-0164Article10.3390/a60407477477611999-48932013-11-01doi: 10.3390/a6040747Nicolas SicardYogi AryadinataFederico Del Razo LopezAnne LaurentPerfecto Flores<![CDATA[Algorithms, Vol. 6, Pages 726-746: An Efficient Local Search for the Feedback Vertex Set Problem]]>
http://www.mdpi.com/1999-4893/6/4/726
Inspired by many deadlock detection applications, the feedback vertex set is defined as a set of vertices in an undirected graph, whose removal would result in a graph without cycles. The Feedback Vertex Set Problem, known to be NP-complete, is to search for a feedback vertex set with the minimal cardinality to benefit the deadlock recovery. To address the issue, this paper presents NewkLS FVS (LS, local search; FVS, feedback vertex set), a variable depth-based local search algorithm with a randomized scheme to optimize the efficiency and performance. Experimental simulations are conducted to compare the algorithm with recent metaheuristics, and the computational results show that the proposed algorithm can outperform the other state-of-the-art algorithms and generate satisfactory solutions for most DIMACS benchmarks.Algorithms2013-11-0164Article10.3390/a60407267267461999-48932013-11-01doi: 10.3390/a6040726Zhiqiang ZhangAnsheng YeXiaoqing ZhouZehui Shao<![CDATA[Algorithms, Vol. 6, Pages 702-725: New Parallel Sparse Direct Solvers for Multicore Architectures]]>
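As an illustration for the Feedback Vertex Set entry (doi: 10.3390/a6040726), the definition can be checked mechanically: a vertex set is a feedback vertex set exactly when the remaining graph is a forest. The sketch below tests this with a union-find structure. It is only a verifier for candidate solutions, not the NewkLS FVS local search, and the function name and the vertex numbering 0..n-1 are assumptions made for this example.

```python
def is_feedback_vertex_set(n, edges, removed):
    """Return True if deleting `removed` from the undirected graph leaves no cycle.

    n       -- number of vertices, labelled 0..n-1 (assumed for this sketch)
    edges   -- iterable of (u, v) pairs; the graph is assumed to be simple
    removed -- set of vertices proposed as the feedback vertex set
    """
    parent = list(range(n))

    def find(v):
        # Find the component representative, halving paths along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in removed or v in removed:
            continue  # edge disappears together with a removed endpoint
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # u and v already connected: this edge closes a cycle
        parent[ru] = rv
    return True


if __name__ == "__main__":
    triangle_with_tail = [(0, 1), (1, 2), (2, 0), (2, 3)]
    print(is_feedback_vertex_set(4, triangle_with_tail, {0}))
```

A local search for a small feedback vertex set could call such a check after each move, although an incremental test would be cheaper in practice.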
http://www.mdpi.com/1999-4893/6/4/702
At the heart of many computations in science and engineering lies the need to efficiently and accurately solve large sparse linear systems of equations. Direct methods are frequently the method of choice because of their robustness, accuracy and potential for use as black-box solvers. In the last few years, there have been many new developments, and a number of new modern parallel general-purpose sparse solvers have been written for inclusion within the HSL mathematical software library. In this paper, we introduce and briefly review these solvers for symmetric sparse systems. We describe the algorithms used, highlight key features (including bit-compatibility and out-of-core working) and then, using problems arising from a range of practical applications, we illustrate and compare their performances. We demonstrate that modern direct solvers are able to accurately solve systems of order 10^6 in less than 3 minutes on a 16-core machine.Algorithms2013-11-0164Article10.3390/a60407027027251999-48932013-11-01doi: 10.3390/a6040702Jonathan HoggJennifer Scott<![CDATA[Algorithms, Vol. 6, Pages 678-701: Pattern-Guided k-Anonymity]]>
http://www.mdpi.com/1999-4893/6/4/678
We suggest a user-oriented approach to combinatorial data anonymization. A data matrix is called k-anonymous if every row appears at least k times—the goal of the NP-hard k-ANONYMITY problem then is to make a given matrix k-anonymous by suppressing (blanking out) as few entries as possible. Building on previous work and coping with corresponding deficiencies, we describe an enhanced k-anonymization problem called PATTERN-GUIDED k-ANONYMITY, where the users specify in which combinations suppressions may occur. In this way, the user of the anonymized data can express the differing importance of various data features. We show that PATTERN-GUIDED k-ANONYMITY is NP-hard. We complement this by a fixed-parameter tractability result based on a “data-driven parameterization” and, based on this, develop an exact integer linear program (ILP)-based solution method, as well as a simple, but very effective, greedy heuristic. Experiments on several real-world datasets show that our heuristic easily matches up to the established “Mondrian” algorithm for k-ANONYMITY in terms of the quality of the anonymization and outperforms it in terms of running time.Algorithms2013-10-1764Article10.3390/a60406786787011999-48932013-10-17doi: 10.3390/a6040678Robert BredereckAndré NichterleinRolf Niedermeier<![CDATA[Algorithms, Vol. 6, Pages 636-677: Sublinear Time Motif Discovery from Multiple Sequences]]>
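As an illustration for the Pattern-Guided k-Anonymity entry (doi: 10.3390/a6040678), the basic k-anonymity condition (every row appears at least k times) and the suppression operation (blanking out entries) can be written down directly. This minimal sketch only checks the condition and suppresses whole columns. It does not implement the pattern-guided variant, the ILP, or the greedy heuristic, and the function names and the "*" blank symbol are assumptions made for this example.

```python
from collections import Counter


def is_k_anonymous(rows, k):
    """True if every distinct row of the data matrix occurs at least k times."""
    counts = Counter(tuple(row) for row in rows)
    return all(count >= k for count in counts.values())


def suppress_columns(rows, columns, blank="*"):
    """Return a copy of the matrix with the given column indices blanked out."""
    return [
        [blank if j in columns else value for j, value in enumerate(row)]
        for row in rows
    ]


def suppressed_entries(rows, blank="*"):
    """Number of blanked entries, the cost that k-ANONYMITY asks to minimise."""
    return sum(value == blank for row in rows for value in row)


if __name__ == "__main__":
    data = [[1, "a"], [1, "b"], [2, "a"], [2, "b"]]
    print(is_k_anonymous(data, 2))             # rows are all distinct
    anonymised = suppress_columns(data, {1})
    print(is_k_anonymous(anonymised, 2), suppressed_entries(anonymised))
```

Suppressing an entire column is usually far from optimal; it is used here only because it keeps the example short.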
http://www.mdpi.com/1999-4893/6/4/636
In this paper, a natural probabilistic model for motif discovery has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet, Σ. A motif G = g1g2 ... gm is a string of m characters. In each background sequence is implanted a probabilistically-generated approximate copy of G. For a probabilistically-generated approximate copy b1b2 ... bm of G, every character, bi, is probabilistically generated, such that the probability for bi ≠ gi is at most α. We develop two new randomized algorithms and one new deterministic algorithm. They make advancements in the following aspects: (1) The algorithms are much faster than those before. Our algorithms can even run in sublinear time. (2) They can handle any motif pattern. (3) The restriction for the alphabet size is a lower bound of four. This gives them potential applications in practical problems, since gene sequences have an alphabet size of four. (4) All algorithms have rigorous proofs about their performances. The methods developed in this paper have been used in the software implementation. We observed some encouraging results that show improved performance for motif detection compared with other software.Algorithms2013-10-1464Article10.3390/a60406366366771999-48932013-10-14doi: 10.3390/a6040636Bin FuYunhui FuYuan Xue<![CDATA[Algorithms, Vol. 6, Pages 618-635: Multi-Threading a State-of-the-Art Maximum Clique Algorithm]]>
http://www.mdpi.com/1999-4893/6/4/618
We present a threaded parallel adaptation of a state-of-the-art maximum clique algorithm for dense, computationally challenging graphs. We show that near-linear speedups are achievable in practice and that superlinear speedups are common. We include results for several previously unsolved benchmark problems.Algorithms2013-10-0364Article10.3390/a60406186186351999-48932013-10-03doi: 10.3390/a6040618Ciaran McCreeshPatrick Prosser<![CDATA[Algorithms, Vol. 6, Pages 591-617: Local Search Approaches in Stable Matching Problems]]>
http://www.mdpi.com/1999-4893/6/4/591
The stable marriage (SM) problem has a wide variety of practical applications, ranging from matching resident doctors to hospitals, to matching students to schools or, more generally, to any two-sided market. In the classical formulation, n men and n women express their preferences (via a strict total order) over the members of the other sex. Solving an SM problem means finding a stable marriage where stability is an envy-free notion: no man and woman who are not married to each other would both prefer each other to their partners or to being single. We consider both the classical stable marriage problem and one of its useful variations (denoted SMTI (Stable Marriage with Ties and Incomplete lists)) where the men and women express their preferences in the form of an incomplete preference list with ties over a subset of the members of the other sex. Matchings are permitted only with people who appear in these preference lists, and we try to find a stable matching that marries as many people as possible. Whilst the SM problem is polynomial to solve, the SMTI problem is NP-hard. We propose to tackle both problems via a local search approach, which exploits properties of the problems to reduce the size of the neighborhood and to make local moves efficiently. We empirically evaluate our algorithm for SM problems by measuring its runtime behavior and its ability to sample the lattice of all possible stable marriages. We evaluate our algorithm for SMTI problems in terms of both its runtime behavior and its ability to find a maximum cardinality stable marriage. Experimental results suggest that for SM problems, the number of steps of our algorithm grows only as O(n log(n)), and that it samples very well the set of all stable marriages. It is thus a fair and efficient approach to generate stable marriages. 
Furthermore, our approach for SMTI problems is able to solve large problems, quickly returning stable matchings of large and often optimal size, despite the NP-hardness of this problem.Algorithms2013-10-0364Article10.3390/a60405915916171999-48932013-10-03doi: 10.3390/a6040591Mirco GelainMaria PiniFrancesca RossiK. VenableToby Walsh<![CDATA[Algorithms, Vol. 6, Pages 565-590: An Emergent Approach to Text Analysis Based on a Connectionist Model and the Web]]>
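As an illustration for the Local Search Approaches in Stable Matching Problems entry (doi: 10.3390/a6040591), the notion of stability for the classical SM problem can be made concrete with a blocking-pair check, paired here with the classical Gale-Shapley proposal algorithm. This is a sketch of the textbook polynomial-time method, not the local search proposed in the article. It assumes complete strict preference lists and equally many men and women, and all function and variable names are chosen for this example.

```python
def gale_shapley(men_prefs, women_prefs):
    """Men-proposing Gale-Shapley; returns a dict mapping each man to his partner."""
    # rank[w][m] = position of m in w's list (lower is better)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of the next woman to propose to
    husband = {}
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = husband.get(w)
        if current is None:
            husband[w] = m
        elif rank[w][m] < rank[w][current]:
            husband[w] = m
            free.append(current)  # the displaced man becomes free again
        else:
            free.append(m)        # rejected; he proposes to his next choice later
    return {m: w for w, m in husband.items()}


def blocking_pairs(matching, men_prefs, women_prefs):
    """Pairs (m, w), not matched together, who both prefer each other to their partners."""
    husband = {w: m for m, w in matching.items()}
    pairs = []
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break  # every later woman is worse for m than his partner
            w_prefs = women_prefs[w]
            if w_prefs.index(m) < w_prefs.index(husband[w]):
                pairs.append((m, w))
    return pairs


if __name__ == "__main__":
    men = {"a": ["x", "y"], "b": ["x", "y"]}
    women = {"x": ["b", "a"], "y": ["a", "b"]}
    matching = gale_shapley(men, women)
    print(matching, blocking_pairs(matching, men, women))
```

A matching is stable exactly when `blocking_pairs` returns an empty list, which is also a natural quantity for a local search to drive towards zero.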
http://www.mdpi.com/1999-4893/6/3/565
In this paper, we present a method to provide proactive assistance in text checking, based on usage relationships between words structuralized on the Web. For a given sentence, the method builds a connectionist structure of relationships between word n-grams. Such structure is then parameterized by means of an unsupervised and language agnostic optimization process. Finally, the method provides a representation of the sentence that allows emerging the least prominent usage-based relational patterns, helping to easily find badly-written and unpopular text. The study includes the problem statement and its characterization in the literature, as well as the proposed solving approach and some experimental use.Algorithms2013-09-1763Article10.3390/a60305655655901999-48932013-09-17doi: 10.3390/a6030565Mario CiminoGigliola Vaglini<![CDATA[Algorithms, Vol. 6, Pages 546-564: Quantitative Trait Loci Mapping Problem: An Extinction-Based Multi-Objective Evolutionary Algorithm Approach]]>
http://www.mdpi.com/1999-4893/6/3/546
The Quantitative Trait Loci (QTL) mapping problem aims to identify regions in the genome that are linked to phenotypic features of the developed organism that vary in degree. It is a principal step in determining targets for further genetic analysis and is key in decoding the role of specific genes that control quantitative traits within species. Applications include identifying genetic causes of disease, optimization of cross-breeding for desired traits and understanding trait diversity in populations. In this paper a new multi-objective evolutionary algorithm (MOEA) method is introduced and is shown to increase the accuracy of QTL mapping identification for both independent and epistatic loci interactions. The MOEA method optimizes over the space of possible partial least squares (PLS) regression QTL models and considers the conflicting objectives of model simplicity versus model accuracy. By optimizing for minimal model complexity, MOEA has the advantage of solving the over-fitting problem of conventional PLS models. The effectiveness of the method is confirmed by comparing the new method with Bayesian Interval Mapping approaches over a series of test cases where the optimal solutions are known. This approach can be applied to many problems that arise in analysis of genomic data sets where the number of features far exceeds the number of observations and where features can be highly correlated.Algorithms2013-09-0263Article10.3390/a60305465465641999-48932013-09-02doi: 10.3390/a6030546Ahmadreza GhaffarizadehMehdi EftekhariAli EsmailizadehNicholas Flann<![CDATA[Algorithms, Vol. 6, Pages 532-545: Stable Flows over Time]]>
http://www.mdpi.com/1999-4893/6/3/532
In this paper, the notion of stability is extended to network flows over time. As a useful device in our proofs, we present an elegant preflow-push variant of the Gale-Shapley algorithm that operates directly on the given network and computes stable flows in pseudo-polynomial time, both in the static flow and the flow over time case. We show periodical properties of stable flows over time on networks with an infinite time horizon. Finally, we discuss the influence of storage at vertices, with different results depending on the priority of the corresponding holdover edges.Algorithms2013-08-2163Article10.3390/a60305325325451999-48932013-08-21doi: 10.3390/a6030532Ágnes CsehJannik MatuschkeMartin Skutella<![CDATA[Algorithms, Vol. 6, Pages 512-531: Extraction and Segmentation of Sputum Cells for Lung Cancer Early Diagnosis]]>
http://www.mdpi.com/1999-4893/6/3/512
Lung cancer has been the largest cause of cancer deaths worldwide with an overall 5-year survival rate of only 15%. Its symptoms can be found exclusively in advanced stages where the chances for patients to survive are very low, thus making the mortality rate the highest among all other types of cancer. The present work deals with the attempt to design computer-aided detection or diagnosis (CAD) systems for early detection of lung cancer based on the analysis of sputum color images. The aim is to reduce the false negative rate and to increase the true positive rate as much as possible. The early detection of lung cancer from sputum images is a challenging problem, due to both the structure of the cancer cells and the stained method which are employed in the formulation of the sputum cells. We present here a framework for the extraction and segmentation of sputum cells in sputum images using, respectively, a threshold classifier, a Bayesian classification and mean shift segmentation. Our methods are validated and compared with other competitive techniques via a series of experimentation conducted with a data set of 100 images. The extraction and segmentation results will be used as a base for a CAD system for early detection of lung cancer which will improve the chances of survival for the patient.Algorithms2013-08-2163Article10.3390/a60305125125311999-48932013-08-21doi: 10.3390/a6030512Fatma TaherNaoufel WerghiHussain Al-AhmadChristian Donner<![CDATA[Algorithms, Vol. 6, Pages 494-511: An Algorithm for Managing Aircraft Movement on an Airport Surface]]>
http://www.mdpi.com/1999-4893/6/3/494
The present paper focuses on the development of an algorithm for safely and optimally managing the routing of aircraft on an airport surface in future airport operations. This tool is intended to support air traffic controllers’ decision-making in selecting the paths of all aircraft and the engine startup approval time for departing ones. Optimal routes are sought for minimizing the time both arriving and departing aircraft spend on an airport surface with engines on, with benefits in terms of safety, efficiency and costs. The proposed algorithm first computes a standalone, shortest path solution from runway to apron or vice versa, depending on the aircraft being inbound or outbound, respectively. For taking into account the constraints due to other traffic on an airport surface, this solution is amended by a conflict detection and resolution task that attempts to reduce and possibly nullify the number of conflicts generated in the first phase. An example application on a simple Italian airport illustrates how the algorithm can be applied in real-world settings. Emphasis is placed on how to model an airport surface as a weighted and directed graph with non-negative weights, as required for the input to the algorithm.Algorithms2013-08-1663Article10.3390/a60304944945111999-48932013-08-16doi: 10.3390/a6030494Urbano TancrediDomenico AccardoGiancarmine FasanoAlfredo RengaGiancarlo RufinoGiuseppe Maresca<![CDATA[Algorithms, Vol. 6, Pages 485-493: A Simple Algorithm for Solving for the Generalized Longest Common Subsequence (LCS) Problem with a Substring Exclusion Constraint]]>
http://www.mdpi.com/1999-4893/6/3/485
This paper studies the string-excluding (STR-EC)-constrained longest common subsequence (LCS) problem, a generalized LCS problem. For the two input sequences, X and Y, of lengths n and m, and a constraint string, P, of length r, the goal is to find the longest common subsequence, Z, of X and Y that excludes P as a substring. The problem and its solution were first proposed by Chen and Chao, but we found that their algorithm cannot solve the problem correctly. A new dynamic programming solution for the STR-EC-LCS problem is then presented in this paper, and the correctness of the new algorithm is proven. The time complexity of the new algorithm is O(nmr).Algorithms2013-08-1563Article10.3390/a60304854854931999-48932013-08-15doi: 10.3390/a6030485Daxin ZhuXiaodong Wang<![CDATA[Algorithms, Vol. 6, Pages 471-484: Linear Time Local Approximation Algorithm for Maximum Stable Marriage]]>
http://www.mdpi.com/1999-4893/6/3/471
We consider a two-sided market under incomplete preference lists with ties, where the goal is to find a maximum size stable matching. The problem is APX-hard, and a 3/2-approximation was given by McDermid [1]. This algorithm has a non-linear running time, and, more importantly, needs global knowledge of all preference lists. We present a very natural, economically reasonable, local, linear time algorithm with the same ratio, using some ideas of Paluch [2]. In this algorithm every person makes decisions using only their own list, and some information asked from members of these lists (as in the case of the famous algorithm of Gale and Shapley). Some consequences for the Hospitals/Residents problem are also discussed.Algorithms2013-08-1563Article10.3390/a60304714714841999-48932013-08-15doi: 10.3390/a6030471Zoltán Király<![CDATA[Algorithms, Vol. 6, Pages 459-470: Ubiquitous Integrity via Network Integration and Parallelism—Sustaining Pedestrian/Bike Urbanism]]>
http://www.mdpi.com/1999-4893/6/3/459
Nowadays, due to the concern regarding environmental issues, establishing pedestrian/bike friendly urbanism is widely encouraged. To promote safety-assured, mobile communication environments, efficient, reliable maintenance, and information integrity need to be designed, especially in highly possibly interfered places. For busy traffic areas, regular degree-3 dedicated short range communication (DSRC) networks are safety and information featured with availability, reliability, and maintainability in paths of multi-lanes. For sparsely populated areas, probes of wireless sensors are rational, especially if sensor nodes can be organized to enhance security, reliability, and flexibility. Applying alternative network topologies, such as spider-webs, generalized honeycomb tori, and cube-connected cycles, for comparing and analyzing is proposed in DSRC and cellular communications to enhance integrity in communications.Algorithms2013-08-1263Article10.3390/a60304594594701999-48932013-08-12doi: 10.3390/a6030459Li-Yen Hsu<![CDATA[Algorithms, Vol. 6, Pages 457-458: Special Issue on Graph Algorithms]]>
http://www.mdpi.com/1999-4893/6/3/457
This special issue of Algorithms is devoted to the design and analysis of algorithms for solving combinatorial problems of a theoretical or practical nature involving graphs, with a focus on computational complexity.Algorithms2013-08-1263Editorial10.3390/a60304574574581999-48932013-08-12doi: 10.3390/a6030457Jesper Jansson<![CDATA[Algorithms, Vol. 6, Pages 442-456: A Review of Routing Protocols Based on Ant-Like Mobile Agents]]>
http://www.mdpi.com/1999-4893/6/3/442
A survey on the routing protocols based on ant-like mobile agents is given. These protocols are often employed in Mobile Ad Hoc Networks (MANET). Mobile Ad Hoc Networks are collections of wireless mobile nodes such as PDAs, laptop computers, and cellular phones having wireless communication capability that dynamically form a temporary network without using any existing network infrastructures such as wireless access points. The only infrastructure in MANET is the wireless communication interfaces on the devices. In such a circumstance, where some of the wireless devices are not within wireless range of each other, multi-hop routing is required to transmit messages to the destination. A node that wants to start communication with other nodes that are not within its one-hop wireless transmission range has to request intermediate nodes to forward their communication packets to the destination. In this paper, we survey a variety of proposed network protocols to accommodate this situation. We focus especially on biologically-inspired routing algorithms that are based on the ant colony optimization algorithm.Algorithms2013-08-0663Review10.3390/a60304424424561999-48932013-08-06doi: 10.3390/a6030442Yasushi Kambayashi<![CDATA[Algorithms, Vol. 6, Pages 430-441: Efficient in silico Chromosomal Representation of Populations via Indexing Ancestral Genomes]]>
http://www.mdpi.com/1999-4893/6/3/430
One of the major challenges in handling realistic forward simulations for plant and animal breeding is the sheer number of markers. Due to advancing technologies, the requirement has quickly grown from hundreds of markers to millions. Most simulators are lagging behind in handling these sizes, since they do not scale well. We present a scheme for representing and manipulating such realistic size genomes, without any loss of information. Usually, the simulation is forward and over tens to hundreds of generations with hundreds of thousands of individuals at each generation. We demonstrate through simulations that our representation can be two orders of magnitude faster and handle at least two orders of magnitude more markers than existing software on realistic breeding scenarios.Algorithms2013-07-3063Article10.3390/a60304304304411999-48932013-07-30doi: 10.3390/a6030430Niina HaiminenFilippo UtroClaude LebretonPascal FlamentZivan KaramanLaxmi Parida<![CDATA[Algorithms, Vol. 6, Pages 407-429: Noise Reduction for Nonlinear Nonstationary Time Series Data using Averaging Intrinsic Mode Function]]>
http://www.mdpi.com/1999-4893/6/3/407
A novel noise filtering algorithm based on averaging Intrinsic Mode Function (aIMF), which is a derivation of Empirical Mode Decomposition (EMD), is proposed to remove white Gaussian noise from foreign currency exchange rates, which are nonlinear nonstationary time series signals. Noise patterns with different amplitudes and frequencies were randomly mixed into the five exchange rates. A number of filters, namely the Extended Kalman Filter (EKF), Wavelet Transform (WT), Particle Filter (PF) and the averaging Intrinsic Mode Function (aIMF) algorithm, were used to compare filtering and smoothing performance. Among these filters, the aIMF algorithm demonstrated high noise reduction.Algorithms2013-07-1963Article10.3390/a60304074074291999-48932013-07-19doi: 10.3390/a6030407Bhusana PremanodeJumlong VongprasertChristofer Toumazou<![CDATA[Algorithms, Vol. 6, Pages 396-406: New Heuristics for Rooted Triplet Consistency]]>
http://www.mdpi.com/1999-4893/6/3/396
Rooted triplets are becoming one of the most important types of input for reconstructing rooted phylogenies. A rooted triplet is a phylogenetic tree on three leaves and shows the evolutionary relationship of the corresponding three species. In this paper, we investigate the problem of inferring the maximum consensus evolutionary tree from a set of rooted triplets. This problem is known to be APX-hard. We present two new heuristic algorithms. For a given set of m triplets on n species, the FastTree algorithm runs in O(m + α(n)n^2) time, where α(n) is the functional inverse of Ackermann’s function. This is faster than any other previously known algorithms, although the outcome is less satisfactory. The Best Pair Merge with Total Reconstruction (BPMTR) algorithm runs in O(mn^3) time and, on average, performs better than any other previously known algorithms for this problem.Algorithms2013-07-1163Article10.3390/a60303963964061999-48932013-07-11doi: 10.3390/a6030396Soheil JahangiriSeyed HashemiHadi Poormohammadi<![CDATA[Algorithms, Vol. 6, Pages 383-395: Maximum Locally Stable Matchings]]>
http://www.mdpi.com/1999-4893/6/3/383
Motivated by the observation that most companies are more likely to consider job applicants referred by their employees than those who applied on their own, Arcaute and Vassilvitskii modeled a job market that integrates social networks into stable matchings in an interesting way. We call their model HR+SN because an instance of their model is an ordered pair (I, G) where I is a typical instance of the Hospital/Residents problem (HR) and G is a graph that describes the social network (SN) of the residents in I. A matching μ of hospitals and residents has a local blocking pair (h, r) if (h, r) is a blocking pair of μ, and there is a resident r' such that r' is simultaneously an employee of h in the matching and a neighbor of r in G. Such a pair is likely to compromise the matching because the participants have access to each other through r': r can give her resume to r' who can then forward it to h. A locally stable matching is a matching with no local blocking pairs. The cardinality of the locally stable matchings of I can vary. This paper presents a variety of results on computing a locally stable matching with maximum cardinality.Algorithms2013-06-2463Article10.3390/a60303833833951999-48932013-06-24doi: 10.3390/a6030383Christine ChengEric McDermid<![CDATA[Algorithms, Vol. 6, Pages 371-382: Improving Man-Optimal Stable Matchings by Minimum Change of Preference Lists]]>
http://www.mdpi.com/1999-4893/6/2/371
In the stable marriage problem, any instance admits the so-called man-optimal stable matching, in which every man is assigned the best possible partner. However, there are instances for which all men receive low-ranked partners even in the man-optimal stable matching. In this paper we consider the problem of improving the man-optimal stable matching by changing only one man’s preference list. We show that the optimization variant and the decision variant of this problem can be solved in time O(n^3) and O(n^2), respectively, where n is the number of men (women) in an input. We further extend the problem so that we are allowed to change k men’s preference lists. We show that the problem is W[1]-hard with respect to the parameter k and give O(n^{2k+1})-time and O(n^{k+1})-time exact algorithms for the optimization and decision variants, respectively. Finally, we show that the problems become easy when k = n; we give O(n^{2.5} log n)-time and O(n^2)-time algorithms for the optimization and decision variants, respectively.Algorithms2013-05-2862Article10.3390/a60203713713821999-48932013-05-28doi: 10.3390/a6020371Takao InoshitaRobert IrvingKazuo IwamaShuichi MiyazakiTakashi Nagase<![CDATA[Algorithms, Vol. 6, Pages 352-370: Filtering Degenerate Patterns with Application to Protein Sequence Analysis]]>
http://www.mdpi.com/1999-4893/6/2/352
In biology, the notion of degenerate pattern plays a central role for describing various phenomena. For example, protein active site patterns, like those contained in the PROSITE database, e.g., [FY]DPC[LIM][ASG]C[ASG], are, in general, represented by degenerate patterns with character classes. Researchers have developed several approaches over the years to discover degenerate patterns. Although these methods have been exhaustively and successfully tested on genomes and proteins, their outcomes often far exceed the size of the original input, making the output hard to manage and to interpret through refined analysis that requires manual inspection. In this paper, we discuss a characterization of degenerate patterns with character classes, without gaps, and we introduce the concept of pattern priority for comparing and ranking different patterns. We define the class of underlying patterns for filtering any set of degenerate patterns into a new set that is linear in the size of the input sequence. We present some preliminary results on the detection of subtle signals in protein families. Results show that our approach drastically reduces the number of patterns in output for a tool for protein analysis, while retaining the representative patterns.Algorithms2013-05-2262Article10.3390/a60203523523701999-48932013-05-22doi: 10.3390/a6020352Matteo CominDavide Verzotto<![CDATA[Algorithms, Vol. 6, Pages 319-351: Practical Compressed Suffix Trees]]>
http://www.mdpi.com/1999-4893/6/2/319
The suffix tree is an extremely important data structure in bioinformatics. Classical implementations require much space, which renders them useless to handle large sequence collections. Recent research has obtained various compressed representations for suffix trees, with widely different space-time tradeoffs. In this paper we show how the use of range min-max trees yields novel representations achieving practical space/time tradeoffs. In addition, we show how those trees can be modified to index highly repetitive collections, obtaining the first compressed suffix tree representation that effectively adapts to that scenario.Algorithms2013-05-2162Article10.3390/a60203193193511999-48932013-05-21doi: 10.3390/a6020319Andrés AbeliukRodrigo CánovasGonzalo Navarro<![CDATA[Algorithms, Vol. 6, Pages 309-318: Multi-Sided Compression Performance Assessment of ABI SOLiD WES Data]]>
http://www.mdpi.com/1999-4893/6/2/309
Data storage has been a major and growing part of IT budgets for research for many years. Especially in biology, the amount of raw data products is growing continuously, and the advent of the so-called "next-generation" sequencers has made things worse. Affordable prices have pushed scientists to massively sequence whole genomes and to screen large cohorts of patients, thereby producing tons of data as a side effect. The need for maximally fitting data into the available storage volumes has encouraged and welcomed new compression algorithms and tools. We focus here on state-of-the-art compression tools and measure their compression performance on ABI SOLiD data.Algorithms2013-05-2162Article10.3390/a60203093093181999-48932013-05-21doi: 10.3390/a6020309Tommaso MazzaStefano Castellana<![CDATA[Algorithms, Vol. 6, Pages 278-308: A Generic Two-Phase Stochastic Variable Neighborhood Approach for Effectively Solving the Nurse Rostering Problem]]>
http://www.mdpi.com/1999-4893/6/2/278
In this contribution, a generic two-phase stochastic variable neighborhood approach is applied to nurse rostering problems. The proposed algorithm is used for creating feasible and efficient nurse rosters for many different nurse rostering cases. In order to demonstrate the efficiency and generic applicability of the proposed approach, experiments with real-world input data coming from many different nurse rostering cases have been conducted. The nurse rostering instances used have significant differences in nature, structure, philosophy and the type of hard and soft constraints. Computational results show that the proposed algorithm performs better than six different existing approaches applied to the same nurse rostering input instances using the same evaluation criteria. In addition, in all cases, it manages to reach the best-known fitness achieved in the literature, and in one case, it manages to beat the best-known fitness achieved till now.Algorithms2013-05-2162Article10.3390/a60202782783081999-48932013-05-21doi: 10.3390/a6020278Ioannis SolosIoannis TassopoulosGrigorios Beligiannis<![CDATA[Algorithms, Vol. 6, Pages 245-277: Fast Rescheduling of Multiple Workflows to Constrained Heterogeneous Resources Using Multi-Criteria Memetic Computing]]>
http://www.mdpi.com/1999-4893/6/2/245
This paper is motivated by, but not limited to, the task of scheduling jobs organized in workflows to a computational grid. Due to the dynamic nature of grid computing, more or less permanent replanning is required so that only very limited time is available to come up with a revised plan. To meet the requirements of both users and resource owners, a multi-objective optimization comprising execution time and costs is needed. This paper summarizes our work over the last six years in this field, and reports new results obtained by the combination of heuristics and evolutionary search in an adaptive Memetic Algorithm. We will show how different heuristics contribute to solving varying replanning scenarios and investigate the question of the maximum manageable work load for a grid of growing size starting with a load of 200 jobs and 20 resources up to 7000 jobs and 700 resources. Furthermore, the effect of four different local searchers incorporated into the evolutionary search is studied. We will also report briefly on approaches that failed within the short time frame given for planning.Algorithms2013-04-2262Article10.3390/a60202452452771999-48932013-04-22doi: 10.3390/a6020245Wilfried JakobSylvia StrackAlexander QuinteGünther BengelKarl-Uwe StuckyWolfgang Süß<![CDATA[Algorithms, Vol. 6, Pages 227-244: Solving University Course Timetabling Problems Using Constriction Particle Swarm Optimization with Local Search]]>
http://www.mdpi.com/1999-4893/6/2/227
Course timetabling is a combinatorial optimization problem and has been confirmed to be an NP-complete problem. Course timetabling problems are different for different universities. The studied university course timetabling problem involves hard constraints such as classroom, class curriculum, and other variables. Concurrently, some soft constraints need also to be considered, including teacher’s preferred time, favorite class time etc. These preferences correspond to satisfaction values obtained via questionnaires. Particle swarm optimization (PSO) is a promising scheme for solving NP-complete problems due to its fast convergence, fewer parameter settings and ability to fit dynamic environmental characteristics. Therefore, PSO was applied towards solving course timetabling problems in this work. To reduce the computational complexity, a timeslot was designated in a particle’s encoding as the scheduling unit. Two types of PSO, the inertia weight version and constriction version, were evaluated. Moreover, an interchange heuristic was utilized to explore the neighboring solution space to improve solution quality. Additionally, schedule conflicts are handled after a solution has been generated. Experimental results demonstrate that the proposed scheme of constriction PSO with interchange heuristic is able to generate satisfactory course timetables that meet the requirements of teachers and classes according to the various applied constraints.Algorithms2013-04-1962Article10.3390/a60202272272441999-48932013-04-19doi: 10.3390/a6020227Ruey-Maw ChenHsiao-Fang Shih<![CDATA[Algorithms, Vol. 6, Pages 197-226: Enforcing Security Mechanisms in the IP-Based Internet of Things: An Algorithmic Overview]]>
http://www.mdpi.com/1999-4893/6/2/197
The Internet of Things (IoT) refers to the Internet-like structure of billions of interconnected constrained devices, denoted as “smart objects”. Smart objects have limited capabilities, in terms of computational power and memory, and might be battery-powered devices, thus raising the need to adopt particularly energy efficient technologies. Among the most notable challenges that building interconnected smart objects brings about, there are standardization and interoperability. The use of IP has been foreseen as the standard for interoperability for smart objects. As billions of smart objects are expected to come to life and IPv4 addresses have eventually reached depletion, IPv6 has been identified as a candidate for smart-object communication. The deployment of the IoT raises many security issues coming from (i) the very nature of smart objects, e.g., the adoption of lightweight cryptographic algorithms, in terms of processing and memory requirements; and (ii) the use of standard protocols, e.g., the need to minimize the amount of data exchanged between nodes. This paper provides a detailed overview of the security challenges related to the deployment of smart objects. Security protocols at network, transport, and application layers are discussed, together with lightweight cryptographic algorithms proposed to be used instead of conventional and demanding ones, in terms of computational resources. Security aspects, such as key distribution and security bootstrapping, and application scenarios, such as secure data aggregation and service authorization, are also discussed.Algorithms2013-04-0262Article10.3390/a60201971972261999-48932013-04-02doi: 10.3390/a6020197Simone CiraniGianluigi FerrariLuca Veltri<![CDATA[Algorithms, Vol. 6, Pages 169-196: An Open-Source Implementation of the Critical-Line Algorithm for Portfolio Optimization]]>
http://www.mdpi.com/1999-4893/6/1/169
Portfolio optimization is one of the problems most frequently encountered by financial practitioners. The main goal of this paper is to fill a gap in the literature by providing a well-documented, step-by-step open-source implementation of the Critical Line Algorithm (CLA) in a scientific language. The code is implemented as a Python class object, which allows it to be imported like any other Python module, and integrated seamlessly with pre-existing code. We discuss the logic behind CLA following the algorithm’s decision flow. In addition, we developed several utilities that support finding answers to recurrent practical problems. We believe this publication will offer a better alternative to financial practitioners, many of whom are currently relying on generic-purpose optimizers which often deliver suboptimal solutions. The source code discussed in this paper can be downloaded at the authors’ websites (see Appendix).Algorithms2013-03-2261Article10.3390/a60101691691961999-48932013-03-22doi: 10.3390/a6010169David BaileyMarcos López de Prado<![CDATA[Algorithms, Vol. 6, Pages 161-168: Stable Multicommodity Flows]]>
http://www.mdpi.com/1999-4893/6/1/161
We extend the stable flow model of Fleiner to multicommodity flows. In addition to the preference lists of agents on trading partners for each commodity, every trading pair has a preference list on the commodities that the seller can sell to the buyer. A blocking walk (with respect to a certain commodity) may include saturated arcs, provided that a positive amount of less preferred commodity is traded along the arc. We prove that a stable multicommodity flow always exists, although it is PPAD-hard to find one.Algorithms2013-03-1861Article10.3390/a60101611611681999-48932013-03-18doi: 10.3390/a6010161Tamás KirályJúlia Pap<![CDATA[Algorithms, Vol. 6, Pages 136-160: Algorithms for Non-Negatively Constrained Maximum Penalized Likelihood Reconstruction in Tomographic Imaging]]>
http://www.mdpi.com/1999-4893/6/1/136
Image reconstruction is a key component in many medical imaging modalities. The problem of image reconstruction can be viewed as a special inverse problem where the unknown image pixel intensities are estimated from the observed measurements. Since the measurements are usually noise contaminated, statistical reconstruction methods are preferred. In this paper we review some non-negatively constrained simultaneous iterative algorithms for maximum penalized likelihood reconstructions, where all measurements are used to estimate all pixel intensities in each iteration.Algorithms2013-03-1261Review10.3390/a60101361361601999-48932013-03-12doi: 10.3390/a6010136Jun Ma<![CDATA[Algorithms, Vol. 6, Pages 119-135: A Polynomial-Time Algorithm for Computing the Maximum Common Connected Edge Subgraph of Outerplanar Graphs of Bounded Degree]]>
http://www.mdpi.com/1999-4893/6/1/119
The maximum common connected edge subgraph problem is to find a connected graph with the maximum number of edges that is isomorphic to a subgraph of each of the two input graphs, where it has applications in pattern recognition and chemistry. This paper presents a dynamic programming algorithm for the problem when the two input graphs are outerplanar graphs of a bounded vertex degree, where it is known that the problem is NP-hard, even for outerplanar graphs of an unbounded degree. Although the algorithm repeatedly modifies input graphs, it is shown that the number of relevant subproblems is polynomially bounded, and thus, the algorithm works in polynomial time.Algorithms2013-02-1861Article10.3390/a60101191191351999-48932013-02-18doi: 10.3390/a6010119Tatsuya AkutsuTakeyuki Tamura<![CDATA[Algorithms, Vol. 6, Pages 100-118: Computing the Eccentricity Distribution of Large Graphs]]>
http://www.mdpi.com/1999-4893/6/1/100
The eccentricity of a node in a graph is defined as the length of a longest shortest path starting at that node. The eccentricity distribution over all nodes is a relevant descriptive property of the graph, and its extreme values allow the derivation of measures such as the radius, diameter, center and periphery of the graph. This paper describes two new methods for computing the eccentricity distribution of large graphs such as social networks, web graphs, biological networks and routing networks. We first propose an exact algorithm based on eccentricity lower and upper bounds, which achieves significant speedups compared to the straightforward algorithm when computing both the extreme values of the distribution as well as the eccentricity distribution as a whole. The second algorithm that we describe is a hybrid strategy that combines the exact approach with an efficient sampling technique in order to obtain an even larger speedup on the computation of the entire eccentricity distribution. We perform an extensive set of experiments on a number of large graphs in order to measure and compare the performance of our algorithms, and demonstrate how we can efficiently compute the eccentricity distribution of various large real-world graphs.Algorithms2013-02-1861Article10.3390/a60101001001181999-48932013-02-18doi: 10.3390/a6010100Frank TakesWalter Kosters<![CDATA[Algorithms, Vol. 6, Pages 84-99: Dubins Traveling Salesman Problem with Neighborhoods: A Graph-Based Approach]]>
http://www.mdpi.com/1999-4893/6/1/84
We study the problem of finding the minimum-length curvature constrained closed path through a set of regions in the plane. This problem is referred to as the Dubins Traveling Salesperson Problem with Neighborhoods (DTSPN). An algorithm is presented that uses sampling to cast this infinite dimensional combinatorial optimization problem as a Generalized Traveling Salesperson Problem (GTSP) with intersecting node sets. The GTSP is then converted to an Asymmetric Traveling Salesperson Problem (ATSP) through a series of graph transformations, thus allowing the use of existing approximation algorithms. This algorithm is shown to perform no worse than the best existing DTSPN algorithm and is shown to perform significantly better when the regions overlap. We report on the application of this algorithm to route an Unmanned Aerial Vehicle (UAV) equipped with a radio to collect data from sparsely deployed ground sensors in a field demonstration of autonomous detection, localization, and verification of multiple acoustic events.Algorithms2013-02-0461Article10.3390/a601008484991999-48932013-02-04doi: 10.3390/a6010084Jason IsaacsJoão Hespanha<![CDATA[Algorithms, Vol. 6, Pages 60-83: Tractabilities and Intractabilities on Geometric Intersection Graphs]]>
http://www.mdpi.com/1999-4893/6/1/60
A graph is said to be an intersection graph if there is a set of objects such that each vertex corresponds to an object and two vertices are adjacent if and only if the corresponding objects have a nonempty intersection. There are several natural graph classes that have geometric intersection representations. The geometric representations sometimes help to prove tractability/intractability of problems on graph classes. In this paper, we show some results proved by using geometric representations.Algorithms2013-01-2561Article10.3390/a601006060831999-48932013-01-25doi: 10.3390/a6010060Ryuhei Uehara<![CDATA[Algorithms, Vol. 6, Pages 43-59: Computational Study on a PTAS for Planar Dominating Set Problem]]>
http://www.mdpi.com/1999-4893/6/1/43
The dominating set problem is a core NP-hard problem in combinatorial optimization and graph theory, and has many important applications. Baker [JACM 41,1994] introduces a k-outer planar graph decomposition-based framework for designing polynomial time approximation schemes (PTAS) for a class of NP-hard problems in planar graphs. It is mentioned that the framework can be applied to obtain an O(2^{ck} n)-time (1+1/k)-approximation algorithm, where c is a constant, for the planar dominating set problem. We show that the approximation ratio achieved by the mentioned application of the framework is not bounded by any constant for the planar dominating set problem. We modify the application of the framework to give a PTAS for the planar dominating set problem. With k-outer planar graph decompositions, the modified PTAS has an approximation ratio (1 + 2/k). Using 2k-outer planar graph decompositions, the modified PTAS achieves the approximation ratio (1+1/k) in O(2^{2ck} n) time. We report a computational study on the modified PTAS. Our results show that the modified PTAS is practical.Algorithms2013-01-2161Article10.3390/a601004343591999-48932013-01-21doi: 10.3390/a6010043Marjan MarzbanQian-Ping Gu<![CDATA[Algorithms, Vol. 6, Pages 29-42: Energy Efficient Routing in Wireless Sensor Networks Through Balanced Clustering]]>
http://www.mdpi.com/1999-4893/6/1/29
The wide utilization of Wireless Sensor Networks (WSNs) is obstructed by the severely limited energy constraints of the individual sensor nodes. This is the reason why a large part of the research in WSNs focuses on the development of energy efficient routing protocols. In this paper, a new protocol called Equalized Cluster Head Election Routing Protocol (ECHERP), which pursues energy conservation through balanced clustering, is proposed. ECHERP models the network as a linear system and, using the Gaussian elimination algorithm, calculates the combinations of nodes that can be chosen as cluster heads in order to extend the network lifetime. The performance evaluation of ECHERP is carried out through simulation tests, which evince the effectiveness of this protocol in terms of network energy efficiency when compared against other well-known protocols.Algorithms2013-01-1861Article10.3390/a601002929421999-48932013-01-18doi: 10.3390/a6010029Stefanos NikolidakisDionisis KandrisDimitrios VergadosChristos Douligeris<![CDATA[Algorithms, Vol. 6, Pages 12-28: ℓ1 Major Component Detection and Analysis (ℓ1 MCDA): Foundations in Two Dimensions]]>
http://www.mdpi.com/1999-4893/6/1/12
Principal Component Analysis (PCA) is widely used for identifying the major components of statistically distributed point clouds. Robust versions of PCA, often based in part on the ℓ1 norm (rather than the ℓ2 norm), are increasingly used, especially for point clouds with many outliers. Neither standard PCA nor robust PCAs can provide, without additional assumptions, reliable information for outlier-rich point clouds and for distributions with several main directions (spokes). We carry out a fundamental and complete reformulation of the PCA approach in a framework based exclusively on the ℓ1 norm and heavy-tailed distributions. The ℓ1 Major Component Detection and Analysis (ℓ1 MCDA) that we propose can determine the main directions and the radial extent of 2D data from single or multiple superimposed Gaussian or heavy-tailed distributions without and with patterned artificial outliers (clutter). In nearly all cases in the computational results, 2D ℓ1 MCDA has accuracy superior to that of standard PCA and of two robust PCAs, namely, the projection-pursuit method of Croux and Ruiz-Gazen and the ℓ1 factorization method of Ke and Kanade. (Standard PCA is, of course, superior to ℓ1 MCDA for Gaussian-distributed point clouds.) The computing time of ℓ1 MCDA is competitive with the computing times of the two robust PCAs.Algorithms2013-01-1761Article10.3390/a601001212281999-48932013-01-17doi: 10.3390/a6010012Ye TianQingwei JinJohn LaveryShu-Cherng Fang<![CDATA[Algorithms, Vol. 6, Pages 1-11: Maximum Disjoint Paths on Edge-Colored Graphs: Approximability and Tractability]]>
http://www.mdpi.com/1999-4893/6/1/1
The problem of finding the maximum number of vertex-disjoint uni-color paths in an edge-colored graph has been recently introduced in the literature, motivated by applications in social network analysis. In this paper we investigate the approximation and parameterized complexity of the problem. First, we show that, for any constant ε > 0, the problem is not approximable within factor c^(1-ε), where c is the number of colors, and that the corresponding decision problem is W[1]-hard when parameterized by the number of disjoint paths. Then, we present a fixed-parameter algorithm for the problem parameterized by the number and the length of the disjoint paths.Algorithms2012-12-2761Article10.3390/a60100011111999-48932012-12-27doi: 10.3390/a6010001Paola BonizzoniRiccardo DondiYuri Pirola<![CDATA[Algorithms, Vol. 5, Pages 654-667: Extracting Co-Occurrence Relations from ZDDs]]>
http://www.mdpi.com/1999-4893/5/4/654
A zero-suppressed binary decision diagram (ZDD) is a graph representation suitable for handling sparse set families. Given a ZDD representing a set family, we present an efficient algorithm to discover a hidden structure, called a co-occurrence relation, on the ground set. This computation can be done in time complexity that is related not to the number of sets, but to some feature values of the ZDD. We furthermore introduce a conditional co-occurrence relation and present an extraction algorithm, which enables us to discover further structural information.Algorithms2012-12-1354Article10.3390/a50406546546671999-48932012-12-13doi: 10.3390/a5040654Takahisa Toda<![CDATA[Algorithms, Vol. 5, Pages 636-653: Edge Detection from MRI and DTI Images with an Anisotropic Vector Field Flow Using a Divergence Map]]>
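As a rough illustration of the co-occurrence relation mentioned in the ZDD abstract by Takahisa Toda (Pages 654-667), the following sketch computes it naively on an explicitly listed set family. It is not the ZDD-based algorithm of the article: it assumes that two elements co-occur when they belong to exactly the same member sets, and the function name `cooccurrence_classes` is hypothetical.

```python
from collections import defaultdict


def cooccurrence_classes(family):
    """Group elements that appear in exactly the same sets of the family.

    Naive explicit computation (assumption: x and y co-occur when every
    member set contains both or neither). Runs in time proportional to the
    total size of the family, unlike a ZDD-based method.
    """
    # Map each element to the indices of the sets that contain it.
    signature = defaultdict(set)
    for index, member in enumerate(family):
        for element in member:
            signature[element].add(index)

    # Elements with identical signatures form one co-occurrence class.
    classes = defaultdict(list)
    for element, indices in signature.items():
        classes[frozenset(indices)].append(element)

    return sorted(sorted(group) for group in classes.values())


family = [{"a", "b", "c"}, {"a", "b"}, {"c", "d"}]
print(cooccurrence_classes(family))
```

Here "a" and "b" always appear together, while "c" and "d" each form a class of their own.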
http://www.mdpi.com/1999-4893/5/4/636
The aim of this work is the extraction of edges from Magnetic Resonance Imaging (MRI) and Diffusion Tensor Imaging (DTI) images by a deformable contour procedure, using an external force field derived from an anisotropic flow. Moreover, we introduce a divergence map in order to check the convergence of the process. As we know from vector calculus, divergence is a measure of the magnitude of a vector field's convergence at a given point. Thus, by means of level curves of the divergence map, we have automatically selected an initial contour for the deformation process. If the initial curve includes the areas from which the vector field diverges, the field will be able to push the curve towards the edges. Furthermore, the divergence map highlights the presence of curves pointing to the most significant geometric parts of boundaries corresponding to high curvature values. In this way, the skeleton of the extracted object will be rather well defined and may subsequently be employed in shape analysis and morphological studies.Algorithms2012-12-1354Article10.3390/a50406366366531999-48932012-12-13doi: 10.3390/a5040636Donatella Giuliani<![CDATA[Algorithms, Vol. 5, Pages 629-635: Testing Goodness of Fit of Random Graph Models]]>
http://www.mdpi.com/1999-4893/5/4/629
Random graphs are matrices with independent 0–1 elements with probabilities determined by a small number of parameters. One of the oldest models is the Rasch model where the odds are ratios of positive numbers scaling the rows and columns. Later, Persi Diaconis and his coworkers rediscovered the model for symmetric matrices and called the model beta. Here we give goodness-of-fit tests for the model and extend the model to a version of the block model introduced by Holland, Laskey and Leinhardt.Algorithms2012-12-0654Article10.3390/a50406296296351999-48932012-12-06doi: 10.3390/a5040629Villő CsiszárPéter HussamiJános KomlósTamás MóriLídia RejtőGábor Tusnády<![CDATA[Algorithms, Vol. 5, Pages 604-628: Laplace–Fourier Transform of the Stretched Exponential Function: Analytic Error Bounds, Double Exponential Transform, and Open-Source Implementation “libkww”]]>
http://www.mdpi.com/1999-4893/5/4/604
The C library libkww provides functions to compute the Kohlrausch–Williams–Watts function, i.e., the Laplace–Fourier transform of the stretched (or compressed) exponential function exp(-t^β) for exponents β between 0.1 and 1.9 with double precision. Analytic error bounds are derived for the low and high frequency series expansions. For intermediate frequencies, the numeric integration is enormously accelerated by using the Ooura–Mori double exponential transformation. The primitive of the cosine transform needed for the convolution integrals is also implemented. The software is hosted at http://apps.jcns.fz-juelich.de/kww; version 3.0 is deposited as supplementary material to this article.Algorithms2012-11-2254Article10.3390/a50406046046281999-48932012-11-22doi: 10.3390/a5040604Joachim Wuttke<![CDATA[Algorithms, Vol. 5, Pages 588-603: An Efficient Algorithm for Automatic Peak Detection in Noisy Periodic and Quasi-Periodic Signals]]>
http://www.mdpi.com/1999-4893/5/4/588
We present a new method for automatic detection of peaks in noisy periodic and quasi-periodic signals. The new method, called automatic multiscale-based peak detection (AMPD), is based on the calculation and analysis of the local maxima scalogram, a matrix comprising the scale-dependent occurrences of local maxima. The usefulness of the proposed method is shown by applying the AMPD algorithm to simulated and real-world signals.Algorithms2012-11-2154Article10.3390/a50405885886031999-48932012-11-21doi: 10.3390/a5040588Felix ScholkmannJens BossMartin Wolf<![CDATA[Algorithms, Vol. 5, Pages 545-587: Exact Algorithms for Maximum Clique: A Computational Study ]]>
http://www.mdpi.com/1999-4893/5/4/545
We investigate a number of recently reported exact algorithms for the maximum clique problem. The program code is presented and analyzed to show how small changes in implementation can have a drastic effect on performance. The computational study demonstrates how problem features and hardware platforms influence algorithm behaviour. The effect of vertex ordering is investigated. One of the algorithms (MCS) is broken into its constituent parts and we discover that one of these parts frequently degrades performance. It is shown that the standard procedure used for rescaling published results (i.e., adjusting run times based on the calibration of a standard program over a set of benchmarks) is unsafe and can lead to incorrect conclusions being drawn from empirical data.Algorithms2012-11-1954Article10.3390/a50405455455871999-48932012-11-19doi: 10.3390/a5040545Patrick Prosser<![CDATA[Algorithms, Vol. 5, Pages 529-544: Finite Element Quadrature of Regularized Discontinuous and Singular Level Set Functions in 3D Problems]]>
http://www.mdpi.com/1999-4893/5/4/529
Regularized Heaviside and Dirac delta functions are used in several fields of computational physics and mechanics. Hence the issue of the quadrature of integrals of discontinuous and singular functions arises. In order to avoid ad-hoc quadrature procedures, regularization of the discontinuous and the singular fields is often carried out. In particular, weight functions of the signed distance with respect to the discontinuity interface are exploited. Tornberg and Engquist (Journal of Scientific Computing, 2003, 19: 527–552) proved that the use of compact support weight functions is not suitable because it leads to errors that do not vanish for decreasing mesh size. They proposed the adoption of non-compact support weight functions. In the present contribution, the relationship between the Fourier transform of the weight functions and the accuracy of the regularization procedure is exploited. The proposed regularized approach was implemented in the eXtended Finite Element Method. As a three-dimensional example, we study a slender solid characterized by an inclined interface across which the displacement is discontinuous. The accuracy is evaluated for varying positions of the discontinuity interfaces with respect to the underlying mesh. A procedure for the choice of the regularization parameters is proposed.Algorithms2012-11-0754Article10.3390/a50405295295441999-48932012-11-07doi: 10.3390/a5040529Elena BenvenutiGiulio VenturaNicola Ponara<![CDATA[Algorithms, Vol. 5, Pages 521-528: Alpha-Beta Pruning and Althöfer’s Pathology-Free Negamax Algorithm]]>
http://www.mdpi.com/1999-4893/5/4/521
The minimax algorithm, also called the negamax algorithm, remains today the most widely used search technique for two-player perfect-information games. However, minimaxing has been shown to be susceptible to game tree pathology, a paradoxical situation in which the accuracy of the search can decrease as the height of the tree increases. Althöfer’s alternative minimax algorithm has been proven to be invulnerable to pathology. However, it has not been clear whether alpha-beta pruning, a crucial component of practical game programs, could be applied in the context of Althöfer’s algorithm. In this brief paper, we show how alpha-beta pruning can be adapted to Althöfer’s algorithm.Algorithms2012-11-0554Article10.3390/a50405215215281999-48932012-11-05doi: 10.3390/a5040521Ashraf M. Abdelbar<![CDATA[Algorithms, Vol. 5, Pages 506-520: Extracting Hierarchies from Data Clusters for Better Classification]]>
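For readers of the alpha-beta abstract by Ashraf M. Abdelbar (Pages 521-528), a minimal sketch of the standard negamax search with alpha-beta pruning may help fix ideas. This is the conventional algorithm, not Althöfer's pathology-free variant or the adaptation described in the article. The tree encoding (nested lists with numeric leaves, each leaf scored from the point of view of the player to move at that leaf) and the `visited` list are assumptions made only for this illustration.

```python
def negamax(node, alpha, beta, visited):
    """Return the negamax value of `node` using alpha-beta pruning.

    `node` is either a number (a leaf score for the player to move there)
    or a list of child nodes. `visited` records the leaves actually
    evaluated, so the effect of pruning can be observed.
    """
    if not isinstance(node, list):
        visited.append(node)
        return node

    best = float("-inf")
    for child in node:
        # The child's value is from the opponent's view, so negate it and
        # swap/negate the search window.
        score = -negamax(child, -beta, -alpha, visited)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # Cutoff: the opponent would never allow this line.
    return best


# Depth-2 tree: the root player picks a branch, the opponent picks a leaf.
tree = [[3, 5], [6, 9], [1, 2]]
visited = []
value = negamax(tree, float("-inf"), float("inf"), visited)
print(value, visited)
```

In this example the last leaf (2) is never evaluated: once the opponent can force a score of 1 in the third branch, that branch cannot beat the 6 already secured in the second.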
http://www.mdpi.com/1999-4893/5/4/506
In this paper we present the PHOCS-2 algorithm, which extracts a “Predicted Hierarchy Of ClassifierS”. The extracted hierarchy helps us to enhance the performance of flat classification. Nodes in the hierarchy contain classifiers. Each intermediate node corresponds to a set of classes and each leaf node corresponds to a single class. In PHOCS-2, we make an estimate for each node and achieve a more precise computation of false positives, true positives and false negatives. Stopping criteria are based on the results of the flat classification. The proposed algorithm is validated against nine datasets.Algorithms2012-10-2354Article10.3390/a50405065065201999-48932012-10-23doi: 10.3390/a5040506German SapozhnikovAlexander Ulanov<![CDATA[Algorithms, Vol. 5, Pages 490-505: The Effects of Tabular-Based Content Extraction on Patent Document Clustering]]>
http://www.mdpi.com/1999-4893/5/4/490
Data can be represented in many different ways within a particular document or set of documents. Hence, attempts to automatically process the relationships between documents or determine the relevance of certain document objects can be problematic. In this study, we have developed software to automatically catalog objects contained in HTML files for patents granted by the United States Patent and Trademark Office (USPTO). Once these objects are recognized, the software creates metadata that assigns a data type to each document object. Such metadata can be easily processed and analyzed for subsequent text mining tasks. Specifically, document similarity and clustering techniques were applied to a subset of the USPTO document collection. Although our preliminary results demonstrate that tables and numerical data do not provide quantifiable value to a document’s content, the stage for future work in measuring the importance of document objects within a large corpus has been set.Algorithms2012-10-2254Article10.3390/a50404904905051999-48932012-10-22doi: 10.3390/a5040490Denise R. KoesslerBenjamin W. MartinBruce E. KieferMichael W. Berry<![CDATA[Algorithms, Vol. 5, Pages 469-489: Contextual Anomaly Detection in Text Data]]>
http://www.mdpi.com/1999-4893/5/4/469
We propose using side information to further inform anomaly detection algorithms of the semantic context of the text data they are analyzing, thereby considering both divergence from the statistical pattern seen in particular datasets and divergence seen from more general semantic expectations. Computational experiments show that our algorithm performs as expected on data that reflect real-world events with contextual ambiguity, while replicating conventional clustering on data that are either too specialized or generic to result in contextual information being actionable. These results suggest that our algorithm could potentially reduce false positive rates in existing anomaly detection systems.Algorithms2012-10-1954Article10.3390/a50404694694891999-48932012-10-19doi: 10.3390/a5040469Amogh MahapatraNisheeth SrivastavaJaideep Srivastava<![CDATA[Algorithms, Vol. 5, Pages 449-468: Forecasting the Unit Cost of a Product with Some Linear Fuzzy Collaborative Forecasting Models]]>
http://www.mdpi.com/1999-4893/5/4/449
Forecasting the unit cost of every product type in a factory is an important task. However, it is not easy to deal with the uncertainty of the unit cost. Fuzzy collaborative forecasting is a very effective treatment of the uncertainty in the distributed environment. This paper presents some linear fuzzy collaborative forecasting models to predict the unit cost of a product. In these models, the experts’ forecasts differ and therefore need to be aggregated through collaboration. According to the experimental results, the effectiveness of forecasting the unit cost was considerably improved through collaboration.Algorithms2012-10-1554Article10.3390/a50404494494681999-48932012-10-15doi: 10.3390/a5040449Toly Chen<![CDATA[Algorithms, Vol. 5, Pages 433-448: Interaction Enhanced Imperialist Competitive Algorithms]]>
http://www.mdpi.com/1999-4893/5/4/433
The Imperialist Competitive Algorithm (ICA) is a new population-based evolutionary algorithm. It divides its population of solutions into several sub-populations, and then searches for the optimal solution through two operations: assimilation and competition. The assimilation operation moves each non-best solution (called a colony) in a sub-population toward the best solution (called the imperialist) in the same sub-population. The competition operation removes a colony from the weakest sub-population and adds it to another sub-population. Previous work on ICA focuses mostly on improving the assimilation operation or replacing the assimilation operation with more powerful meta-heuristics, but none focuses on the improvement of the competition operation. Since the competition operation simply moves a colony (i.e., an inferior solution) from one sub-population to another sub-population, it incurs weak interaction among these sub-populations. This work proposes Interaction Enhanced ICA, which strengthens the interaction among the imperialists of all sub-populations. The performance of Interaction Enhanced ICA is validated on a set of benchmark functions for global optimization. The results indicate that the performance of Interaction Enhanced ICA is superior to that of ICA and its existing variants.Algorithms2012-10-1554Article10.3390/a50404334334481999-48932012-10-15doi: 10.3390/a5040433Jun-Lin LinYu-Hsiang TsaiChun-Ying YuMeng-Shiou Li<![CDATA[Algorithms, Vol. 5, Pages 421-432: Univariate Lp and lp Averaging, 0 < p < 1, in Polynomial Time by Utilization of Statistical Structure]]>
http://www.mdpi.com/1999-4893/5/4/421
We present evidence that one can calculate generically combinatorially expensive Lp and lp averages, 0 < p < 1, in polynomial time by restricting the data to come from a wide class of statistical distributions. Our approach differs from the approaches in the previous literature, which are based on a priori sparsity requirements or on accepting a local minimum as a replacement for a global minimum. The functionals by which Lp averages are calculated are not convex but are radially monotonic and the functionals by which lp averages are calculated are nearly so, which are the keys to solvability in polynomial time. Analytical results for symmetric, radially monotonic univariate distributions are presented. An algorithm for univariate lp averaging is presented. Computational results for a Gaussian distribution, a class of symmetric heavy-tailed distributions and a class of asymmetric heavy-tailed distributions are presented. Many phenomena in human-based areas are increasingly known to be represented by data that have large numbers of outliers and belong to very heavy-tailed distributions. When tails of distributions are so heavy that even medians (L1 and l1 averages) do not exist, one needs to consider using lp minimization principles with 0 < p < 1.Algorithms2012-10-0554Article10.3390/a50404214214321999-48932012-10-05doi: 10.3390/a5040421John E. Lavery<![CDATA[Algorithms, Vol. 5, Pages 398-420: Better Metrics to Automatically Predict the Quality of a Text Summary]]>
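To make the lp averaging functional from the abstract by John E. Lavery (Pages 421-432) concrete, here is a brute-force sketch for univariate data. It is not the polynomial-time algorithm of the article. It relies on the fact that, for 0 < p <= 1, each term |x_i - a|^p is concave in a on either side of x_i, so the sum is minimized at one of the data points; the function name `lp_average` and the quadratic-time search are assumptions made for illustration.

```python
def lp_average(data, p):
    """Return a data point minimizing sum(|x - a|**p) over a, for 0 < p <= 1.

    Brute force: evaluates the cost at every data point (O(n^2) work).
    On ties, the first minimizing data point is returned.
    """
    if not 0 < p <= 1:
        raise ValueError("this sketch only covers 0 < p <= 1")
    if not data:
        raise ValueError("data must be non-empty")

    def cost(a):
        return sum(abs(x - a) ** p for x in data)

    return min(data, key=cost)


sample = [0.0, 1.0, 1.1, 10.0]
print(lp_average(sample, 0.5))
```

With p = 0.5 the outlier at 10.0 has little pull, and the average settles on the point inside the tight cluster near 1.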
http://www.mdpi.com/1999-4893/5/4/398
In this paper we demonstrate a family of metrics for estimating the quality of a text summary relative to one or more human-generated summaries. The improved metrics are based on features automatically computed from the summaries to measure content and linguistic quality. The features are combined using one of three methods—robust regression, non-negative least squares, or canonical correlation, an eigenvalue method. The new metrics significantly outperform the previous standard for automatic text summarization evaluation, ROUGE.Algorithms2012-09-2654Article10.3390/a50403983984201999-48932012-09-26doi: 10.3390/a5040398Peter A. RankelJohn M. ConroyJudith D. Schlesinger<![CDATA[Algorithms, Vol. 5, Pages 379-397: Monitoring Threshold Functions over Distributed Data Streams with Node Dependent Constraints]]>
http://www.mdpi.com/1999-4893/5/3/379
Monitoring data streams in a distributed system has attracted considerable interest in recent years. The task of feature selection (e.g., by monitoring the information gain of various features) requires a very high communication overhead when addressed using straightforward centralized algorithms. While most of the existing algorithms deal with monitoring simple aggregated values such as frequency of occurrence of stream items, motivated by recent contributions based on geometric ideas we present an alternative approach. The proposed approach enables monitoring values of an arbitrary threshold function over distributed data streams through stream dependent constraints applied separately on each stream. We report numerical experiments on real-world data that detect instances where communication between nodes is required, and compare the approach and the results to those recently reported in the literature.Algorithms2012-09-1853Article10.3390/a50303793793971999-48932012-09-18doi: 10.3390/a5030379Yaakov MalinovskyJacob Kogan<![CDATA[Algorithms, Vol. 5, Pages 364-378: Incremental Clustering of News Reports]]>
http://www.mdpi.com/1999-4893/5/3/364
When an event occurs in the real world, numerous news reports describing this event start to appear on different news sites within a few minutes of the event occurrence. This may result in a huge amount of information for users, and automated processes may be required to help manage this information. In this paper, we describe a clustering system that can cluster news reports from disparate sources into event-centric clusters—i.e., clusters of news reports describing the same event. A user can identify any RSS feed as a source of news he/she would like to receive and our clustering system can cluster reports received from the separate RSS feeds as they arrive without knowing the number of clusters in advance. Our clustering system was designed to function well in an online incremental environment. In evaluating our system, we found that our system is very good in performing fine-grained clustering, but performs rather poorly when performing coarser-grained clustering.Algorithms2012-08-2453Article10.3390/a50303643643781999-48932012-08-24doi: 10.3390/a5030364Joel AzzopardiChristopher Staff<![CDATA[Algorithms, Vol. 5, Pages 330-363: Use of Logistic Regression for Forecasting Short-Term Volcanic Activity]]>
http://www.mdpi.com/1999-4893/5/3/330
An algorithm that forecasts volcanic activity using an event tree decision making framework and logistic regression has been developed, characterized, and validated. The suite of empirical models that drive the system were derived from a sparse and geographically diverse dataset comprised of source modeling results, volcano monitoring data, and historic information from analog volcanoes. Bootstrapping techniques were applied to the training dataset to allow for the estimation of robust logistic model coefficients. Probabilities generated from the logistic models increase with positive modeling results, escalating seismicity, and rising eruption frequency. Cross validation yielded a series of receiver operating characteristic curves with areas ranging between 0.78 and 0.81, indicating that the algorithm has good forecasting capabilities. Our results suggest that the logistic models are highly transportable and can compete with, and in some cases outperform, non-transportable empirical models trained with site specific information.Algorithms2012-08-2253Article10.3390/a50303303303631999-48932012-08-22doi: 10.3390/a5030330William N. JunekLinwood W. JonesMark T. Woods<![CDATA[Algorithms, Vol. 5, Pages 318-329: Mammographic Segmentation Using WaveCluster]]>
http://www.mdpi.com/1999-4893/5/3/318
Segmentation of clinically relevant regions from potentially noisy images represents a significant challenge in the field of mammography. We propose novel approaches based on the WaveCluster clustering algorithm for segmenting both the breast profile in the presence of significant acquisition noise and segmenting regions of interest (ROIs) within the breast. Using prior manual segmentations performed by domain experts as ground truth data, we apply our method to 150 film mammograms with significant acquisition noise from the University of South Florida’s Digital Database for Screening Mammography. We then apply a similar segmentation procedure to detect the position and extent of suspicious regions of interest. Our approach was able to segment the breast profile from all 150 images, leaving minor residual noise adjacent to the breast in three. Performance on ROI extraction was also excellent, with 81% sensitivity and 0.96 false positives per image when measured against manually segmented ground truth ROIs. When not utilizing image morphology, our approach ran in linear time with the input size. These results highlight the potential of WaveCluster as a useful addition to the mammographic segmentation repertoire.Algorithms2012-08-1053Article10.3390/a50303183183291999-48932012-08-10doi: 10.3390/a5030318Michael Barnathan<![CDATA[Algorithms, Vol. 5, Pages 304-317: An Agent-Based Fuzzy Collaborative Intelligence Approach for Predicting the Price of a Dynamic Random Access Memory (DRAM) Product]]>
http://www.mdpi.com/1999-4893/5/2/304
Predicting the price of a dynamic random access memory (DRAM) product is a critical task to the manufacturer. However, it is not easy to contend with the uncertainty of the price. In order to effectively predict the price of a DRAM product, an agent-based fuzzy collaborative intelligence approach is proposed in this study. In the agent-based fuzzy collaborative intelligence approach, each agent uses a fuzzy neural network to predict the DRAM price based on its view. The agent then communicates its view and forecasting results to other agents with the aid of an automatic collaboration mechanism. According to the experimental results, the overall performance was improved through the agents’ collaboration.Algorithms2012-05-2452Article10.3390/a50203043043171999-48932012-05-24doi: 10.3390/a5020304Toly Chen<![CDATA[Algorithms, Vol. 5, Pages 289-303: Modeling and Performance Analysis to Predict the Behavior of a Divisible Load Application in a Cloud Computing Environment]]>
http://www.mdpi.com/1999-4893/5/2/289
Cloud computing is an emerging technology where IT resources are virtualized to users as a set of unified computing resources on a pay per use basis. The resources are dynamically chosen to satisfy a user Service Level Agreement and a required level of performance. Divisible load applications occur in many scientific and engineering applications and can easily be mapped to a Cloud using a master-worker pattern. However, those applications pose challenges to obtaining the required performance. We model the processing of divisible load application tasks on a set of cloud resources. We derive a novel model and formulas for computing the blocking probability in the system. The formulas are useful to analyze and predict the behavior of a divisible load application on a chosen set of resources to satisfy a Service Level Agreement before the implementation phase, thus saving time and platform energy. They are also useful as dynamic feedback to a cloud scheduler for optimal scheduling. We evaluate the model in a set of illustrative scenarios.Algorithms2012-05-1152Article10.3390/a50202892893031999-48932012-05-11doi: 10.3390/a5020289Leila IsmailLiren Zhang<![CDATA[Algorithms, Vol. 5, Pages 273-288: Imaginary Cubes and Their Puzzles]]>
http://www.mdpi.com/1999-4893/5/2/273
Imaginary cubes are three dimensional objects which have square silhouette projections in three orthogonal ways just as a cube has. In this paper, we study imaginary cubes and present assembly puzzles based on them. We show that there are 16 equivalence classes of minimal convex imaginary cubes, among whose representatives are a hexagonal bipyramid imaginary cube and a triangular antiprism imaginary cube. Our main puzzle is to put three of the former and six of the latter pieces into a cube-box with an edge length of twice the size of the original cube. Solutions of this puzzle are based on remarkable properties of these two imaginary cubes, in particular, the possibility of tiling 3D Euclidean space.Algorithms2012-05-0952Article10.3390/a50202732732881999-48932012-05-09doi: 10.3390/a5020273Hideki Tsuiki<![CDATA[Algorithms, Vol. 5, Pages 261-272: A Polynomial-Time Reduction from the 3SAT Problem to the Generalized String Puzzle Problem]]>
http://www.mdpi.com/1999-4893/5/2/261
A disentanglement puzzle consists of mechanically interlinked pieces, and the puzzle is solved by disentangling one piece from another set of pieces. A string puzzle consists of strings entangled with one or more wooden pieces. We consider the generalized string puzzle problem whose input is the layout of strings and a wooden board with holes embedded in the 3-dimensional Euclidean space. We present a polynomial-time transformation from an arbitrary instance ƒ of the 3SAT problem to a string puzzle s such that ƒ is satisfiable if and only if s is solvable. Therefore, the generalized string puzzle problem is NP-hard.Algorithms2012-04-1352Article10.3390/a50202612612721999-48932012-04-13doi: 10.3390/a5020261Chuzo IwamotoKento SasakiKenichi Morita<![CDATA[Algorithms, Vol. 5, Pages 236-260: Content Sharing Graphs for Deduplication-Enabled Storage Systems]]>
http://www.mdpi.com/1999-4893/5/2/236
Deduplication in storage systems has gained momentum recently for its capability in reducing data footprint. However, deduplication introduces challenges to storage management as storage objects (e.g., files) are no longer independent from each other due to content sharing between these storage objects. In this paper, we present a graph-based framework to address the challenges of storage management due to deduplication. Specifically, we model content sharing among storage objects by content sharing graphs (CSG), and apply graph-based algorithms to two real-world storage management use cases for deduplication-enabled storage systems. First, a quasi-linear algorithm was developed to partition deduplication domains with a minimal amount of deduplication loss (i.e., data replicated across partitioned domains) in commercial deduplication-enabled storage systems, whereas in general the partitioning problem is NP-complete. For a real-world trace of 3 TB data with 978 GB of removable duplicates, the proposed algorithm can partition the data into 15 balanced partitions with only 54 GB of deduplication loss, that is, a 5% deduplication loss. Second, a quick and accurate method to query the deduplicated size for a subset of objects in deduplicated storage systems was developed. For the same trace of 3 TB data, the optimized graph-based algorithm can complete the query in 2.6 s, which is less than 1% of that of the traditional algorithm based on the deduplication metadata.Algorithms2012-04-1052Article10.3390/a50202362362601999-48932012-04-10doi: 10.3390/a5020236Maohua LuCornel ConstantinescuPrasenjit Sarkar<![CDATA[Algorithms, Vol. 5, Pages 214-235: An Online Algorithm for Lightweight Grammar-Based Compression]]>
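The two quantities discussed in the deduplication abstract by Lu, Constantinescu and Sarkar (Pages 236-260), the deduplicated size of a subset of objects and the deduplication loss of a partitioning, can be illustrated with a small metadata-based sketch. It does not use content sharing graphs and is not the optimized algorithm of the article; the data layout (objects as lists of chunk identifiers, a separate chunk-size table) and both function names are hypothetical.

```python
def deduplicated_size(objects, chunk_sizes, selected):
    """Total size of the distinct chunks referenced by the selected objects."""
    unique_chunks = set()
    for name in selected:
        unique_chunks.update(objects[name])
    return sum(chunk_sizes[chunk] for chunk in unique_chunks)


def deduplication_loss(objects, chunk_sizes, partitions):
    """Extra space needed when each partition deduplicates on its own.

    Computed as the sum of per-partition deduplicated sizes minus the
    deduplicated size of all objects kept in a single domain.
    """
    everything = [name for part in partitions for name in part]
    together = deduplicated_size(objects, chunk_sizes, everything)
    separate = sum(
        deduplicated_size(objects, chunk_sizes, part) for part in partitions
    )
    return separate - together


# Toy metadata: three files sharing some chunks (sizes in arbitrary units).
objects = {"f1": ["c1", "c2"], "f2": ["c2", "c3"], "f3": ["c3", "c4"]}
chunk_sizes = {"c1": 4, "c2": 8, "c3": 8, "c4": 4}

print(deduplicated_size(objects, chunk_sizes, ["f1", "f2"]))
print(deduplication_loss(objects, chunk_sizes, [["f1", "f2"], ["f3"]]))
```

In the toy data, splitting f3 away from f2 forces chunk c3 to be stored in both partitions, which is exactly the loss reported by the second call.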
http://www.mdpi.com/1999-4893/5/2/214
Grammar-based compression is a well-studied technique to construct a context-free grammar (CFG) deriving a given text uniquely. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees an O(log^2 n)-approximation ratio for the minimum grammar size, where n is the input size, and it runs in input linear time and output linear space. In addition, we propose a practical encoding, which transforms a restricted CFG into a more compact representation. Experimental results by comparison with standard compressors demonstrate that our algorithm is especially effective for highly repetitive text.Algorithms2012-04-1052Article10.3390/a50202142142351999-48932012-04-10doi: 10.3390/a5020214Shirou MaruyamaHiroshi SakamotoMasayuki Takeda<![CDATA[Algorithms, Vol. 5, Pages 176-213: Finding All Solutions and Instances of Numberlink and Slitherlink by ZDDs]]>
http://www.mdpi.com/1999-4893/5/2/176
Link puzzles involve finding paths or a cycle in a grid that satisfy given local and global properties. This paper proposes algorithms that enumerate solutions and instances of two link puzzles, Slitherlink and Numberlink, by zero-suppressed binary decision diagrams (ZDDs). A ZDD is a compact data structure for a family of sets provided with a rich family of set operations, by which, for example, one can easily extract a subfamily satisfying a desired property. Thanks to the nature of ZDDs, our algorithms offer a tool to assist users to design instances of those link puzzles.Algorithms2012-04-0552Article10.3390/a50201761762131999-48932012-04-05doi: 10.3390/a5020176Ryo YoshinakaToshiki SaitohJun KawaharaKoji TsurumaHiroaki IwashitaShin-ichi Minato<![CDATA[Algorithms, Vol. 5, Pages 158-175: An Integer Programming Approach to Solving Tantrix on Fixed Boards]]>
http://www.mdpi.com/1999-4893/5/1/158
Tantrix (Tantrix® is a registered trademark of Colour of Strategy Ltd. in New Zealand, and of TANTRIX JAPAN in Japan, respectively, under the license of M. McManaway, the inventor.) is a puzzle to make a loop by connecting lines drawn on hexagonal tiles, and the objective of this research is to solve it by a computer. For this purpose, we first give a problem setting of solving Tantrix as making a loop on a given fixed board. We then formulate it as an integer program by describing the rules of Tantrix as its constraints, and solve it by a mathematical programming solver to obtain a solution. As a result, we establish a formulation that can solve Tantrix of moderate size, and even when the solutions obtained from the elementary constraints alone are invalid, we obtain valid ones by introducing additional constraints and re-solving. With this approach we succeeded in solving Tantrix of size up to 60.Algorithms2012-03-2251Article10.3390/a50101581581751999-48932012-03-22doi: 10.3390/a5010158Fumika KinoYushi Uno<![CDATA[Algorithms, Vol. 5, Pages 148-157: Any Monotone Function Is Realized by Interlocked Polygons]]>
http://www.mdpi.com/1999-4893/5/1/148
Suppose there is a collection of n simple polygons in the plane, none of which overlap each other. The polygons are interlocked if no subset can be separated arbitrarily far from the rest. It is natural to ask for a characterization of the subsets that make the set of interlocked polygons free (not interlocked). This abstracts the essence of a kind of sliding block puzzle. We show that any monotone Boolean function ƒ on n variables can be described by m = O(n) interlocked polygons. We also show that the decision problem that asks if given polygons are interlocked is PSPACE-complete.Algorithms2012-03-1951Article10.3390/a50101481481571999-48932012-03-19doi: 10.3390/a5010148Erik D. DemaineMartin L. DemaineRyuhei Uehara<![CDATA[Algorithms, Vol. 5, Pages 113-147: A Semi-Preemptive Computational Service System with Limited Resources and Dynamic Resource Ranking]]>
http://www.mdpi.com/1999-4893/5/1/113
In this paper, we integrate a grid system and a wireless network to present a convenient computational service system, called the Semi-Preemptive Computational Service system (SePCS for short), which provides users with a wireless access environment and through which a user can share his/her resources with others. In the SePCS, each node is dynamically given a score based on its CPU level, available memory size, current length of waiting queue, CPU utilization and bandwidth. With the scores, resource nodes are classified into three levels. User requests based on their time constraints are also classified into three types. Resources of higher levels are allocated to more tightly constrained requests so as to increase the total performance of the system. To achieve this, a resource broker with the Semi-Preemptive Algorithm (SPA) is also proposed. When the resource broker cannot find suitable resources for the requests of higher type, it preempts the resource that is now executing a lower type request so that the request of higher type can be executed immediately. The SePCS can be applied to a Vehicular Ad Hoc Network (VANET), users of which can then exploit the convenient mobile network services and the wireless distributed computing. As a result, the performance of the system is higher than that of the tested schemes.Algorithms2012-03-1451Article10.3390/a50101131131471999-48932012-03-14doi: 10.3390/a5010113Fang-Yie LeuKeng-Yen ChaoMing-Chang LeeJia-Chun Lin<![CDATA[Algorithms, Vol. 5, Pages 98-112: Successive Standardization of Rectangular Arrays]]>
http://www.mdpi.com/1999-4893/5/1/98
In this note we illustrate and develop further, with mathematics and examples, the work on successive standardization (or normalization) that was studied earlier by the same authors in [1] and [2]. Thus, we deal with successive iterations applied to rectangular arrays of numbers, where to avoid technical difficulties an array has at least three rows and at least three columns. Without loss, an iteration begins with operations on columns: first subtract the mean of each column; then divide by its standard deviation. The iteration continues with the same two operations done successively for rows. These four operations applied in sequence complete one iteration. One then iterates again, and again, and again, ... In [1] it was argued that if arrays are made up of real numbers, then the set for which convergence of these successive iterations fails has Lebesgue measure 0. The limiting array has row and column means 0, row and column standard deviations 1. A basic result on convergence given in [1] is true, though the argument in [1] is faulty. The result is stated in the form of a theorem here, and the argument for the theorem is correct. Moreover, many graphics given in [1] suggest that except for a set of entries of any array with Lebesgue measure 0, convergence is very rapid, eventually exponentially fast in the number of iterations. Because we learned this set of rules from Bradley Efron, we call it “Efron’s algorithm”. More importantly, the rapidity of convergence is illustrated by numerical examples.Algorithms2012-02-2951Article10.3390/a5010098981121999-48932012-02-29doi: 10.3390/a5010098Richard A. OlshenBala Rajaratnam<![CDATA[Algorithms, Vol. 5, Pages 76-97: Visualization, Band Ordering and Compression of Hyperspectral Images]]>
http://www.mdpi.com/1999-4893/5/1/76
Air-borne and space-borne acquired hyperspectral images are used to recognize objects and to classify materials on the surface of the earth. The state of the art compressor for lossless compression of hyperspectral images is the Spectral oriented Least SQuares (SLSQ) compressor (see [1–7]). In this paper we discuss hyperspectral image compression: we show how to visualize each band of a hyperspectral image and how this visualization suggests that an appropriate band ordering can lead to improvements in the compression process. In particular, we consider two important distance measures for band ordering: Pearson’s Correlation and Bhattacharyya distance, and report on experimental results achieved by a Java-based implementation of SLSQ.Algorithms2012-02-2051Article10.3390/a501007676971999-48932012-02-20doi: 10.3390/a5010076Raffaele PizzolanteBruno Carpentieri<![CDATA[Algorithms, Vol. 5, Pages 56-75: Application of Genetic Control with Adaptive Scaling Scheme to Signal Acquisition in Global Navigation Satellite System Receiver]]>
http://www.mdpi.com/1999-4893/5/1/56
This paper presents a genetic-based control scheme that not only utilizes evolutionary characteristics to find the signal acquisition parameters, but also employs an adaptive scheme to control the search space and avoid the genetic control converging to local optimal value so as to acquire the desired signal precisely and rapidly. Simulations and experiment results show that the proposed method can improve the precision of signal parameters and take less signal acquisition time than traditional serial search methods for global navigation satellite system (GNSS) signals.Algorithms2012-02-1751Article10.3390/a501005656751999-48932012-02-17doi: 10.3390/a5010056Chung-Liang ChangHo-Nien Shou<![CDATA[Algorithms, Vol. 5, Pages 50-55: A Note on Sequence Prediction over Large Alphabets]]>
http://www.mdpi.com/1999-4893/5/1/50
Building on results from data compression, we prove nearly tight bounds on how well sequences of length n can be predicted in terms of the size σ of the alphabet and the length k of the context considered when making predictions. We compare the performance achievable by an adaptive predictor with no advance knowledge of the sequence, to the performance achievable by the optimal static predictor using a table listing the frequency of each (k + 1)-tuple in the sequence. We show that, if the elements of the sequence are chosen uniformly at random, then an adaptive predictor can compete in the expected case if k ≤ log_σ n − 3 − ε, for a constant ε > 0, but not if k ≥ log_σ n.Algorithms2012-02-1751Article10.3390/a501005050551999-48932012-02-17doi: 10.3390/a5010050Travis Gagie<![CDATA[Algorithms, Vol. 5, Pages 30-49: Standard and Specific Compression Techniques for DNA Microarray Images]]>
http://www.mdpi.com/1999-4893/5/1/30
We review the state of the art in DNA microarray image compression and provide original comparisons between standard and microarray-specific compression techniques that validate and expand previous work. First, we describe the most relevant approaches published in the literature and classify them according to the stage of the typical image compression process where each approach makes its contribution, and then we summarize the compression results reported for these microarray-specific image compression schemes. In a set of experiments conducted for this paper, we obtain new results for several popular image coding techniques that include the most recent coding standards. Prediction-based schemes CALIC and JPEG-LS are the best-performing standard compressors, but are improved upon by the best microarray-specific technique, Battiato’s CNN-based scheme.Algorithms2012-02-1451Article10.3390/a501003030491999-48932012-02-14doi: 10.3390/a5010030Miguel Hernández-CabroneroIan BlanesMichael W. MarcellinJoan Serra-Sagristà<![CDATA[Algorithms, Vol. 5, Pages 18-29: How to Solve the Torus Puzzle]]>
http://www.mdpi.com/1999-4893/5/1/18
In this paper, we consider the following sliding puzzle called torus puzzle. In an m by n board, there are mn pieces numbered from 1 to mn. Initially, the pieces are placed in ascending order. Then they are scrambled by rotating the rows and columns without the player’s knowledge. The objective of the torus puzzle is to rearrange the pieces in ascending order by rotating the rows and columns. We provide a solution to this puzzle. In addition, we provide lower and upper bounds on the number of steps for solving the puzzle. Moreover, we consider a variant of the torus puzzle in which each piece is colored either black or white, and we present a hardness result for solving it.Algorithms2012-01-1351Article10.3390/a501001818291999-48932012-01-13doi: 10.3390/a5010018Kazuyuki AmanoYuta KojimaToshiya KurabayashiKeita KuriharaMasahiro NakamuraAyaka OmiToshiyuki TanakaKoichi Yamazaki<![CDATA[Algorithms, Vol. 5, Pages 1-17: Compression-Based Tools for Navigation with an Image Database]]>
http://www.mdpi.com/1999-4893/5/1/1
We present tools that can be used within a larger system referred to as a passive assistant. The system receives information from a mobile device, as well as information from an image database such as Google Street View, and employs image processing to provide useful information about a local urban environment to a user who is visually impaired. The first stage acquires and computes accurate location information, the second stage performs texture and color analysis of a scene, and the third stage provides specific object recognition and navigation information. These second and third stages rely on compression-based tools (dimensionality reduction, vector quantization, and coding) that are enhanced by knowledge of (approximate) location of objects.Algorithms2012-01-1051Article10.3390/a50100011171999-48932012-01-10doi: 10.3390/a5010001Antonella Di LilloAjay DaptardarKevin ThomasJames A. StorerGiovanni Motta<![CDATA[Algorithms, Vol. 4, Pages 307-333: A Catalog of Self-Affine Hierarchical Entropy Functions]]>
http://www.mdpi.com/1999-4893/4/4/307
For fixed k ≥ 2 and fixed data alphabet of cardinality m, the hierarchical type class of a data string of length n = k^j for some j ≥ 1 is formed by permuting the string in all possible ways under permutations arising from the isomorphisms of the unique finite rooted tree of depth j which has n leaves and k children for each non-leaf vertex. Suppose the data strings in a hierarchical type class are losslessly encoded via binary codewords of minimal length. A hierarchical entropy function is a function on the set of m-dimensional probability distributions which describes the asymptotic compression rate performance of this lossless encoding scheme as the data length n is allowed to grow without bound. We determine infinitely many hierarchical entropy functions which are each self-affine. For each such function, an explicit iterated function system is found such that the graph of the function is the attractor of the system.Algorithms2011-11-0144Article10.3390/a40403073073331999-48932011-11-01doi: 10.3390/a4040307John Kieffer<![CDATA[Algorithms, Vol. 4, Pages 285-306: An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms]]>
http://www.mdpi.com/1999-4893/4/4/285
We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer–Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we develop an algorithm that efficiently computes the distribution of a pattern matching algorithm’s running time cost (such as the number of text character accesses) for any given pattern in a random text model. Text models range from simple uniform models to higher-order Markov models or hidden Markov models (HMMs). Furthermore, we provide an algorithm to compute the exact distribution of differences in running time cost of two pattern matching algorithms. Methodologically, we use extensions of finite automata which we call deterministic arithmetic automata (DAAs) and probabilistic arithmetic automata (PAAs) [1]. Given an algorithm, a pattern, and a text model, a PAA is constructed from which the sought distributions can be derived using dynamic programming. To our knowledge, this is the first time that substring- or suffix-based pattern matching algorithms are analyzed exactly by computing the whole distribution of running time cost. Experimentally, we compare Horspool’s algorithm, Backward DAWG Matching, and Backward Oracle Matching on prototypical patterns of short length and provide statistics on the size of minimal DAAs for these computations.Algorithms2011-10-3144Article10.3390/a40402852853061999-48932011-10-31doi: 10.3390/a4040285Tobias MarschallSven Rahmann<![CDATA[Algorithms, Vol. 4, Pages 262-284: The Smallest Grammar Problem as Constituents Choice and Minimal Grammar Parsing]]>
http://www.mdpi.com/1999-4893/4/4/262
The smallest grammar problem, namely finding a smallest context-free grammar that generates exactly one sequence, is of practical and theoretical importance in fields such as Kolmogorov complexity, data compression and pattern discovery. We propose a new perspective on this problem by splitting it into two tasks: (1) choosing which words will be the constituents of the grammar and (2) searching for the smallest grammar given this set of constituents. We show how to solve the second task in polynomial time by parsing longer constituents with smaller ones. We propose new algorithms based on classical practical algorithms that use this optimization to find small grammars. Our algorithms consistently find smaller grammars on a classical benchmark, reducing the size by 10% in some cases. Moreover, our formulation allows us to define interesting bounds on the number of small grammars and to empirically compare different grammars of small size.Algorithms2011-10-2644Article10.3390/a40402622622841999-48932011-10-26doi: 10.3390/a4040262Rafael CarrascosaFrançois CosteMatthias GalléGabriel Infante-Lopez<![CDATA[Algorithms, Vol. 4, Pages 239-261: Radio Frequency Interference Detection and Mitigation Algorithms Based on Spectrogram Analysis]]>
http://www.mdpi.com/1999-4893/4/4/239
Radio Frequency Interference (RFI) detection and mitigation algorithms based on a signal’s spectrogram (frequency and time domain representation) are presented. The radiometric signal’s spectrogram is treated as an image, and therefore image processing techniques are applied to detect and mitigate RFI by two-dimensional filtering. A series of Monte-Carlo simulations have been performed to evaluate the performance of a simple thresholding algorithm and a modified two-dimensional Wiener filter.Algorithms2011-10-2544Article10.3390/a40402392392611999-48932011-10-25doi: 10.3390/a4040239Jose Miguel TarongiAdriano Camps<![CDATA[Algorithms, Vol. 4, Pages 223-238: Applying Length-Dependent Stochastic Context-Free Grammars to RNA Secondary Structure Prediction]]>
http://www.mdpi.com/1999-4893/4/4/223
To capture effects from co-transcriptional folding, we extend stochastic context-free grammars such that the probability of applying a rule can depend on the length of the subword that is eventually generated from the symbols introduced by the rule, and we show that existing algorithms for training and for determining the most probable parse tree can easily be adapted to the extended model without losses in performance. Furthermore, we show that the extended model is suited to improve the quality of predictions of RNA secondary structures. The extended model may also be applied to other fields where stochastic context-free grammars are used, such as natural language processing. Additionally, some interesting questions in the field of formal languages arise from it.Algorithms2011-10-2144Article10.3390/a40402232232381999-48932011-10-21doi: 10.3390/a4040223Frank WeinbergMarkus E. Nebel<![CDATA[Algorithms, Vol. 4, Pages 200-222: Approximating Frequent Items in Asynchronous Data Stream over a Sliding Window]]>
http://www.mdpi.com/1999-4893/4/3/200
In an asynchronous data stream, the data items may be out of order with respect to their original timestamps. This paper studies the space complexity required by a data structure to maintain such a data stream so that it can approximate the set of frequent items over a sliding time window with sufficient accuracy. Prior to our work, the best solution was given by Cormode et al. [1], who gave an O(1/ε log W log(εB/log W) min{log W, 1/ε} log |U|)-space data structure that can approximate the frequent items within an ε error bound, where W and B are parameters of the sliding window, and U is the set of all possible item names. We give a more space-efficient data structure that only requires O(1/ε log W log(εB/log W) log log W) space.Algorithms2011-09-2243Article10.3390/a40302002002221999-48932011-09-22doi: 10.3390/a4030200Hing-Fung TingLap-Kei LeeHo-Leung ChanTak-Wah Lam<![CDATA[Algorithms, Vol. 4, Pages 183-199: Lempel–Ziv Data Compression on Parallel and Distributed Systems]]>
http://www.mdpi.com/1999-4893/4/3/183
We present a survey of results concerning Lempel–Ziv data compression on parallel and distributed systems, starting from the theoretical approach to parallel time complexity to conclude with the practical goal of designing distributed algorithms with low communication cost. Storer’s extension for image compression is also discussed.Algorithms2011-09-1443Article10.3390/a40301831831991999-48932011-09-14doi: 10.3390/a4030183Sergio De Agostino