[…]-approximation in polynomial time. While this algorithm and its analysis are technically involved, the 30-year-old Greedy Conjecture claims that the trivial and efficient Greedy Algorithm gives a 2-approximation for the Shortest Common Superstring problem (SCS). The Greedy Algorithm repeatedly merges the two strings with the largest overlap into one, until only one string remains. We develop a graph-theoretic framework for studying approximation algorithms for SCS. In this framework, we give a (stronger) counterpart to the Greedy Conjecture: we conjecture that the Greedy Hierarchical Algorithm presented in this paper gives a 2-approximation for SCS.
In the Shortest Superstring problem, we are given a set of strings and ask for a common superstring with the minimum number of characters. The Shortest Superstring problem is NP-hard, and several constant-factor approximation algorithms are known for it. Of particular interest is the GREEDY algorithm, which repeatedly merges two strings of maximum overlap until a single string remains. The GREEDY algorithm, being simpler than other well-performing approximation algorithms for this problem, has attracted attention since the 1980s and is commonly used in practical applications. Tarhio and Ukkonen (TCS 1988) conjectured that GREEDY gives a 2-approximation. In a seminal work, Blum, Jiang, Li, Tromp, and Yannakakis (STOC 1991) proved that the superstring computed by GREEDY is a 4-approximation, and this upper bound was improved to 3.5 by Kaplan and Shafrir (IPL 2005). We show that the approximation guarantee of GREEDY is at most $(13 \sqrt{\ldots}$ […].
A superstring of a set of words is a string that contains each input word as a substring. Given such a set, the Shortest Superstring Problem (SSP) asks for a superstring of minimum length. SSP is an important theoretical problem related to the Asymmetric Travelling Salesman Problem, and it also has practical applications in data compression and in bioinformatics. Indeed, it models the question of assembling a genome from a set of sequencing reads. Unfortunately, SSP is known to be NP-hard even on a binary alphabet, and it is also hard to approximate with respect to the superstring length or to the compression achieved by the superstring. Even the variant in which all words share the same length, called […]-SSP, is NP-hard whenever […]. Numerous involved approximation algorithms achieve an approximation ratio above […] for the superstring, but remain difficult to implement in practice. In contrast, the greedy conjecture asked in 1988 whether a simple greedy algorithm achieves a ratio of 2 for SSP. Here, we present a novel approach to bound the superstring approximation ratio with the compression ratio, which, when applied to the greedy algorithm, shows a […] approximation ratio for […]-SSP, and also that greedy achieves ratios smaller than […]. This leads to a new version of the greedy conjecture.
The shortest superstring problem (SSP) is a combinatorial optimization problem that has attracted the interest of many researchers due to its applications in computational molecular biology and in computer science. The SSP is NP-hard, and therefore little effort has gone into developing exact algorithms for it. On the other hand, several approximation and heuristic algorithms have been implemented, indicating the strong effectiveness of greedy strategies for this problem. Variations of these algorithms can be parallelized, providing computational strength in solving real-world instances. Polynomially solvable versions of the problem, obtained under specific restrictions on its parameters, reveal the boundaries between hard and easy cases. The computational bounds on the approximability of the SSP are a realization of its Max-SNP-hardness, but the weakness of the proven bounds reflects the potential strength of the greedy approximation techniques. The strength of the greedy methods for the SSP is further supported by the asymptotic behaviour and the smoothed analysis of the problem in random and real-world instances, respectively. All these issues are presented in this chapter in a concise way, covering the whole relevant literature, revealing the knowledge that is already established, and paving the path for further development in the study of shortest superstrings. The order of the sections highlights the progression from hardness results for the SSP, to efficient algorithms for the problem based on greedy strategies, and to theoretical results that establish the strength of the greedy techniques.
© Springer Science+Business Media New York 2014. All rights reserved.
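The greedy strategy described in these abstracts (repeatedly merge the two strings of maximum overlap until a single string remains) can be illustrated with a minimal Python sketch. This is not code from any of the cited papers: the function names `overlap`, `remove_substrings`, `greedy_superstring` and `compression` are hypothetical, ties are broken by taking the first maximal pair found (an assumption, since the text does not specify a tie-breaking rule), and the compression is assumed to follow the usual definition, total input length minus superstring length. The quadratic pair search is chosen for readability, not efficiency.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def remove_substrings(words):
    """Drop duplicates and any word that occurs inside another word."""
    unique = list(dict.fromkeys(words))
    return [w for w in unique if not any(w != v and w in v for v in unique)]


def greedy_superstring(words):
    """Merge the pair with maximum overlap until one string remains."""
    strings = remove_substrings(words)
    while len(strings) > 1:
        # Assumption: ties are broken by the first maximal ordered pair.
        i, j = max(
            permutations(range(len(strings)), 2),
            key=lambda ij: overlap(strings[ij[0]], strings[ij[1]]),
        )
        k = overlap(strings[i], strings[j])
        merged = strings[i] + strings[j][k:]
        strings = [s for idx, s in enumerate(strings) if idx not in (i, j)]
        strings.append(merged)
    return strings[0] if strings else ""


def compression(words, superstring: str) -> int:
    """Assumed definition: total input length minus superstring length."""
    return sum(len(w) for w in words) - len(superstring)


words = ["abc", "bcd", "cde"]
result = greedy_superstring(words)
print(result)                      # abcde
print(compression(words, result))  # 4
```

On this small input the greedy result happens to be optimal; the conjectures and bounds discussed above concern how far the greedy output can be from optimal in the worst case.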