Approximate String Matching Using Compressed Suffix Arrays



Bibliographic Details
Main Authors:	Huynh, Trinh N. D.; Hon, Wing-Kai; Lam, Tak-Wah; Sung, Wing-Kin
Format:	Book chapter
Language:	English
Subjects:	Algorithmics. Computability. Computer arithmetics; Applied sciences; Approximate Match; Computer science control theory systems; Exact sciences and technology; Pattern Query; Query Time; String Match; Suffix Array; Theoretical computing
Online Access:	Full text

Description
Summary:	Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (respectively, k-mismatch) problem is to find all occurrences of P in T that have edit distance (respectively, Hamming distance) at most k from P. In this paper we investigate a well-studied case in which k=1 and T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster [16-19]. This paper gives a solution using an O(n)-bit indexing data structure with O(m log^2 n) query time. To the best of our knowledge, this is the first result that requires only linear indexing space. The results can be extended to the k-difference problem with k≥1.
ISSN:	0302-9743 1611-3349
DOI:	10.1007/978-3-540-27801-6_33
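
Note: the two problems defined in the summary can be made concrete with a brute-force baseline. The Python sketch below is only an illustration of the problem statements, not the indexing method of the paper: it scans T directly, builds no index, and does not achieve the stated query time. The function names `k_mismatch` and `k_difference` are hypothetical. `k_mismatch` reports 0-based start positions, while `k_difference` uses the standard dynamic-programming recurrence for approximate matching and reports 1-based end positions of matching substrings.

```python
def k_mismatch(text, pattern, k=1):
    """Start positions i where text[i:i+m] is within Hamming distance k of pattern."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        # Count positions at which the aligned window and the pattern differ.
        mismatches = sum(1 for a, b in zip(text[i:i + m], pattern) if a != b)
        if mismatches <= k:
            hits.append(i)
    return hits


def k_difference(text, pattern, k=1):
    """End positions j (1-based) where some substring of text ending at j
    is within edit distance k of pattern."""
    m = len(pattern)
    # Column for the empty text prefix: matching pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        cur = [0]  # an occurrence may start at any text position, so row 0 is free
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            cur.append(min(prev[i - 1] + cost,  # match or substitution
                           prev[i] + 1,         # extra text character
                           cur[i - 1] + 1))     # missing pattern character
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


print(k_mismatch("abcabd", "abd", 1))    # [0, 3]
print(k_difference("abcabd", "abd", 1))  # [2, 3, 5, 6]
```

Both functions take O(nm) time per query, which is the cost that preprocessing T into an index is meant to avoid.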