Online suffix trees with counts

Bibliographic details
Main authors: Nuallain, B.O., de Rooij, S.
Format: Conference proceedings
Language: eng
Description
Summary: This paper extends Ukkonen's online suffix tree construction algorithm to support substring frequency queries by adding count fields to the internal nodes of the tree. This has applications in sequential data compression. One major problem is that Ukkonen's online construction algorithm does not maintain explicit end-of-string markers in the tree. The major part of our work concerns quickly determining where the end markers for a particular edge would be, so that frequencies can be obtained correctly. To this end, a complete characterization of all end markers on leaf edges is given. Furthermore, we found that an edge between two internal nodes can contain at most one end marker. Using these results, algorithms are given to update the count fields and to answer frequency queries correctly. All algorithms have been implemented and tested, and were found to be correct in practice.
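To make the idea of a count field concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a naive, uncompressed suffix trie that takes quadratic time and space, in place of Ukkonen's linear-time compressed tree. Because every character gets its own node, the end-marker problem described in the summary does not arise here. The names `Node`, `CountingSuffixTrie`, `append`, and `frequency` are hypothetical and chosen only for illustration.

```python
class Node:
    """One trie node; `count` is the number of occurrences of the
    substring spelled by the path from the root to this node."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0


class CountingSuffixTrie:
    """Naive online suffix trie with counts (illustrative assumption,
    not the compressed suffix tree used in the paper)."""

    def __init__(self):
        self.root = Node()
        # Nodes at which the suffixes of the text read so far end.
        self.active = []

    def append(self, ch):
        """Extend the text by one character, updating all counts."""
        next_active = []
        # Every existing suffix, plus the new empty suffix at the root,
        # is extended by `ch`.
        for node in self.active + [self.root]:
            child = node.children.get(ch)
            if child is None:
                child = Node()
                node.children[ch] = child
            child.count += 1
            next_active.append(child)
        self.active = next_active

    def frequency(self, pattern):
        """Return how often `pattern` occurs in the text read so far.
        The empty pattern is not meaningful here and yields 0."""
        node = self.root
        for ch in pattern:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count


if __name__ == "__main__":
    trie = CountingSuffixTrie()
    for ch in "abab":
        trie.append(ch)
    print(trie.frequency("ab"), trie.frequency("ba"), trie.frequency("c"))
```

In this sketch a frequency query is a plain walk from the root followed by reading one count. In a compressed suffix tree built online, a query can end in the middle of an edge, which is where the positions of the implicit end markers would matter.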
ISSN: 1068-0314, 2375-0359
DOI: 10.1109/DCC.2004.1281531