Online suffix trees with counts
Main authors: | , |
---|---|
Format: | Conference proceeding |
Language: | eng |
Summary: | This paper extends Ukkonen's online suffix tree construction algorithm to support substring frequency queries by adding count fields to the internal nodes of the tree. This has applications in sequential data compression. One major problem is that Ukkonen's online construction algorithm does not maintain explicit end-of-string markers in the tree. The major part of our work concerns quickly determining where the end markers for a particular edge would be, so that frequencies can be obtained correctly. To this end, a complete characterization of all end markers on leaf edges is given. Furthermore, we found that an edge between two internal nodes can contain at most one end marker. Using these results, algorithms are given to update the count fields and to answer frequency queries correctly. All algorithms have been implemented and tested for correctness in practice. |
---|---|
ISSN: | 1068-0314 2375-0359 |
DOI: | 10.1109/DCC.2004.1281531 |
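
To make the notion of a substring frequency query concrete, the following is a minimal sketch of an online structure that keeps a count in every node. It is deliberately not Ukkonen's algorithm and not the method of the paper: it is an uncompressed suffix trie that needs quadratic space in the worst case, and the names `CountingSuffixTrie`, `append`, and `frequency` are hypothetical, chosen only for this illustration. It only shows what the count fields are meant to answer.

```python
class Node:
    """Trie node; `count` is the number of occurrences of the substring it spells."""

    def __init__(self):
        self.count = 0
        self.children = {}


class CountingSuffixTrie:
    """Naive online suffix trie with counts (illustrative assumption, not the paper's method)."""

    def __init__(self):
        self.root = Node()
        # Nodes spelling every non-empty suffix of the text read so far.
        self.active = []

    def append(self, ch):
        """Extend the text by one character and update all affected counts."""
        next_active = []
        # Every suffix of the old text (including the empty one at the root)
        # is extended by `ch`, which yields one new occurrence per suffix.
        for node in [self.root] + self.active:
            child = node.children.get(ch)
            if child is None:
                child = Node()
                node.children[ch] = child
            child.count += 1
            next_active.append(child)
        self.active = next_active

    def frequency(self, pattern):
        """Return how often a non-empty `pattern` occurs in the text read so far."""
        node = self.root
        for ch in pattern:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count


trie = CountingSuffixTrie()
for ch in "abracadabra":
    trie.append(ch)
print(trie.frequency("abra"))  # 2
```

Because this trie stores every suffix explicitly, the counts never depend on implicit end markers. The difficulty the abstract describes arises only in the compressed, linear-space suffix tree that Ukkonen's algorithm builds.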