Tree structure compression with RePair
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | In this work we introduce a new linear-time compression algorithm, called
"Re-pair for Trees", which compresses ranked ordered trees using linear
straight-line context-free tree grammars. Such grammars generalize
straight-line context-free string grammars and allow basic tree operations,
like traversal along edges, to be executed without prior decompression. Our
algorithm can be considered as a generalization of the "Re-pair" algorithm
developed by N. Jesper Larsson and Alistair Moffat in 2000. The latter
algorithm is a dictionary-based compression algorithm for strings. We also
introduce a succinct coding which is specialized in further compressing the
grammars generated by our algorithm. This is accomplished without losing the
ability to directly execute queries on this compressed representation of the
input tree. Finally, we compare the grammars and output files generated by a
prototype of the Re-pair for Trees algorithm with those of similar compression
algorithms. The obtained results show that our algorithm outperforms its
competitors in terms of compression ratio, runtime and memory usage. |
---|---|
DOI: | 10.48550/arxiv.1007.5406 |
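
The summary describes the tree algorithm as a generalization of the string "Re-pair" algorithm of Larsson and Moffat. As a rough illustration of that string variant only, here is a minimal Python sketch: it repeatedly replaces the most frequent pair of adjacent symbols with a fresh nonterminal and records the rule. This is not the tree algorithm from the paper, and it is a naive version that rescans the sequence on every round, so it does not achieve linear time. The function names `repair` and `expand` and the encoding of nonterminals as integers are assumptions made for this example.

```python
from collections import Counter


def repair(text):
    """Naive string Re-pair sketch (assumed helper, not the paper's code).

    Returns (sequence, rules): terminals are one-character strings,
    nonterminals are integers, and rules maps each nonterminal to the
    pair of symbols it replaces.
    """
    seq = list(text)
    rules = {}
    next_id = 0
    while True:
        # Count adjacent pairs; overlapping occurrences are counted naively.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so no further rule is worthwhile
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        rules[next_id] = pair
        seq = out
        next_id += 1
    return seq, rules


def expand(symbol, rules):
    """Expand a symbol back into the string it derives."""
    if isinstance(symbol, str):
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


text = "abababab"
seq, rules = repair(text)
restored = "".join(expand(s, rules) for s in seq)
```

For the input `"abababab"` the sketch produces two rules: one for the pair `ab` and one for two copies of that nonterminal. Expanding the result reproduces the input. The resulting rule set is a straight-line grammar for a string; the paper's contribution is the analogous construction for ranked ordered trees.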