Universal Enumerative Coding for Tree Models

Efficient enumerative coding for tree sources is, in general, surprisingly intricate-a simple uniform encoding of type classes, which is asymptotically optimal in expectation for many classical models, such as FSMs, turns out not to be so in this case. We describe an efficiently computable enumerati...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on information theory 2014-03, Vol.60 (3), p.1387-1411
Hauptverfasser: Martin, Alvaro, Seroussi, Gadiel, Weinberger, Marcelo J.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Efficient enumerative coding for tree sources is, in general, surprisingly intricate-a simple uniform encoding of type classes, which is asymptotically optimal in expectation for many classical models, such as FSMs, turns out not to be so in this case. We describe an efficiently computable enumerative code that is universal in the family of tree models in the sense that, for a string emitted by an unknown source whose model is supported on a known tree, the expected normalized code length of the encoding approaches the entropy rate of the source with a convergence rate (K/2)(log n)/n, where K is the number of free parameters of the model family. Based on recent results characterizing type classes of context trees, the code consists of the index of the sequence in the tree type class, and an efficient description of the class itself using a nonuniform encoding of selected string counts. The results are extended to a twice-universal setting, where the tree underlying the source model is unknown.
ISSN:0018-9448
1557-9654
DOI:10.1109/TIT.2013.2295217