Linear time universal coding and time reversal of tree sources via FSM closure

Tree models are efficient parametrizations of finite-memory processes, offering potentially significant model cost savings. The information theory literature has focused mostly on redundancy aspects of the universal estimation and coding of these models. In this paper, we investigate representations...
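As a rough illustration of the "model cost savings" mentioned above, the following minimal sketch contrasts a small binary tree model with a full Markov model of the same maximal depth. The context set `CONTEXTS` and the helper names `select_context` and `full_markov_states` are hypothetical choices made for this example and are not taken from the paper.

```python
# Hypothetical binary tree model. Each context is written with the most
# recent symbol first. The set is prefix-free and complete, so every
# sufficiently long past matches exactly one context.
CONTEXTS = {"0", "10", "11"}


def select_context(past, contexts=CONTEXTS):
    """Return the context (state) selected by the most recent symbols of `past`.

    `past` is the sequence seen so far, oldest symbol first.
    Returns None if the past is too short to determine a context.
    """
    reversed_past = past[::-1]  # most recent symbol first
    for depth in range(1, len(reversed_past) + 1):
        candidate = reversed_past[:depth]
        if candidate in contexts:
            return candidate
    return None


def full_markov_states(order, alphabet_size=2):
    """Number of states of a full Markov model of the given order."""
    return alphabet_size ** order


if __name__ == "__main__":
    for past in ["0110", "1101", "0011", "1"]:
        print(past, "->", select_context(past))
    print("tree model states:", len(CONTEXTS))
    print("full order-2 Markov states:", full_markov_states(2))
```

In this toy case the tree model needs three conditional distributions where the full order-2 Markov model needs four; the gap can grow considerably for deeper, unbalanced trees.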

Detailed description

Bibliographic details
Published in: IEEE transactions on information theory 2004-07, Vol.50 (7), p.1442-1468
Main authors: Martin, A., Seroussi, G., Weinberger, M.J.
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 1468
container_issue 7
container_start_page 1442
container_title IEEE transactions on information theory
container_volume 50
creator Martin, A.
Seroussi, G.
Weinberger, M.J.
description Tree models are efficient parametrizations of finite-memory processes, offering potentially significant model cost savings. The information theory literature has focused mostly on redundancy aspects of the universal estimation and coding of these models. In this paper, we investigate representations and supporting data structures for finite-memory processes, as well as the major impact these structures have on the universal algorithms in which they are used. We first generalize the class of tree models, and then define and investigate the properties of the finite-state machine (FSM) closure of a tree, which is the smallest FSM that generates all the processes generated by the tree. The interaction between FSM closures, generalized context trees (GCTs), and classical data structures such as compact suffix trees brings together the information-theoretic and the computational aspects, leading to the first algorithm for linear time encoding/decoding of a lossless twice-universal code in the class of tree models. The implemented code is a two-pass version of Context. The corresponding optimal context selection rule and context transitions use tools similar to those employed in efficient implementation of the popular Burrows-Wheeler transform (BWT), yielding similar computational complexities. We also present a reversible transform that displays the same "context deinterleaving" feature as the BWT but is naturally based on an optimal context tree. FSM closures are also applied to an investigation of the effect of time reversal on tree models, motivated in part by the following question: When compressing a data sequence using a universal scheme in the class of tree models, can it make a difference whether we read the sequence from left to right or from right to left? 
Given a tree model of a process, we show constructively that the number of states in the tree model corresponding to the reversed process might be, in the extreme case, quadratic in the number of states of the original tree. This result answers the above motivating question in the affirmative.
doi_str_mv 10.1109/TIT.2004.830763
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_195894296</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1306544</ieee_id><sourcerecordid>764677331</sourcerecordid><originalsourceid>FETCH-LOGICAL-c349t-c1faacdd02d9224ae2201c9e385110fde67d0412812ebf2c0653af62787f03303</originalsourceid><addsrcrecordid>eNp9kU1LAzEQhoMoWKtnD16CBz1tm8_d5CjFaqHqwXoOMTuRlO1uTboF_70pKwgePA3DPO_Ay4PQJSUTSomerharCSNETBQnVcmP0IhKWRW6lOIYjQihqtBCqFN0ltI6r0JSNkLPy9CCjXgXNoD7NuwhJttg19Wh_cC2rYdLhJ9D5_EuAuDU9dFBwvtg8fz1CbumS32Ec3TibZPg4meO0dv8fjV7LJYvD4vZ3bJwXOhd4ai31tU1YbVmTFhgjFCngSuZu_gayqomgjJFGbx75kgpufUlq1TlCeeEj9Ht8Hcbu88e0s5sQnLQNLaFrk9G6ZIJTgXP5M2_JFO8IkrLDF7_Ade5Y5tbGKql0oLpMkPTAXKxSymCN9sYNjZ-GUrMQYPJGsxBgxk05MTVkAgA8EvzXEkI_g2tboHQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>195894296</pqid></control><display><type>article</type><title>Linear time universal coding and time reversal of tree sources via FSM closure</title><source>IEEE Electronic Library (IEL)</source><creator>Martin, A. ; Seroussi, G. ; Weinberger, M.J.</creator><creatorcontrib>Martin, A. ; Seroussi, G. ; Weinberger, M.J.</creatorcontrib><description>Tree models are efficient parametrizations of finite-memory processes, offering potentially significant model cost savings. The information theory literature has focused mostly on redundancy aspects of the universal estimation and coding of these models. In this paper, we investigate representations and supporting data structures for finite-memory processes, as well as the major impact these structures have on the universal algorithms in which they are used. We first generalize the class of tree models, and then define and investigate the properties of the finite-state machine (FSM) closure of a tree, which is the smallest FSM that generates all the processes generated by the tree. 
The interaction between FSM closures, generalized context trees (GCTs), and classical data structures such as compact suffix trees brings together the information-theoretic and the computational aspects, leading to the first algorithm for linear time encoding/decoding of a lossless twice-universal code in the class of tree models. The implemented code is a two-pass version of Context. The corresponding optimal context selection rule and context transitions use tools similar to those employed in efficient implementation of the popular Burrows-Wheeler transform (BWT), yielding similar computational complexities. We also present a reversible transform that displays the same "context deinterleaving" feature as the BWT but is naturally based on an optimal context tree. FSM closures are also applied to an investigation of the effect of time reversal on tree models, motivated in part by the following question: When compressing a data sequence using a universal scheme in the class of tree models, can it make a difference whether we read the sequence from left to right or from right to left? Given a tree model of a process, we show constructively that the number of states in the tree model corresponding to the reversed process might be, in the extreme case, quadratic in the number of states of the original tree. 
This result answers the above motivating question in the affirmative.</description><identifier>ISSN: 0018-9448</identifier><identifier>EISSN: 1557-9654</identifier><identifier>DOI: 10.1109/TIT.2004.830763</identifier><identifier>CODEN: IETTAW</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Codes ; Coding ; Computation ; Computational complexity ; Context modeling ; Costs ; Data structures ; Decoding ; Displays ; Encoding ; Information ; Information theory ; Laboratories ; Mathematical models ; Memory ; Optimization ; Theory ; Tree data structures ; Trees</subject><ispartof>IEEE transactions on information theory, 2004-07, Vol.50 (7), p.1442-1468</ispartof><rights>Copyright Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jul 2004</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c349t-c1faacdd02d9224ae2201c9e385110fde67d0412812ebf2c0653af62787f03303</citedby><cites>FETCH-LOGICAL-c349t-c1faacdd02d9224ae2201c9e385110fde67d0412812ebf2c0653af62787f03303</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1306544$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1306544$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Martin, A.</creatorcontrib><creatorcontrib>Seroussi, G.</creatorcontrib><creatorcontrib>Weinberger, M.J.</creatorcontrib><title>Linear time universal coding and time reversal of tree sources via FSM closure</title><title>IEEE transactions on information theory</title><addtitle>TIT</addtitle><description>Tree models are efficient parametrizations of finite-memory processes, offering potentially significant model cost 
savings. The information theory literature has focused mostly on redundancy aspects of the universal estimation and coding of these models. In this paper, we investigate representations and supporting data structures for finite-memory processes, as well as the major impact these structures have on the universal algorithms in which they are used. We first generalize the class of tree models, and then define and investigate the properties of the finite-state machine (FSM) closure of a tree, which is the smallest FSM that generates all the processes generated by the tree. The interaction between FSM closures, generalized context trees (GCTs), and classical data structures such as compact suffix trees brings together the information-theoretic and the computational aspects, leading to the first algorithm for linear time encoding/decoding of a lossless twice-universal code in the class of tree models. The implemented code is a two-pass version of Context. The corresponding optimal context selection rule and context transitions use tools similar to those employed in efficient implementation of the popular Burrows-Wheeler transform (BWT), yielding similar computational complexities. We also present a reversible transform that displays the same "context deinterleaving" feature as the BWT but is naturally based on an optimal context tree. FSM closures are also applied to an investigation of the effect of time reversal on tree models, motivated in part by the following question: When compressing a data sequence using a universal scheme in the class of tree models, can it make a difference whether we read the sequence from left to right or from right to left? Given a tree model of a process, we show constructively that the number of states in the tree model corresponding to the reversed process might be, in the extreme case, quadratic in the number of states of the original tree. 
This result answers the above motivating question in the affirmative.</description><subject>Algorithms</subject><subject>Codes</subject><subject>Coding</subject><subject>Computation</subject><subject>Computational complexity</subject><subject>Context modeling</subject><subject>Costs</subject><subject>Data structures</subject><subject>Decoding</subject><subject>Displays</subject><subject>Encoding</subject><subject>Information</subject><subject>Information theory</subject><subject>Laboratories</subject><subject>Mathematical models</subject><subject>Memory</subject><subject>Optimization</subject><subject>Theory</subject><subject>Tree data structures</subject><subject>Trees</subject><issn>0018-9448</issn><issn>1557-9654</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNp9kU1LAzEQhoMoWKtnD16CBz1tm8_d5CjFaqHqwXoOMTuRlO1uTboF_70pKwgePA3DPO_Ay4PQJSUTSomerharCSNETBQnVcmP0IhKWRW6lOIYjQihqtBCqFN0ltI6r0JSNkLPy9CCjXgXNoD7NuwhJttg19Wh_cC2rYdLhJ9D5_EuAuDU9dFBwvtg8fz1CbumS32Ec3TibZPg4meO0dv8fjV7LJYvD4vZ3bJwXOhd4ai31tU1YbVmTFhgjFCngSuZu_gayqomgjJFGbx75kgpufUlq1TlCeeEj9Ht8Hcbu88e0s5sQnLQNLaFrk9G6ZIJTgXP5M2_JFO8IkrLDF7_Ade5Y5tbGKql0oLpMkPTAXKxSymCN9sYNjZ-GUrMQYPJGsxBgxk05MTVkAgA8EvzXEkI_g2tboHQ</recordid><startdate>20040701</startdate><enddate>20040701</enddate><creator>Martin, A.</creator><creator>Seroussi, G.</creator><creator>Weinberger, M.J.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20040701</creationdate><title>Linear time universal coding and time reversal of tree sources via FSM closure</title><author>Martin, A. ; Seroussi, G. 
; Weinberger, M.J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c349t-c1faacdd02d9224ae2201c9e385110fde67d0412812ebf2c0653af62787f03303</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Algorithms</topic><topic>Codes</topic><topic>Coding</topic><topic>Computation</topic><topic>Computational complexity</topic><topic>Context modeling</topic><topic>Costs</topic><topic>Data structures</topic><topic>Decoding</topic><topic>Displays</topic><topic>Encoding</topic><topic>Information</topic><topic>Information theory</topic><topic>Laboratories</topic><topic>Mathematical models</topic><topic>Memory</topic><topic>Optimization</topic><topic>Theory</topic><topic>Tree data structures</topic><topic>Trees</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Martin, A.</creatorcontrib><creatorcontrib>Seroussi, G.</creatorcontrib><creatorcontrib>Weinberger, M.J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Martin, 
A.</au><au>Seroussi, G.</au><au>Weinberger, M.J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Linear time universal coding and time reversal of tree sources via FSM closure</atitle><jtitle>IEEE transactions on information theory</jtitle><stitle>TIT</stitle><date>2004-07-01</date><risdate>2004</risdate><volume>50</volume><issue>7</issue><spage>1442</spage><epage>1468</epage><pages>1442-1468</pages><issn>0018-9448</issn><eissn>1557-9654</eissn><coden>IETTAW</coden><abstract>Tree models are efficient parametrizations of finite-memory processes, offering potentially significant model cost savings. The information theory literature has focused mostly on redundancy aspects of the universal estimation and coding of these models. In this paper, we investigate representations and supporting data structures for finite-memory processes, as well as the major impact these structures have on the universal algorithms in which they are used. We first generalize the class of tree models, and then define and investigate the properties of the finite-state machine (FSM) closure of a tree, which is the smallest FSM that generates all the processes generated by the tree. The interaction between FSM closures, generalized context trees (GCTs), and classical data structures such as compact suffix trees brings together the information-theoretic and the computational aspects, leading to the first algorithm for linear time encoding/decoding of a lossless twice-universal code in the class of tree models. The implemented code is a two-pass version of Context. The corresponding optimal context selection rule and context transitions use tools similar to those employed in efficient implementation of the popular Burrows-Wheeler transform (BWT), yielding similar computational complexities. We also present a reversible transform that displays the same "context deinterleaving" feature as the BWT but is naturally based on an optimal context tree. 
FSM closures are also applied to an investigation of the effect of time reversal on tree models, motivated in part by the following question: When compressing a data sequence using a universal scheme in the class of tree models, can it make a difference whether we read the sequence from left to right or from right to left? Given a tree model of a process, we show constructively that the number of states in the tree model corresponding to the reversed process might be, in the extreme case, quadratic in the number of states of the original tree. This result answers the above motivating question in the affirmative.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TIT.2004.830763</doi><tpages>27</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9448
ispartof IEEE transactions on information theory, 2004-07, Vol.50 (7), p.1442-1468
issn 0018-9448
1557-9654
language eng
recordid cdi_proquest_journals_195894296
source IEEE Electronic Library (IEL)
subjects Algorithms
Codes
Coding
Computation
Computational complexity
Context modeling
Costs
Data structures
Decoding
Displays
Encoding
Information
Information theory
Laboratories
Mathematical models
Memory
Optimization
Theory
Tree data structures
Trees
title Linear time universal coding and time reversal of tree sources via FSM closure
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T14%3A39%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Linear%20time%20universal%20coding%20and%20time%20reversal%20of%20tree%20sources%20via%20FSM%20closure&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Martin,%20A.&rft.date=2004-07-01&rft.volume=50&rft.issue=7&rft.spage=1442&rft.epage=1468&rft.pages=1442-1468&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/TIT.2004.830763&rft_dat=%3Cproquest_RIE%3E764677331%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=195894296&rft_id=info:pmid/&rft_ieee_id=1306544&rfr_iscdi=true