Universal lossless compression via multilevel pattern matching

A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data...

Bibliographic details
Published in: IEEE transactions on information theory 2000-07, Vol.46 (4), p.1227-1245
Main authors: Kieffer, J.C., En-Hui Yang, Nelson, G.J., Cosman, P.
Format: Article
Language: eng
container_end_page 1245
container_issue 4
container_start_page 1227
container_title IEEE transactions on information theory
container_volume 46
creator Kieffer, J.C.
En-Hui Yang
Nelson, G.J.
Cosman, P.
description A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. An O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported.
doi_str_mv 10.1109/18.850665
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_914642950</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>850665</ieee_id><sourcerecordid>914642950</sourcerecordid><originalsourceid>FETCH-LOGICAL-c402t-e5e5886771636ba5284bfc49a5c1864ff13e3d96940463d60ceab27d845e3ad13</originalsourceid><addsrcrecordid>eNqF0U1LAzEQBuAgCtbqwaunxYPiYWtmN8kmF0GKX1DwYs8hzc7qluyHybbgvzdliwcPesqEPMyQdwg5BzoDoOoW5ExyKgQ_IBPgvEiV4OyQTCgFmSrG5DE5CWEdr4xDNiF3y7beog_GJa4LwWEIie2a3sei7tpkW5uk2bihdrhFl_RmGNC3SWMG-1G376fkqDIu4Nn-nJLl48Pb_DldvD69zO8XqWU0G1LkyKUURQEiFyvDM8lWlWXKcAtSsKqCHPNSCcUoE3kpqEWzyopSMo65KSGfkuuxb--7zw2GQTd1sOicabHbBK2ACZYpTqO8-lPG0UqBgP9hEZPjSkR4-Quuu41v43c1KK6ymOWu282IrI8xeqx07-vG-C8NVO82o0HqcTPRXoy2RsQft3_8Bj_-h00</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>195920011</pqid></control><display><type>article</type><title>Universal lossless compression via multilevel pattern matching</title><source>IEEE Electronic Library (IEL)</source><creator>Kieffer, J.C. ; En-Hui Yang ; Nelson, G.J. ; Cosman, P.</creator><creatorcontrib>Kieffer, J.C. ; En-Hui Yang ; Nelson, G.J. ; Cosman, P.</creatorcontrib><description>A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. 
From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. A O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported.</description><identifier>ISSN: 0018-9448</identifier><identifier>EISSN: 1557-9654</identifier><identifier>DOI: 10.1109/18.850665</identifier><identifier>CODEN: IETTAW</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Complexity ; Compressing ; Data compression ; Encoders ; Lossless ; Matching ; Multilevel ; Strings</subject><ispartof>IEEE transactions on information theory, 2000-07, Vol.46 (4), p.1227-1245</ispartof><rights>Copyright Institute of Electrical and Electronics Engineers, Inc. (IEEE) Jul 2000</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c402t-e5e5886771636ba5284bfc49a5c1864ff13e3d96940463d60ceab27d845e3ad13</citedby><cites>FETCH-LOGICAL-c402t-e5e5886771636ba5284bfc49a5c1864ff13e3d96940463d60ceab27d845e3ad13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/850665$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27915,27916,54749</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/850665$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kieffer, J.C.</creatorcontrib><creatorcontrib>En-Hui Yang</creatorcontrib><creatorcontrib>Nelson, G.J.</creatorcontrib><creatorcontrib>Cosman, P.</creatorcontrib><title>Universal lossless compression 
via multilevel pattern matching</title><title>IEEE transactions on information theory</title><addtitle>TIT</addtitle><description>A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. A O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. 
The results of some MPM code compression experiments are reported.</description><subject>Complexity</subject><subject>Compressing</subject><subject>Data compression</subject><subject>Encoders</subject><subject>Lossless</subject><subject>Matching</subject><subject>Multilevel</subject><subject>Strings</subject><issn>0018-9448</issn><issn>1557-9654</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2000</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNqF0U1LAzEQBuAgCtbqwaunxYPiYWtmN8kmF0GKX1DwYs8hzc7qluyHybbgvzdliwcPesqEPMyQdwg5BzoDoOoW5ExyKgQ_IBPgvEiV4OyQTCgFmSrG5DE5CWEdr4xDNiF3y7beog_GJa4LwWEIie2a3sei7tpkW5uk2bihdrhFl_RmGNC3SWMG-1G376fkqDIu4Nn-nJLl48Pb_DldvD69zO8XqWU0G1LkyKUURQEiFyvDM8lWlWXKcAtSsKqCHPNSCcUoE3kpqEWzyopSMo65KSGfkuuxb--7zw2GQTd1sOicabHbBK2ACZYpTqO8-lPG0UqBgP9hEZPjSkR4-Quuu41v43c1KK6ymOWu282IrI8xeqx07-vG-C8NVO82o0HqcTPRXoy2RsQft3_8Bj_-h00</recordid><startdate>20000701</startdate><enddate>20000701</enddate><creator>Kieffer, J.C.</creator><creator>En-Hui Yang</creator><creator>Nelson, G.J.</creator><creator>Cosman, P.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20000701</creationdate><title>Universal lossless compression via multilevel pattern matching</title><author>Kieffer, J.C. ; En-Hui Yang ; Nelson, G.J. 
; Cosman, P.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c402t-e5e5886771636ba5284bfc49a5c1864ff13e3d96940463d60ceab27d845e3ad13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2000</creationdate><topic>Complexity</topic><topic>Compressing</topic><topic>Data compression</topic><topic>Encoders</topic><topic>Lossless</topic><topic>Matching</topic><topic>Multilevel</topic><topic>Strings</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kieffer, J.C.</creatorcontrib><creatorcontrib>En-Hui Yang</creatorcontrib><creatorcontrib>Nelson, G.J.</creatorcontrib><creatorcontrib>Cosman, P.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kieffer, J.C.</au><au>En-Hui Yang</au><au>Nelson, G.J.</au><au>Cosman, P.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Universal lossless compression via multilevel pattern matching</atitle><jtitle>IEEE transactions on information 
theory</jtitle><stitle>TIT</stitle><date>2000-07-01</date><risdate>2000</risdate><volume>46</volume><issue>4</issue><spage>1227</spage><epage>1245</epage><pages>1227-1245</pages><issn>0018-9448</issn><eissn>1557-9654</eissn><coden>IETTAW</coden><abstract>A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. A O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/18.850665</doi><tpages>19</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9448
ispartof IEEE transactions on information theory, 2000-07, Vol.46 (4), p.1227-1245
issn 0018-9448
1557-9654
language eng
recordid cdi_proquest_miscellaneous_914642950
source IEEE Electronic Library (IEL)
subjects Complexity
Compressing
Data compression
Encoders
Lossless
Matching
Multilevel
Strings
title Universal lossless compression via multilevel pattern matching
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T06%3A07%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Universal%20lossless%20compression%20via%20multilevel%20pattern%20matching&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Kieffer,%20J.C.&rft.date=2000-07-01&rft.volume=46&rft.issue=4&rft.spage=1227&rft.epage=1245&rft.pages=1227-1245&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/18.850665&rft_dat=%3Cproquest_RIE%3E914642950%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=195920011&rft_id=info:pmid/&rft_ieee_id=850665&rfr_iscdi=true
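
The abstract in this record describes the MPM code only in outline, so the following is a minimal Python sketch of the multilevel idea, not the authors' actual algorithm. It assumes the input is a `str` whose length is a power of the shrink factor `r`, matches only aligned blocks, and leaves out the arithmetic encoding of the token strings entirely. The names `mpm_encode`, `mpm_decode`, `NEW`, and `r` are hypothetical and do not come from the paper.

```python
NEW = "NEW"  # hypothetical token: "first occurrence of this pattern at this level"


def _split(block, r):
    """Cut a block into r equal-length children."""
    step = len(block) // r
    return [block[i:i + step] for i in range(0, len(block), step)]


def mpm_encode(data, r=2):
    """Return one token list per level (simplified; no arithmetic coding).

    Assumption: len(data) is a power of r and at least r.
    """
    m = len(data)
    while m > 1 and m % r == 0:
        m //= r
    if len(data) < r or m != 1:
        raise ValueError("length of data must be a power of r")

    levels = []
    blocks = _split(data, r)
    while True:
        tokens, seen, survivors = [], {}, []
        for b in blocks:
            if len(b) == 1:
                tokens.append(b)          # final level: the symbol itself
            elif b in seen:
                tokens.append(seen[b])    # index of the matching earlier pattern
            else:
                seen[b] = len(seen)       # number distinct patterns in order
                tokens.append(NEW)
                survivors.append(b)       # only new patterns go down a level
        levels.append(tokens)
        if not survivors:                 # reached block length one
            break
        blocks = [c for b in survivors for c in _split(b, r)]
    return levels


def mpm_decode(levels, r=2):
    """Rebuild the string by substituting patterns level by level, bottom up."""
    blocks = list(levels[-1])             # final level holds single symbols
    for tokens in reversed(levels[:-1]):
        children = iter(blocks)           # decoded blocks of the level below
        distinct, blocks = [], []
        for t in tokens:
            if t == NEW:
                pattern = "".join(next(children) for _ in range(r))
                distinct.append(pattern)
                blocks.append(pattern)
            else:
                blocks.append(distinct[t])
    return "".join(blocks)
```

For example, `mpm_encode("abababab")` yields three token lists: the two upper levels each record one new pattern followed by a repeat of it, and the final level lists the two symbols `a` and `b`. In this sketch each decoding round fills in all blocks of one level from the level below, which is a loose stand-in for the rounds of parallel substitutions mentioned in the abstract.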