Channel-wise Feature Decorrelation for Enhanced Learned Image Compression

Emerging Learned Compression (LC) replaces traditional codec modules with Deep Neural Networks (DNNs), which are trained end-to-end for rate-distortion performance. This approach is considered the future of image/video compression, and major efforts have been dedicated to improving its com...

Bibliographic details
Published in:	arXiv.org 2024-03
Main authors:	Pakdaman, Farhad; Gabbouj, Moncef
Format:	Article
Language:	eng
Subjects:	Artificial neural networks; Codec; Complexity; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Multimedia; Image compression; Image enhancement; Video compression
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Pakdaman, Farhad; Gabbouj, Moncef
description	Emerging Learned Compression (LC) replaces traditional codec modules with Deep Neural Networks (DNNs), which are trained end-to-end for rate-distortion performance. This approach is considered the future of image/video compression, and major efforts have been dedicated to improving its compression efficiency. However, most proposed works pursue compression efficiency by employing more complex DNNs, which increases computational complexity. Alternatively, this paper proposes to improve compression by fully exploiting the existing DNN capacity. To do so, the latent representation is guided to learn a richer and more diverse set of features, which corresponds to better reconstruction. A channel-wise feature decorrelation loss is designed and integrated into the LC optimization. Three strategies are proposed and evaluated, which optimize (1) the transformation network, (2) the context model, and (3) both networks. Experimental results on two established LC methods show that the proposed method improves compression, with BD-Rate gains of up to 8.06% and no added complexity. The proposed solution can be applied as a plug-and-play solution to optimize any similar LC method.
doi_str_mv	10.48550/arxiv.2403.10936
format	Article
fullrecord	<record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2403_10936</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2968636293</sourcerecordid><originalsourceid>FETCH-LOGICAL-a956-8d01839c789bf44b1378f23b3b0a13b9b5a2405af803b6e437f840a455c074223</originalsourceid><addsrcrecordid>eNotj71OwzAURi0kJKrSB2AiEnOK42s79ohCSyNVYukeXac3kCpNip3w8_aYlulbjj6dw9hdxpfSKMUf0X-3n0shOSwzbkFfsZkAyFIjhbhhixAOnHOhc6EUzFhZvGPfU5d-tYGSNeE4eUqeqR68pw7HduiTZvDJqo9cTftkS-j7uOUR3ygphuPJUwgRu2XXDXaBFv87Z7v1alds0u3rS1k8bVO0SqdmzzMDts6NdY2ULoPcNAIcOI4ZOOsURnWFjeHgNEnIGyM5SqVqnscCmLP7y-25szr59oj-p_rrrc69kXi4ECc_fEwUxuowTL6PTpWw2mjQwgL8AvqlVuI</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2968636293</pqid></control><display><type>article</type><title>Channel-wise Feature Decorrelation for Enhanced Learned Image Compression</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Pakdaman, Farhad ; Gabbouj, Moncef</creator><creatorcontrib>Pakdaman, Farhad ; Gabbouj, Moncef</creatorcontrib><description>The emerging Learned Compression (LC) replaces the traditional codec modules with Deep Neural Networks (DNN), which are trained end-to-end for rate-distortion performance. This approach is considered as the future of image/video compression, and major efforts have been dedicated to improving its compression efficiency. However, most proposed works target compression efficiency by employing more complex DNNS, which contributes to higher computational complexity. Alternatively, this paper proposes to improve compression by fully exploiting the existing DNN capacity. To do so, the latent features are guided to learn a richer and more diverse set of features, which corresponds to better reconstruction. A channel-wise feature decorrelation loss is designed and is integrated into the LC optimization. 
Three strategies are proposed and evaluated, which optimize (1) the transformation network, (2) the context model, and (3) both networks. Experimental results on two established LC methods show that the proposed method improves the compression with a BD-Rate of up to 8.06%, with no added complexity. The proposed solution can be applied as a plug-and-play solution to optimize any similar LC method.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2403.10936</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Artificial neural networks ; Codec ; Complexity ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Multimedia ; Image compression ; Image enhancement ; Video compression</subject><ispartof>arXiv.org, 2024-03</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,780,881,27902</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.2403.10936$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1109/LSP.2024.3411524$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Pakdaman, Farhad</creatorcontrib><creatorcontrib>Gabbouj, Moncef</creatorcontrib><title>Channel-wise Feature Decorrelation for Enhanced Learned Image Compression</title><title>arXiv.org</title><description>The emerging Learned Compression (LC) replaces the traditional 
codec modules with Deep Neural Networks (DNN), which are trained end-to-end for rate-distortion performance. This approach is considered as the future of image/video compression, and major efforts have been dedicated to improving its compression efficiency. However, most proposed works target compression efficiency by employing more complex DNNS, which contributes to higher computational complexity. Alternatively, this paper proposes to improve compression by fully exploiting the existing DNN capacity. To do so, the latent features are guided to learn a richer and more diverse set of features, which corresponds to better reconstruction. A channel-wise feature decorrelation loss is designed and is integrated into the LC optimization. Three strategies are proposed and evaluated, which optimize (1) the transformation network, (2) the context model, and (3) both networks. Experimental results on two established LC methods show that the proposed method improves the compression with a BD-Rate of up to 8.06%, with no added complexity. 
The proposed solution can be applied as a plug-and-play solution to optimize any similar LC method.</description><subject>Artificial neural networks</subject><subject>Codec</subject><subject>Complexity</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Multimedia</subject><subject>Image compression</subject><subject>Image enhancement</subject><subject>Video compression</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><sourceid>GOX</sourceid><recordid>eNotj71OwzAURi0kJKrSB2AiEnOK42s79ohCSyNVYukeXac3kCpNip3w8_aYlulbjj6dw9hdxpfSKMUf0X-3n0shOSwzbkFfsZkAyFIjhbhhixAOnHOhc6EUzFhZvGPfU5d-tYGSNeE4eUqeqR68pw7HduiTZvDJqo9cTftkS-j7uOUR3ygphuPJUwgRu2XXDXaBFv87Z7v1alds0u3rS1k8bVO0SqdmzzMDts6NdY2ULoPcNAIcOI4ZOOsURnWFjeHgNEnIGyM5SqVqnscCmLP7y-25szr59oj-p_rrrc69kXi4ECc_fEwUxuowTL6PTpWw2mjQwgL8AvqlVuI</recordid><startdate>20240316</startdate><enddate>20240316</enddate><creator>Pakdaman, Farhad</creator><creator>Gabbouj, Moncef</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PKEHL</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240316</creationdate><title>Channel-wise Feature Decorrelation for Enhanced Learned Image Compression</title><author>Pakdaman, Farhad ; Gabbouj, 
Moncef</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a956-8d01839c789bf44b1378f23b3b0a13b9b5a2405af803b6e437f840a455c074223</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Artificial neural networks</topic><topic>Codec</topic><topic>Complexity</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Multimedia</topic><topic>Image compression</topic><topic>Image enhancement</topic><topic>Video compression</topic><toplevel>online_resources</toplevel><creatorcontrib>Pakdaman, Farhad</creatorcontrib><creatorcontrib>Gabbouj, Moncef</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied & Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer 
Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Pakdaman, Farhad</au><au>Gabbouj, Moncef</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Channel-wise Feature Decorrelation for Enhanced Learned Image Compression</atitle><jtitle>arXiv.org</jtitle><date>2024-03-16</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>The emerging Learned Compression (LC) replaces the traditional codec modules with Deep Neural Networks (DNN), which are trained end-to-end for rate-distortion performance. This approach is considered as the future of image/video compression, and major efforts have been dedicated to improving its compression efficiency. However, most proposed works target compression efficiency by employing more complex DNNS, which contributes to higher computational complexity. Alternatively, this paper proposes to improve compression by fully exploiting the existing DNN capacity. To do so, the latent features are guided to learn a richer and more diverse set of features, which corresponds to better reconstruction. A channel-wise feature decorrelation loss is designed and is integrated into the LC optimization. Three strategies are proposed and evaluated, which optimize (1) the transformation network, (2) the context model, and (3) both networks. Experimental results on two established LC methods show that the proposed method improves the compression with a BD-Rate of up to 8.06%, with no added complexity. The proposed solution can be applied as a plug-and-play solution to optimize any similar LC method.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2403.10936</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-03
issn	2331-8422
language	eng
recordid	cdi_arxiv_primary_2403_10936
source	arXiv.org; Free E- Journals
subjects	Artificial neural networks; Codec; Complexity; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Multimedia; Image compression; Image enhancement; Video compression
title	Channel-wise Feature Decorrelation for Enhanced Learned Image Compression
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T22%3A07%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Channel-wise%20Feature%20Decorrelation%20for%20Enhanced%20Learned%20Image%20Compression&rft.jtitle=arXiv.org&rft.au=Pakdaman,%20Farhad&rft.date=2024-03-16&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2403.10936&rft_dat=%3Cproquest_arxiv%3E2968636293%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2968636293&rft_id=info:pmid/&rfr_iscdi=true
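
The description field above mentions a channel-wise feature decorrelation loss that is added to the rate-distortion objective. The record does not give the paper's formula, so the following is only a minimal sketch of one common way such a penalty can be written: the mean squared off-diagonal entry of the Pearson correlation matrix between latent channels. The function names, the layout of `latent` (a list of channels, each a flat list of values), and the weights `lam` and `beta` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def channel_correlation(latent):
    """Pearson correlation matrix between channels.

    `latent` is a hypothetical layout: a list of C channels, each a flat
    list of N values (e.g. a flattened H x W feature map).
    """
    n = len(latent[0])
    centered = []
    for channel in latent:
        mean = sum(channel) / n
        centered.append([v - mean for v in channel])
    norms = [math.sqrt(sum(v * v for v in ch)) for ch in centered]

    c = len(latent)
    corr = [[0.0] * c for _ in range(c)]
    for i in range(c):
        for j in range(c):
            denom = norms[i] * norms[j]
            if denom > 0.0:  # constant channels are treated as uncorrelated
                dot = sum(a * b for a, b in zip(centered[i], centered[j]))
                corr[i][j] = dot / denom
    return corr


def decorrelation_loss(latent):
    """Mean squared off-diagonal correlation; 0.0 when channels are uncorrelated."""
    corr = channel_correlation(latent)
    c = len(corr)
    if c < 2:
        return 0.0
    off_diag = [corr[i][j] ** 2 for i in range(c) for j in range(c) if i != j]
    return sum(off_diag) / len(off_diag)


def total_loss(rate, distortion, latent, lam=0.01, beta=0.1):
    """Assumed combined objective: rate + lam * distortion + beta * decorrelation."""
    return rate + lam * distortion + beta * decorrelation_loss(latent)
```

Because a term like this only changes the training objective, it would add no parameters or operations to the network at inference time, which is consistent with the "no added complexity" claim in the description.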