Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics

Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and efficiency to serve for machine vision, and the other is with full...

Detailed description

Bibliographic details
Published in:	IEEE transactions on image processing 2020-01, Vol.PP, p.1-1
Main authors:	Duan, Ling-Yu; Liu, Jiaying; Yang, Wenhan; Huang, Tiejun; Gao, Wen
Format:	Article
Language:	eng
Subjects:	Bowing; Collaboration; feature compression; generative model; Image coding; Image compression; Machine learning; Machine vision; MPEG encoders; prediction model; Standardization; State-of-the-art reviews; Sustainable development; Video coding for machine; Video compression; Vision systems
Online access:	Order full text

container_end_page	1
container_issue
container_start_page	1
container_title	IEEE transactions on image processing
container_volume	PP
creator	Duan, Ling-Yu; Liu, Jiaying; Yang, Wenhan; Huang, Tiejun; Gao, Wen
description	Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and efficiency to serve for machine vision, and the other is with full fidelity, bowing to human perception. The recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, i.e. Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote the sustainable and fast development in their own directions, respectively. In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with the rising Analyze then Compress instance Digital Retina, the definition, formulation, and paradigm of VCM are given first. Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence to realize the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we come up with potential VCM solutions, and the preliminary results have demonstrated the performance and efficiency gains. Further direction is discussed as well.
doi_str_mv	10.1109/TIP.2020.3016485
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_2438692067</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9180095</ieee_id><sourcerecordid>2441012176</sourcerecordid><originalsourceid>FETCH-LOGICAL-c460t-1ce49b058a59df5a10619f1d4071d37016863e050666d29d18d7b76bbdadd8ec3</originalsourceid><addsrcrecordid>eNpdkEtr3DAQgEVJSdI090IgCHLpxdsZWw8rt2XpYyGlOaQ9FYxsjTcKtrWRvIX8-2rZbQ49zQzzzTDzMfYBYYEI5tPD-n5RQgmLClCJWr5h52gEFgCiPMk5SF1oFOaMvUvpCQCFRHXKzqqylloZcc5-__KOAl8F56cN70Pk32336CdKt3zJ7220zm9GHvqMDINtQ7Sz_0O5GreRUvJh4nZyfD3NNAx-Q9PMl5MdXmbfpffsbW-HRJfHeMF-fvn8sPpW3P34ul4t74pOKJgL7EiYFmRtpXG9tAgKTY9OgEZX6fxarSoCCUopVxqHtdOtVm3rrHM1ddUF-3jYu43heUdpbkafunyPnSjsUlOKqlamBKUzevMf-hR2MR-8pwQClqhVpuBAdTGkFKlvttGPNr40CM3efJPNN3vzzdF8Hrk-Lt61I7nXgX-qM3B1ADwRvbYN1gBGVn8BzIyF1A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2441012176</pqid></control><display><type>article</type><title>Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics</title><source>IEEE/IET Electronic Library</source><creator>Duan, Ling-Yu ; Liu, Jiaying ; Yang, Wenhan ; Huang, Tiejun ; Gao, Wen</creator><creatorcontrib>Duan, Ling-Yu ; Liu, Jiaying ; Yang, Wenhan ; Huang, Tiejun ; Gao, Wen</creatorcontrib><description>Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and efficiency to serve for machine vision, and the other is with full fidelity, bowing to human perception. The recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, i.e. 
Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote the sustainable and fast development in their own directions, respectively. In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with the rising Analyze then Compress instance Digital Retina, the definition, formulation, and paradigm of VCM are given first. Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence to realize the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we come up with potential VCM solutions, and the preliminary results have demonstrated the performance and efficiency gains. Further direction is discussed as well.</description><identifier>ISSN: 1057-7149</identifier><identifier>EISSN: 1941-0042</identifier><identifier>DOI: 10.1109/TIP.2020.3016485</identifier><identifier>PMID: 32857694</identifier><identifier>CODEN: IIPRE4</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Bowing ; Collaboration ; feature compression ; generative model ; Image coding ; Image compression ; Machine learning ; Machine vision ; MPEG encoders ; prediction model ; Standardization ; State-of-the-art reviews ; Sustainable development ; Video coding for machine ; Video compression ; Vision systems</subject><ispartof>IEEE transactions on image processing, 2020-01, Vol.PP, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c460t-1ce49b058a59df5a10619f1d4071d37016863e050666d29d18d7b76bbdadd8ec3</citedby><cites>FETCH-LOGICAL-c460t-1ce49b058a59df5a10619f1d4071d37016863e050666d29d18d7b76bbdadd8ec3</cites><orcidid>0000-0002-1692-0069 ; 0000-0002-4234-6099 ; 0000-0002-4491-2023 ; 0000-0002-0468-9576</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9180095$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9180095$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32857694$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Duan, Ling-Yu</creatorcontrib><creatorcontrib>Liu, Jiaying</creatorcontrib><creatorcontrib>Yang, Wenhan</creatorcontrib><creatorcontrib>Huang, Tiejun</creatorcontrib><creatorcontrib>Gao, Wen</creatorcontrib><title>Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics</title><title>IEEE transactions on image processing</title><addtitle>TIP</addtitle><addtitle>IEEE Trans Image Process</addtitle><description>Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and efficiency to serve for machine vision, and the other is with full fidelity, bowing to human perception. The recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, i.e. 
Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote the sustainable and fast development in their own directions, respectively. In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with the rising Analyze then Compress instance Digital Retina, the definition, formulation, and paradigm of VCM are given first. Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence to realize the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we come up with potential VCM solutions, and the preliminary results have demonstrated the performance and efficiency gains. 
Further direction is discussed as well.</description><subject>Bowing</subject><subject>Collaboration</subject><subject>feature compression</subject><subject>generative model</subject><subject>Image coding</subject><subject>Image compression</subject><subject>Machine learning</subject><subject>Machine vision</subject><subject>MPEG encoders</subject><subject>prediction model</subject><subject>Standardization</subject><subject>State-of-the-art reviews</subject><subject>Sustainable development</subject><subject>Video coding for machine</subject><subject>Video compression</subject><subject>Vision systems</subject><issn>1057-7149</issn><issn>1941-0042</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkEtr3DAQgEVJSdI090IgCHLpxdsZWw8rt2XpYyGlOaQ9FYxsjTcKtrWRvIX8-2rZbQ49zQzzzTDzMfYBYYEI5tPD-n5RQgmLClCJWr5h52gEFgCiPMk5SF1oFOaMvUvpCQCFRHXKzqqylloZcc5-__KOAl8F56cN70Pk32336CdKt3zJ7220zm9GHvqMDINtQ7Sz_0O5GreRUvJh4nZyfD3NNAx-Q9PMl5MdXmbfpffsbW-HRJfHeMF-fvn8sPpW3P34ul4t74pOKJgL7EiYFmRtpXG9tAgKTY9OgEZX6fxarSoCCUopVxqHtdOtVm3rrHM1ddUF-3jYu43heUdpbkafunyPnSjsUlOKqlamBKUzevMf-hR2MR-8pwQClqhVpuBAdTGkFKlvttGPNr40CM3efJPNN3vzzdF8Hrk-Lt61I7nXgX-qM3B1ADwRvbYN1gBGVn8BzIyF1A</recordid><startdate>20200101</startdate><enddate>20200101</enddate><creator>Duan, Ling-Yu</creator><creator>Liu, Jiaying</creator><creator>Yang, Wenhan</creator><creator>Huang, Tiejun</creator><creator>Gao, Wen</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-1692-0069</orcidid><orcidid>https://orcid.org/0000-0002-4234-6099</orcidid><orcidid>https://orcid.org/0000-0002-4491-2023</orcidid><orcidid>https://orcid.org/0000-0002-0468-9576</orcidid></search><sort><creationdate>20200101</creationdate><title>Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics</title><author>Duan, Ling-Yu ; Liu, Jiaying ; Yang, Wenhan ; Huang, Tiejun ; Gao, Wen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c460t-1ce49b058a59df5a10619f1d4071d37016863e050666d29d18d7b76bbdadd8ec3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Bowing</topic><topic>Collaboration</topic><topic>feature compression</topic><topic>generative model</topic><topic>Image coding</topic><topic>Image compression</topic><topic>Machine learning</topic><topic>Machine vision</topic><topic>MPEG encoders</topic><topic>prediction model</topic><topic>Standardization</topic><topic>State-of-the-art reviews</topic><topic>Sustainable development</topic><topic>Video coding for machine</topic><topic>Video compression</topic><topic>Vision systems</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Duan, Ling-Yu</creatorcontrib><creatorcontrib>Liu, Jiaying</creatorcontrib><creatorcontrib>Yang, Wenhan</creatorcontrib><creatorcontrib>Huang, Tiejun</creatorcontrib><creatorcontrib>Gao, Wen</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE/IET 
Electronic Library</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on image processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Duan, Ling-Yu</au><au>Liu, Jiaying</au><au>Yang, Wenhan</au><au>Huang, Tiejun</au><au>Gao, Wen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics</atitle><jtitle>IEEE transactions on image processing</jtitle><stitle>TIP</stitle><addtitle>IEEE Trans Image Process</addtitle><date>2020-01-01</date><risdate>2020</risdate><volume>PP</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>1057-7149</issn><eissn>1941-0042</eissn><coden>IIPRE4</coden><abstract>Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and efficiency to serve for machine vision, and the other is with full fidelity, bowing to human perception. The recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, i.e. 
Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote the sustainable and fast development in their own directions, respectively. In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with the rising Analyze then Compress instance Digital Retina, the definition, formulation, and paradigm of VCM are given first. Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence to realize the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we come up with potential VCM solutions, and the preliminary results have demonstrated the performance and efficiency gains. Further direction is discussed as well.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>32857694</pmid><doi>10.1109/TIP.2020.3016485</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-1692-0069</orcidid><orcidid>https://orcid.org/0000-0002-4234-6099</orcidid><orcidid>https://orcid.org/0000-0002-4491-2023</orcidid><orcidid>https://orcid.org/0000-0002-0468-9576</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1057-7149
ispartof	IEEE transactions on image processing, 2020-01, Vol.PP, p.1-1
issn	1057-7149 1941-0042
language	eng
recordid	cdi_proquest_miscellaneous_2438692067
source	IEEE/IET Electronic Library
subjects	Bowing; Collaboration; feature compression; generative model; Image coding; Image compression; Machine learning; Machine vision; MPEG encoders; prediction model; Standardization; State-of-the-art reviews; Sustainable development; Video coding for machine; Video compression; Vision systems
title	Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T11%3A20%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Video%20Coding%20for%20Machines:%20A%20Paradigm%20of%20Collaborative%20Compression%20and%20Intelligent%20Analytics&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Duan,%20Ling-Yu&rft.date=2020-01-01&rft.volume=PP&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2020.3016485&rft_dat=%3Cproquest_RIE%3E2441012176%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2441012176&rft_id=info:pmid/32857694&rft_ieee_id=9180095&rfr_iscdi=true