Generalized Residual Vector Quantization and Aggregating Tree for Large Scale Search

Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review on a relevant vector quantization method named residual vector quantization (RVQ). Next, we propose generalized residual vector quantization (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as special cases of our proposed method. To enable GRVQ on billion scale data, we introduce a nonexhaustive search scheme named aggregating tree (A-Tree) for high dimensional data that uses GRVQ encodings to build a radix tree and perform the nearest neighbor search by beam search. To search accurately and efficiently, VQ-encodings should satisfy locally aggregating encoding criterion: For any node of the corresponding A-Tree, neighboring vectors should aggregate in fewer subtrees to make beam search efficient. We show that the proposed GRVQ encodings best satisfy the suggested criterion, and the joint use of GRVQ and A-Tree shows significantly better performances on billion scale datasets. Our methods are validated on several standard benchmark datasets. Experimental results and empirical analysis show the superior efficiency and effectiveness of our proposed methods compared to the state-of-the-art for large scale search.

Detailed description

Saved in:
Bibliographic details
Published in: IEEE transactions on multimedia 2017-08, Vol.19 (8), p.1785-1797
Main authors: Liu, Shicong, Shao, Junru, Lu, Hongtao
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 1797
container_issue 8
container_start_page 1785
container_title IEEE transactions on multimedia
container_volume 19
creator Liu, Shicong
Shao, Junru
Lu, Hongtao
description Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review on a relevant vector quantization method named residual vector quantization (RVQ). Next, we propose generalized residual vector quantization (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as special cases of our proposed method. To enable GRVQ on billion scale data, we introduce a nonexhaustive search scheme named aggregating tree (A-Tree) for high dimensional data that uses GRVQ encodings to build a radix tree and perform the nearest neighbor search by beam search. To search accurately and efficiently, VQ-encodings should satisfy locally aggregating encoding criterion: For any node of the corresponding A-Tree, neighboring vectors should aggregate in fewer subtrees to make beam search efficient. We show that the proposed GRVQ encodings best satisfy the suggested criterion, and the joint use of GRVQ and A-Tree shows significantly better performances on billion scale datasets. Our methods are validated on several standard benchmark datasets. Experimental results and empirical analysis show the superior efficiency and effectiveness of our proposed methods compared to the state-of-the-art for large scale search.
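The residual vector quantization (RVQ) that the paper reviews and generalizes can be sketched in a few lines: each stage runs k-means on the residuals left over from the previous stages, and a vector is encoded by greedily picking the nearest centroid per stage. The sketch below is a minimal illustration of classical RVQ with a tiny hand-rolled k-means, not the paper's GRVQ; all function names and parameter choices are illustrative.

```python
import numpy as np

def train_rvq(X, num_stages=2, K=4, iters=10, seed=0):
    """Train classical RVQ: at each stage, run k-means on the residuals
    left by the previous stages (a simplified sketch, not GRVQ)."""
    rng = np.random.default_rng(seed)
    residual = np.asarray(X, float).copy()
    codebooks = []
    for _ in range(num_stages):
        # k-means on the current residuals
        centers = residual[rng.choice(len(residual), K, replace=False)]
        for _ in range(iters):
            d = ((residual[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assign = d.argmin(1)
            for k in range(K):
                pts = residual[assign == k]
                if len(pts):
                    centers[k] = pts.mean(0)
        codebooks.append(centers.copy())
        # pass the per-point residuals on to the next stage
        d = ((residual[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        residual = residual - centers[d.argmin(1)]
    return codebooks

def encode(x, codebooks):
    """Greedy RVQ encoding: pick the nearest centroid at each stage."""
    codes, r = [], np.asarray(x, float).copy()
    for C in codebooks:
        k = int(((C - r) ** 2).sum(1).argmin())
        codes.append(k)
        r = r - C[k]
    return codes

def decode(codes, codebooks):
    """Reconstruct by summing one centroid per stage."""
    return sum(C[k] for k, C in zip(codes, codebooks))
```

GRVQ differs in that it revisits and re-optimizes earlier stages instead of fixing them greedily, which is what lets it subsume other quantizers as special cases.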
doi_str_mv 10.1109/TMM.2017.2692181
format Article
publisher Piscataway: IEEE
coden ITMUF8
ieee_id 7894185
startdate 2017-08-01
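The A-Tree idea — index the database by its VQ code prefixes and search with a beam — can be illustrated with a toy prefix (radix) tree over stage codes. This is a sketch in the spirit of the paper's A-Tree, not its exact construction; the scoring of a prefix by the residual of its partial reconstruction and all names here are illustrative.

```python
import numpy as np

def beam_search(query, codebooks, codes_list, beam=2):
    """Toy non-exhaustive search over a prefix tree of VQ codes."""
    # Build per-level children and leaf buckets from the stored codes.
    children, leaves = {}, {}
    for i, c in enumerate(codes_list):
        c = tuple(c)
        leaves.setdefault(c, []).append(i)
        for d in range(len(c)):
            children.setdefault(c[:d], set()).add(c[d])
    # Beam over code prefixes, scoring each by the squared norm of the
    # residual left after subtracting the partial reconstruction.
    frontier = [((), np.asarray(query, float))]
    for C in codebooks:
        cand = []
        for prefix, r in frontier:
            for k in children.get(prefix, ()):
                r2 = r - C[k]
                cand.append((float((r2 ** 2).sum()), prefix + (k,), r2))
        cand.sort(key=lambda t: t[0])
        frontier = [(p, r) for _, p, r in cand[:beam]]
    # Return database ids under the surviving leaves.
    out = []
    for p, _ in frontier:
        out.extend(leaves.get(p, []))
    return out
```

The locally aggregating criterion matters here: the fewer subtrees a query's true neighbors are spread across, the smaller the beam can be while still reaching them.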
fulltext fulltext_linktorsrc
identifier ISSN: 1520-9210
ispartof IEEE transactions on multimedia, 2017-08, Vol.19 (8), p.1785-1797
issn 1520-9210
1941-0077
language eng
recordid cdi_proquest_journals_1920468189
source IEEE Electronic Library (IEL)
subjects Criteria
Datasets
Empirical analysis
Encoding
Euclidean space
Feedback control systems
High dimensional data
Indexes
Information entropy
Information retrieval
large scale data
Measurement
Methods
nearest neighbor search
Nearest neighbor searches
Searching
Similarity
similarity search
Vector quantization
title Generalized Residual Vector Quantization and Aggregating Tree for Large Scale Search
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T10%3A25%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Generalized%20Residual%20Vector%20Quantization%20and%20Aggregating%20Tree%20for%20Large%20Scale%20Search&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Liu,%20Shicong&rft.date=2017-08-01&rft.volume=19&rft.issue=8&rft.spage=1785&rft.epage=1797&rft.pages=1785-1797&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2017.2692181&rft_dat=%3Cproquest_RIE%3E1920468189%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1920468189&rft_id=info:pmid/&rft_ieee_id=7894185&rfr_iscdi=true