Generalized Residual Vector Quantization and Aggregating Tree for Large Scale Search
Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review on a relevant vector quantization method named residual vector quantization (RVQ). Next, we propose generalized residual vector quantization (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as special cases of our proposed method. To enable GRVQ on billion scale data, we introduce a nonexhaustive search scheme named aggregating tree (A-Tree) for high dimensional data that uses GRVQ encodings to build a radix tree and perform the nearest neighbor search by beam search. To search accurately and efficiently, VQ-encodings should satisfy locally aggregating encoding criterion: For any node of the corresponding A-Tree, neighboring vectors should aggregate in fewer subtrees to make beam search efficient. We show that the proposed GRVQ encodings best satisfy the suggested criterion, and the joint use of GRVQ and A-Tree shows significantly better performances on billion scale datasets. Our methods are validated on several standard benchmark datasets. Experimental results and empirical analysis show the superior efficiency and effectiveness of our proposed methods compared to the state-of-the-art for large scale search.
Saved in:
Published in: | IEEE transactions on multimedia 2017-08, Vol.19 (8), p.1785-1797 |
---|---|
Main authors: | Liu, Shicong; Shao, Junru; Lu, Hongtao |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1797 |
---|---|
container_issue | 8 |
container_start_page | 1785 |
container_title | IEEE transactions on multimedia |
container_volume | 19 |
creator | Liu, Shicong; Shao, Junru; Lu, Hongtao |
description | Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review on a relevant vector quantization method named residual vector quantization (RVQ). Next, we propose generalized residual vector quantization (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as special cases of our proposed method. To enable GRVQ on billion scale data, we introduce a nonexhaustive search scheme named aggregating tree (A-Tree) for high dimensional data that uses GRVQ encodings to build a radix tree and perform the nearest neighbor search by beam search. To search accurately and efficiently, VQ-encodings should satisfy locally aggregating encoding criterion: For any node of the corresponding A-Tree, neighboring vectors should aggregate in fewer subtrees to make beam search efficient. We show that the proposed GRVQ encodings best satisfy the suggested criterion, and the joint use of GRVQ and A-Tree shows significantly better performances on billion scale datasets. Our methods are validated on several standard benchmark datasets. Experimental results and empirical analysis show the superior efficiency and effectiveness of our proposed methods compared to the state-of-the-art for large scale search. |
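The description above outlines residual vector quantization (RVQ), the baseline that GRVQ generalizes: each stage quantizes the residual error left by the previous stages, so a vector's reconstruction is the sum of one centroid per stage. As a rough illustrative sketch only (not the paper's implementation; the `train_rvq`/`encode`/`decode` names and the plain k-means trainer are assumptions made here for clarity), RVQ might look like:

```python
import numpy as np

def kmeans(points, k, iters, rng):
    """Plain Lloyd k-means; returns centroids and final assignments."""
    centroids = points[rng.choice(len(points), k, replace=False)].copy()
    for _ in range(iters):
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = points[assign == j]
            if len(members):
                centroids[j] = members.mean(0)
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, dists.argmin(1)

def train_rvq(data, num_stages=4, num_centroids=256, iters=20, seed=0):
    """Each stage clusters the residuals left by the previous stages."""
    rng = np.random.default_rng(seed)
    residual = np.asarray(data, dtype=np.float64).copy()
    codebooks = []
    for _ in range(num_stages):
        centroids, assign = kmeans(residual, num_centroids, iters, rng)
        codebooks.append(centroids)
        residual -= centroids[assign]  # next stage quantizes what is left
    return codebooks

def encode(x, codebooks):
    """Greedy stage-by-stage encoding of a single vector."""
    codes, r = [], np.asarray(x, dtype=np.float64).copy()
    for cb in codebooks:
        j = int(((cb - r) ** 2).sum(1).argmin())
        codes.append(j)
        r -= cb[j]
    return codes

def decode(codes, codebooks):
    """Reconstruction is simply the sum of the chosen centroids."""
    return sum(cb[j] for j, cb in zip(codes, codebooks))
```

Under this sketch, GRVQ's "iterative minimization of quantization error" would plausibly amount to revisiting and re-optimizing earlier codebooks rather than fixing each stage greedily, but the paper's actual procedure should be consulted for the details.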
doi_str_mv | 10.1109/TMM.2017.2692181 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-9210 |
ispartof | IEEE transactions on multimedia, 2017-08, Vol.19 (8), p.1785-1797 |
issn | 1520-9210 1941-0077 |
language | eng |
recordid | cdi_proquest_journals_1920468189 |
source | IEEE Electronic Library (IEL) |
subjects | Criteria; Datasets; Empirical analysis; Encoding; Euclidean space; Feedback control systems; High dimensional data; Indexes; Information entropy; Information retrieval; large scale data; Measurement; Methods; nearest neighbor search; Nearest neighbor searches; Searching; Similarity; similarity search; Vector quantization |
title | Generalized Residual Vector Quantization and Aggregating Tree for Large Scale Search |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T10%3A25%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Generalized%20Residual%20Vector%20Quantization%20and%20Aggregating%20Tree%20for%20Large%20Scale%20Search&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Liu,%20Shicong&rft.date=2017-08-01&rft.volume=19&rft.issue=8&rft.spage=1785&rft.epage=1797&rft.pages=1785-1797&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2017.2692181&rft_dat=%3Cproquest_RIE%3E1920468189%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1920468189&rft_id=info:pmid/&rft_ieee_id=7894185&rfr_iscdi=true |