Residual Vector Product Quantization for approximate nearest neighbor search

Bibliographic Details
Published in: Expert Systems with Applications, 2023-12, Vol. 232, p. 120832, Article 120832
Authors: Niu, Lushuai; Xu, Zhi; Zhao, Longyang; He, Daojing; Ji, Jianqiu; Yuan, Xiaoli; Xue, Mian
Format: Article
Language: English
Online access: Full text
Description
Abstract: Vector quantization is one of the most popular techniques for approximate nearest neighbor (ANN) search. Over the past decade, many vector quantization methods have been proposed for ANN search. However, these methods do not strike a satisfactory balance between accuracy and efficiency because of defects in their quantization structures. To overcome this problem, a quantization method named Residual Vector Product Quantization (RVPQ) is proposed in this study. Under this method, the data space is decomposed into several subspaces, and a residual structure consisting of several ordered residual codebooks is constructed for each subspace. Learned with an effective joint training algorithm, the quantization structure of RVPQ is much better than that of other methods, and it greatly enhances the performance of ANN search. In addition, an efficient residual quantization encoding method, H-Variable Beam Search, is proposed to achieve higher encoding efficiency with negligible loss of accuracy. Furthermore, an Inverted Multi-Index based on RVPQ is designed to effectively handle ANN search over very large-scale databases. Experimental results and theoretical evaluations show that RVPQ outperforms state-of-the-art methods in retrieval accuracy while retaining comparable computational complexity.
• The residual product structure achieves a good balance between accuracy and efficiency.
• The joint training algorithm attains a satisfactory quantization error.
• H-Variable Beam Search improves encoding efficiency with negligible loss of accuracy.
• The quantization index structure effectively handles large-scale retrieval tasks.
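The abstract only names the structural ingredients of RVPQ, so the following is a minimal sketch of the general residual-product idea it describes: split each vector into subspaces and quantize each subspace with an ordered chain of residual codebooks. This is not the authors' implementation; a plain stage-by-stage k-means training step and a greedy nearest-centroid encoder stand in for RVPQ's joint training algorithm and H-Variable Beam Search, and all function names and parameters below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_rvpq(X, n_subspaces=4, n_stages=2, n_centroids=16, seed=0):
    """Learn per-subspace chains of residual codebooks, one stage at a time."""
    n, d = X.shape
    assert d % n_subspaces == 0, "dimension must be divisible by n_subspaces"
    sub_dim = d // n_subspaces
    codebooks = []  # codebooks[m][s]: (n_centroids, sub_dim) array
    for m in range(n_subspaces):
        residual = X[:, m * sub_dim:(m + 1) * sub_dim].astype(np.float64).copy()
        stage_books = []
        for s in range(n_stages):
            km = KMeans(n_clusters=n_centroids, n_init=4, random_state=seed).fit(residual)
            stage_books.append(km.cluster_centers_)
            # whatever this stage cannot represent is passed on as the residual
            residual = residual - km.cluster_centers_[km.labels_]
        codebooks.append(stage_books)
    return codebooks

def encode_rvpq(X, codebooks):
    """Greedy encoding: nearest centroid at each residual stage of each subspace."""
    n, d = X.shape
    n_subspaces, n_stages = len(codebooks), len(codebooks[0])
    sub_dim = d // n_subspaces
    codes = np.empty((n, n_subspaces, n_stages), dtype=np.int32)
    for m in range(n_subspaces):
        residual = X[:, m * sub_dim:(m + 1) * sub_dim].astype(np.float64).copy()
        for s in range(n_stages):
            book = codebooks[m][s]
            d2 = ((residual[:, None, :] - book[None, :, :]) ** 2).sum(axis=-1)
            idx = d2.argmin(axis=1)
            codes[:, m, s] = idx
            residual -= book[idx]
    return codes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 32)).astype(np.float32)
    books = train_rvpq(X)
    codes = encode_rvpq(X, books)
    print(codes.shape)  # (1000, 4, 2): one code per (subspace, stage)
```

With 4 subspaces, 2 stages, and 16 centroids per stage, each 32-dimensional vector above is stored as 8 small codes; tightening that compression without losing retrieval accuracy is the balance the residual product structure in the paper is designed to improve.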
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2023.120832