Searching Dense Representations with Inverted Indexes

Bibliographic details
Published in: arXiv.org 2023-12
Main authors: Lin, Jimmy; Teofili, Tommaso
Format: Article
Language: English
Description
Summary: Nearly all implementations of top-\(k\) retrieval with dense vector representations today take advantage of hierarchical navigable small-world network (HNSW) indexes. However, the generation of vector representations and efficiently searching large collections of vectors are distinct challenges that can be decoupled. In this work, we explore the contrarian approach of performing top-\(k\) retrieval on dense vector representations using inverted indexes. We present experiments on the MS MARCO passage ranking dataset, evaluating three dimensions of interest: output quality, speed, and index size. Results show that searching dense representations using inverted indexes is possible. Our approach exhibits reasonable effectiveness with compact indexes, but is impractically slow. Thus, while workable, our solution does not provide a compelling tradeoff and is perhaps best characterized today as a "technical curiosity".
ISSN:2331-8422