Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition
Main authors: , , ,
Format: Article
Language: English
Abstract:
Cross-encoder models, which jointly encode and score a query-item pair, are prohibitively expensive for direct k-nearest neighbor (k-NN) search. Consequently, k-NN search typically employs a fast approximate retrieval stage (e.g., using BM25 or dual-encoder vectors), followed by reranking with a cross-encoder; however, the retrieval approximation often incurs detrimental recall regret. This problem is tackled by ANNCUR (Yadav et al., 2022), a recent method that uses the cross-encoder only, making search efficient with a relatively small number of anchor items and a CUR matrix factorization. While ANNCUR's one-time selection of anchors tends to approximate cross-encoder distances well on average, it forfeits the ability to accurately estimate distances to items near the query, leading to regret on the crucial end task: recall of the top-k items. In this paper, we propose ADACUR, a method that adaptively, iteratively, and efficiently minimizes the approximation error for the practically important top-k neighbors. It does so by repeatedly performing k-NN search with the anchors available so far, then adding the retrieved nearest neighbors to the anchor set for the next round. Empirically, on multiple datasets, compared to both traditional and state-of-the-art methods such as ANNCUR and dual-encoder-based retrieve-and-rerank, our proposed approach ADACUR consistently reduces recall error, by up to 70% in the important k = 1 setting, while using no more compute than its competitors.
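As a rough illustration of the adaptive procedure the abstract describes, the following is a minimal NumPy sketch, not the authors' implementation. The names `score_fn`, `M_train`, `rounds`, and `budget_per_round` are hypothetical, and the CUR-style projection is a simplified stand-in for the factorization used in ANNCUR/ADACUR.

```python
import numpy as np

def adacur_search(score_fn, M_train, num_items, k=1, rounds=3,
                  budget_per_round=20, rng=None):
    """Sketch of adaptive multi-round anchor selection (ADACUR-style).

    score_fn(items) -> cross-encoder scores between the test query and the
        given item indices; this is the expensive call being budgeted.
    M_train: (num_anchor_queries, num_items) cross-encoder scores between
        anchor (train) queries and all items, precomputed offline.
    """
    rng = rng or np.random.default_rng(0)
    # Round 1: a small random anchor-item set, scored exactly.
    anchors = list(rng.choice(num_items, size=budget_per_round, replace=False))
    exact = {i: s for i, s in zip(anchors, score_fn(np.array(anchors)))}

    for _ in range(rounds - 1):
        # CUR-style approximation: regress the query's observed anchor scores
        # through the anchor-query score matrix to estimate scores for all items.
        C = M_train[:, anchors]                      # (num_anchor_queries, |A|)
        u = np.array([exact[i] for i in anchors])    # (|A|,)
        approx = u @ np.linalg.pinv(C) @ M_train     # (num_items,)

        # Retrieve the current approximate top items and add the unseen ones
        # to the anchor set, so the next round is accurate near the query.
        order = np.argsort(-approx)
        new = [i for i in order if i not in exact][:budget_per_round]
        if not new:
            break
        for i, s in zip(new, score_fn(np.array(new))):
            exact[i] = s
        anchors.extend(new)

    # Final ranking: sort the exactly-scored items by true cross-encoder score.
    return sorted(exact, key=exact.get, reverse=True)[:k]
```

The design point this sketch tries to capture is that each round spends its fixed cross-encoder budget on the items the current approximation ranks highest, so later rounds refine exactly the region around the query whose approximation error determines top-k recall.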
DOI: 10.48550/arxiv.2305.02996