COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Classical information retrieval systems such as BM25 rely on exact lexical
match and carry out search efficiently with an inverted list index. Recent neural
IR models shift towards soft semantic matching of all query and document terms, but
they lose the computational efficiency of exact match systems. This paper
presents COIL, a contextualized exact match retrieval architecture that brings
semantic matching to lexical retrieval. COIL scoring is based on the contextualized
representations of tokens that occur in both query and document. The new architecture
stores contextualized token representations in inverted lists, bringing together the
efficiency of exact match and the representation power of deep language models.
Our experimental results show COIL outperforms classical lexical retrievers and
state-of-the-art deep LM retrievers with similar or smaller latency. |
DOI: | 10.48550/arxiv.2104.07186 |
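
The scoring idea described in the summary can be illustrated with a small sketch. It is not the authors' implementation: the names `build_index` and `score`, the toy two-dimensional vectors, and the choice to take the maximum similarity per query token and then sum over query tokens are all assumptions made for illustration, since the summary only says that scoring uses the contextualized representations of overlapping tokens stored in inverted lists. In a real system the vectors would come from a deep language model.

```python
from collections import defaultdict


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_index(docs):
    """Build a hypothetical contextualized inverted list.

    docs: {doc_id: [(token, vector), ...]}
    returns: {token: [(doc_id, vector), ...]}
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token, vec in tokens:
            index[token].append((doc_id, vec))
    return index


def score(query, index):
    """Score documents using exact token matches only.

    query: [(token, vector), ...]
    For each query token, only postings of the same token are visited.
    Assumption: keep the best-matching occurrence per document (max),
    then sum these per-token scores over the query.
    """
    totals = defaultdict(float)
    for token, q_vec in query:
        best = {}
        for doc_id, d_vec in index.get(token, []):
            s = dot(q_vec, d_vec)
            if doc_id not in best or s > best[doc_id]:
                best[doc_id] = s
        for doc_id, s in best.items():
            totals[doc_id] += s
    return dict(totals)


# Toy data: "bank" has a different vector in each document,
# standing in for context-dependent meaning.
docs = {
    "d1": [("bank", [1.0, 0.0]), ("bank", [0.5, 0.0]), ("river", [0.0, 1.0])],
    "d2": [("bank", [0.0, 1.0]), ("loan", [1.0, 0.0])],
}
query = [("bank", [1.0, 0.0]), ("river", [0.0, 1.0])]

index = build_index(docs)
scores = score(query, index)
```

Both documents contain the token "bank", so a purely lexical matcher would count a hit in each; here the vectors decide how much each hit contributes, while the lookup still touches only the posting lists of the query's own tokens.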