Reproducibility Analysis and Enhancements for Multi-Aspect Dense Retriever with Aspect Learning
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Multi-aspect dense retrieval aims to incorporate aspect information (e.g.,
brand and category) into dual encoders to facilitate relevance matching. As an
early and representative multi-aspect dense retriever, MADRAL learns several
extra aspect embeddings and fuses the explicit aspects with an implicit aspect
"OTHER" to form the final representation. MADRAL was evaluated on proprietary data and
its code was not released, making it challenging to validate its effectiveness
on other datasets. We failed to reproduce its effectiveness on the public
MA-Amazon data, motivating us to probe the reasons and re-examine its
components. We propose several component alternatives for comparisons,
including replacing "OTHER" with "CLS" and representing aspects with the first
several content tokens. Through extensive experiments, we confirm that learning
"OTHER" from scratch in aspect fusion is harmful. In contrast, our proposed
variants can greatly enhance the retrieval performance. Our research not only
sheds light on the limitations of MADRAL but also provides valuable insights
for future studies on more powerful multi-aspect dense retrieval models. Code
will be released at:
https://github.com/sunxiaojie99/Reproducibility-for-MADRAL. |
---|---|
DOI: | 10.48550/arxiv.2401.03648 |