Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: Multi-Hop Question Answering (MHQA) tasks present a significant challenge for large language models (LLMs) due to the intensive knowledge required. Current solutions, like Retrieval-Augmented Generation, typically retrieve potential documents from an external corpus from which to read an answer. However, the performance of this retrieve-then-read paradigm is constrained by the retriever and by the inevitable noise in the retrieved documents. To mitigate these challenges, we introduce a novel generate-then-ground (GenGround) framework that synergizes the parametric knowledge of LLMs with external documents to solve a multi-hop question. GenGround empowers LLMs to alternate between two phases until the final answer is derived: (1) formulate a simpler, single-hop question and directly generate the answer; (2) ground the question-answer pair in retrieved documents, amending any wrong predictions in the answer. We also propose an instructional grounding distillation method to generalize our method to smaller models. Extensive experiments conducted on four datasets illustrate the superiority of our method.
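
The alternating two-phase loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `llm` and `retrieve` are hypothetical caller-supplied callables (any LLM completion API and any document retriever), and the prompts, the `DONE` stopping convention, and the `max_hops` bound are illustrative choices not taken from the paper.

```python
from typing import Callable, List, Tuple

def gen_ground(
    question: str,
    llm: Callable[[str], str],             # prompt -> completion (hypothetical)
    retrieve: Callable[[str], List[str]],  # query -> documents (hypothetical)
    max_hops: int = 4,
) -> str:
    """Sketch of the generate-then-ground loop for a multi-hop question."""
    facts: List[Tuple[str, str]] = []  # grounded (sub-question, answer) pairs

    for _ in range(max_hops):
        # Phase 1 (generate): formulate a simpler single-hop question and
        # answer it directly from the model's parametric knowledge.
        sub_q = llm(
            f"Question: {question}\nKnown facts: {facts}\n"
            "State the next single-hop question needed, or DONE if none."
        )
        if sub_q.strip() == "DONE":
            break
        draft = llm(f"Answer concisely: {sub_q}")

        # Phase 2 (ground): check the drafted answer against retrieved
        # documents, amending any wrong predictions.
        docs = retrieve(sub_q)
        grounded = llm(
            f"Documents: {docs}\nQuestion: {sub_q}\nDraft answer: {draft}\n"
            "Correct the draft so it is supported by the documents."
        )
        facts.append((sub_q, grounded))

    # Derive the final answer from the accumulated grounded facts.
    return llm(f"Facts: {facts}\nAnswer the question: {question}")
```

In use, any LLM client and retriever can be plugged in, e.g. `gen_ground("Who directed the film that won Best Picture in 1995?", llm=my_llm, retrieve=my_retriever)`; the grounding step is what lets retrieved evidence override noisy or wrong parametric answers.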
DOI: 10.48550/arxiv.2406.14891