Retrieval-Augmented Generation with Estimation of Source Reliability
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Retrieval-augmented generation (RAG) addresses key limitations of large
language models (LLMs), such as hallucinations and outdated knowledge, by
incorporating external databases. These databases typically draw on multiple
sources to cover up-to-date and diverse information. However, standard RAG
methods often overlook the heterogeneous reliability of sources in a multi-source
database and retrieve documents solely based on relevance, making them prone to
propagating misinformation. To address this, we propose Reliability-Aware RAG
(RA-RAG), which estimates the reliability of multiple sources and incorporates
this information into both the retrieval and aggregation processes. Specifically,
it iteratively estimates source reliability and true answers for a set of
queries without requiring labels. It then selectively retrieves relevant documents
from a few reliable sources and aggregates them using weighted majority
voting; the selective retrieval ensures scalability without
compromising performance. We also introduce a benchmark designed to reflect
real-world scenarios with heterogeneous source reliability and demonstrate the
effectiveness of RA-RAG compared to a set of baselines. |
---|---|
DOI: | 10.48550/arxiv.2410.22954 |
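
The summary names three steps: estimating source reliability iteratively without labels, retrieving from only a few reliable sources, and aggregating by weighted majority voting. The Python sketch below illustrates that general idea on toy data. It is not the paper's estimator: the agreement-based update rule, the smoothing constant, the use of the reliability score itself as the vote weight, and all function names (`weighted_vote`, `estimate_reliability`, `answer_with_top_k`) are assumptions made for illustration only.

```python
from collections import defaultdict


def weighted_vote(answers, weights):
    """Return the answer with the largest total weight.

    answers: dict mapping source -> answer string
    weights: dict mapping source -> non-negative weight
    """
    totals = defaultdict(float)
    for source, answer in answers.items():
        totals[answer] += weights[source]
    # Iterating in sorted order makes tie-breaking deterministic.
    return max(sorted(totals), key=lambda a: totals[a])


def estimate_reliability(responses, n_iters=10, smoothing=1.0):
    """Alternate between estimating answers and source reliability.

    responses: list of dicts, one per unlabeled query (source -> answer).
    Assumed update rule: a source's reliability is its smoothed rate of
    agreement with the current weighted-vote answers.
    """
    sources = sorted({s for r in responses for s in r})
    reliability = {s: 0.5 for s in sources}  # uninformative start
    for _ in range(n_iters):
        estimates = [weighted_vote(r, reliability) for r in responses]
        for s in sources:
            pairs = [(r[s], e) for r, e in zip(responses, estimates) if s in r]
            agree = sum(1 for given, est in pairs if given == est)
            reliability[s] = (agree + smoothing) / (len(pairs) + 2 * smoothing)
    return reliability


def answer_with_top_k(answers, reliability, k=2):
    """Vote using only the k most reliable sources that answered."""
    top = sorted(answers, key=lambda s: reliability[s], reverse=True)[:k]
    return weighted_vote({s: answers[s] for s in top}, reliability)


# Toy, made-up data: three sources answering four unlabeled queries.
responses = [
    {"A": "paris", "B": "paris", "C": "lyon"},
    {"A": "4", "B": "4", "C": "5"},
    {"A": "blue", "B": "green", "C": "green"},
    {"A": "yes", "B": "yes", "C": "no"},
]
reliability = estimate_reliability(responses)
final = answer_with_top_k({"A": "rome", "B": "rome", "C": "milan"}, reliability)
```

Restricting the vote to the top `k` sources mirrors the selective retrieval described in the summary: sources with low estimated reliability are never consulted at answer time, so the cost per query depends on `k` rather than on the total number of sources.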