Learning When Not to Answer: A Ternary Reward Structure for Reinforcement Learning based Question Answering
Format: Article
Language: English
Online access: Order full text
Abstract: In this paper, we investigate the challenges of using reinforcement learning agents for question answering over knowledge graphs in real-world applications. We examine the performance metrics used by state-of-the-art systems and find that they are inadequate for such settings. More specifically, they do not correctly evaluate systems in situations where no answer is available, so agents optimized for these metrics are poor at modeling confidence. We introduce a simple new performance metric for evaluating question-answering agents that is more representative of practical usage conditions, and we optimize for this metric by extending the binary reward structure used in prior work to a ternary reward structure that also rewards an agent for not answering a question rather than giving an incorrect answer. We show that this can drastically improve the precision of answered questions while causing the agent to withhold answers on only a limited number of questions it previously answered correctly. Bootstrapping the reinforcement learning algorithm with a supervised learning strategy based on depth-first-search paths further improves performance.
DOI: 10.48550/arxiv.1902.10236
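The abstract describes the ternary reward and the new evaluation metric only at a high level, so the following Python sketch is an illustration under stated assumptions, not the paper's implementation: the specific reward values (1.0 for a correct answer, an assumed 0.5 for abstaining, 0.0 for a wrong answer), the names `ternary_reward` and `answered_precision`, and the reading of the metric as precision over the questions the agent chose to answer are all assumptions introduced here.

```python
from typing import List, Optional


def binary_reward(predicted: Optional[str], gold: Optional[str]) -> float:
    # Prior work's binary scheme: only a correct answer is rewarded;
    # a wrong answer and an abstention are treated the same (zero reward).
    return 1.0 if predicted is not None and predicted == gold else 0.0


def ternary_reward(predicted: Optional[str], gold: Optional[str],
                   abstain_reward: float = 0.5) -> float:
    # Ternary extension: a correct answer earns the full reward, declining
    # to answer (predicted is None) earns a smaller positive reward, and a
    # wrong answer earns nothing, so abstaining dominates guessing wrongly.
    # The value 0.5 is an assumed hyperparameter, not taken from the paper.
    if predicted is None:
        return abstain_reward
    return 1.0 if predicted == gold else 0.0


def answered_precision(predictions: List[Optional[str]],
                       golds: List[Optional[str]]) -> float:
    # One plausible reading of "precision of answered questions": among the
    # questions the agent chose to answer, the fraction answered correctly.
    answered = [(p, g) for p, g in zip(predictions, golds) if p is not None]
    if not answered:
        return 0.0
    return sum(p == g for p, g in answered) / len(answered)


if __name__ == "__main__":
    preds = ["Paris", None, "Berlin"]      # agent abstains on the second question
    golds = ["Paris", "London", "Madrid"]  # third question is answered incorrectly
    print([ternary_reward(p, g) for p, g in zip(preds, golds)])  # [1.0, 0.5, 0.0]
    print(answered_precision(preds, golds))                      # 0.5
```

Under this reading, an agent that answers fewer questions but answers them correctly scores higher on `answered_precision`, which is the behavior the ternary reward is meant to encourage.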