LEAF-QA: Locate, Encode & Attend for Figure Question Answering
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | We introduce LEAF-QA, a comprehensive dataset of 250,000 densely annotated
figures/charts, constructed from real-world open data sources, along with ~2
million question-answer (QA) pairs querying the structure and semantics of
these charts. LEAF-QA highlights the problem of multimodal QA, which is notably
different from conventional visual QA (VQA) and has recently gained interest
in the community. Furthermore, LEAF-QA is significantly more complex than
previous attempts at chart QA, viz. FigureQA and DVQA, which present only
limited variations in chart data. Because LEAF-QA is constructed from real-world
sources, it requires a novel architecture to enable question answering. To this
end, LEAF-Net is proposed: a deep architecture comprising chart element localization,
question and answer encoding in terms of chart elements, and an attention
network. Several experiments are conducted to demonstrate the
challenges of QA on LEAF-QA. The proposed architecture, LEAF-Net, also
considerably advances the current state of the art on FigureQA and DVQA. |
---|---|
DOI: | 10.48550/arxiv.1907.12861 |