Understanding tables with intermediate pre-training
Format: Article
Language: English
Abstract: Table entailment, the binary classification task of finding if a sentence is supported or refuted by the content of a table, requires parsing language and table structure as well as numerical and discrete reasoning. While there is extensive work on textual entailment, table entailment is less well studied. We adapt TAPAS (Herzig et al., 2020), a table-based BERT model, to recognize entailment. Motivated by the benefits of data augmentation, we create a balanced dataset of millions of automatically created training examples which are learned in an intermediate step prior to fine-tuning. This new data is not only useful for table entailment, but also for SQA (Iyyer et al., 2017), a sequential table QA task. To be able to use long examples as input to BERT models, we evaluate table pruning techniques as a pre-processing step that drastically improves training and prediction efficiency at a moderate drop in accuracy. The different methods set the new state of the art on the TabFact (Chen et al., 2020) and SQA datasets.
DOI: 10.48550/arxiv.2010.00571
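
As a concrete illustration of the table-entailment task described in the abstract, the following is a minimal sketch that scores one statement against a small table using a TAPAS checkpoint fine-tuned on TabFact. It assumes the Hugging Face `transformers` and `pandas` packages and the publicly released `google/tapas-base-finetuned-tabfact` checkpoint; the checkpoint name, example table, and label convention (1 = supported) are assumptions for illustration, not details taken from this record.

```python
# Minimal sketch: table entailment with a TAPAS model fine-tuned on TabFact.
# Assumes Hugging Face `transformers` and `pandas`; the checkpoint name and the
# label convention (1 = supported, 0 = refuted) are assumptions, not taken from
# this record.
import pandas as pd
from transformers import TapasTokenizer, TapasForSequenceClassification

model_name = "google/tapas-base-finetuned-tabfact"  # assumed public checkpoint
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForSequenceClassification.from_pretrained(model_name)

# TAPAS expects the table as a pandas DataFrame with string-valued cells.
table = pd.DataFrame({"Player": ["Alice", "Bob"], "Points": ["12", "7"]})
statement = "Alice scored more points than Bob."

# Tokenize the (table, statement) pair and classify it as supported or refuted.
inputs = tokenizer(table=table, queries=[statement],
                   padding="max_length", return_tensors="pt")
logits = model(**inputs).logits
label = logits.argmax(-1).item()
print("supported" if label == 1 else "refuted")
```

Longer tables may exceed the model's input length, which is where the table pruning pre-processing mentioned in the abstract comes in: columns or cells with little relevance to the statement are dropped before tokenization.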