Is This a Bad Table? A Closer Look at the Evaluation of Table Generation from Text
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | eng |
| Subjects: | |
| Online access: | Order full text |
Abstract: Understanding whether a generated table is of good quality is important for using it to create or edit documents with automatic methods. In this work, we underline that existing measures for table quality evaluation fail to capture the overall semantics of tables, and sometimes unfairly penalize good tables and reward bad ones. We propose TabEval, a novel table evaluation strategy that captures table semantics by first breaking a table down into a list of natural-language atomic statements and then comparing them with ground-truth statements using entailment-based measures. To validate our approach, we curate a dataset comprising text descriptions for 1,250 diverse Wikipedia tables, covering a range of topics and structures, in contrast to the limited scope of existing datasets. We compare TabEval with existing metrics using unsupervised and supervised text-to-table generation methods, demonstrating its stronger correlation with human judgments of table quality across four datasets.
DOI: 10.48550/arxiv.2406.14829
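
The two-step recipe in the abstract (decompose a table into atomic statements, then score predicted statements against ground-truth statements with entailment) can be sketched roughly as below. This is an illustrative approximation, not the paper's implementation: `table_to_statements`, `overlap_entails`, and `tabeval_style_score` are hypothetical names, and a token-overlap stand-in is used where the paper would apply an NLI entailment model.

```python
from typing import Callable, List

def table_to_statements(header: List[str], rows: List[List[str]]) -> List[str]:
    """Flatten a table into one atomic statement per non-key cell,
    assuming the first column identifies each row."""
    stmts = []
    for row in rows:
        subject = row[0]
        for col, value in zip(header[1:], row[1:]):
            stmts.append(f"The {col} of {subject} is {value}.")
    return stmts

def overlap_entails(premise: str, hypothesis: str) -> float:
    """Placeholder entailment score via token overlap.
    Swap in a real NLI model's entailment probability here."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

def tabeval_style_score(
    pred: List[str],
    gold: List[str],
    entails: Callable[[str, str], float] = overlap_entails,
) -> float:
    """F1-style aggregate over best-matching statement pairs:
    precision checks each predicted statement against the gold set,
    recall checks each gold statement against the predictions."""
    if not pred or not gold:
        return 0.0
    precision = sum(max(entails(g, p) for g in gold) for p in pred) / len(pred)
    recall = sum(max(entails(p, g) for p in pred) for g in gold) / len(gold)
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# Hypothetical usage: a one-cell error lowers the score below 1.0.
header = ["Country", "Capital", "Population"]
gold = table_to_statements(header, [["France", "Paris", "68M"]])
pred = table_to_statements(header, [["France", "Paris", "67M"]])
print(round(tabeval_style_score(pred, gold), 3))
```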