Assessing Data Efficiency in Task-Oriented Semantic Parsing
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Data efficiency, despite being an attractive characteristic, is often
challenging to measure and optimize for in task-oriented semantic parsing;
unlike exact match, it can require both model- and domain-specific setups,
which have, historically, varied widely across experiments. In our work, as a
step towards providing a unified solution to data-efficiency-related questions,
we introduce a four-stage protocol which gives an approximate measure of how
much in-domain, "target" data a parser requires to achieve a certain quality
bar. Specifically, our protocol consists of (1) sampling target subsets of
different cardinalities, (2) fine-tuning parsers on each subset, (3) obtaining
a smooth curve relating target subset (%) vs. exact match (%), and (4)
referencing the curve to mine ad-hoc (target subset, exact match) points. We
apply our protocol in two real-world case studies -- model generalizability and
intent complexity -- illustrating its flexibility and applicability to
practitioners in task-oriented semantic parsing. |
---|---|
DOI: | 10.48550/arxiv.2107.04736 |
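
The four stages named in the summary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names are hypothetical, the log-linear fit `em = a + b * ln(pct)` is just one possible choice of smooth curve (the record does not say which smoothing the paper uses), and `toy_score` is a stand-in for actually fine-tuning a parser and evaluating exact match.

```python
import math
import random


def sample_subsets(target_data, percents, seed=0):
    """Stage 1: draw one random subset of the target data per percentage."""
    rng = random.Random(seed)
    subsets = {}
    for pct in percents:
        k = max(1, round(len(target_data) * pct / 100))
        subsets[pct] = rng.sample(target_data, k)
    return subsets


def measure_points(subsets, fine_tune_and_score):
    """Stage 2: fine-tune on each subset and record exact match (%).

    `fine_tune_and_score` is a hypothetical callable: subset -> exact match.
    """
    return [(pct, fine_tune_and_score(s)) for pct, s in sorted(subsets.items())]


def fit_log_curve(points):
    """Stage 3: least-squares fit of em = a + b * ln(pct) (assumed curve form)."""
    xs = [math.log(pct) for pct, _ in points]
    ys = [em for _, em in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return a, b


def em_at(curve, pct):
    """Stage 4a: read exact match (%) off the curve at a given subset size (%)."""
    a, b = curve
    return a + b * math.log(pct)


def pct_for(curve, em):
    """Stage 4b: invert the curve to get the subset size (%) for a quality bar."""
    a, b = curve
    return math.exp((em - a) / b)


def toy_score(subset):
    # Toy stand-in for real fine-tuning: quality grows with ln(subset size).
    return 20.0 + 8.0 * math.log(len(subset))


target_data = list(range(1000))  # placeholder for in-domain "target" examples
subsets = sample_subsets(target_data, [1, 5, 10, 25, 50, 100])
points = measure_points(subsets, toy_score)
curve = fit_log_curve(points)
needed_pct = pct_for(curve, 60.0)  # approx. share of target data for 60% EM
```

In a real study, `toy_score` would be replaced by a fine-tuning and evaluation run, and the fitted curve would only be trusted within the range of subset sizes that were actually measured.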