Active Learning for Natural Language Generation
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The field of Natural Language Generation (NLG) suffers from a severe shortage
of labeled data due to the extremely expensive and time-consuming process
involved in manual annotation. A natural approach for coping with this problem
is active learning (AL), a well-known machine learning technique for improving
annotation efficiency by selectively choosing the most informative examples to
label. However, while AL has been well-researched in the context of text
classification, its application to NLG remains largely unexplored. In this
paper, we present a first systematic study of active learning for NLG,
considering a diverse set of tasks and multiple leading selection strategies,
and harnessing a strong instruction-tuned model. Our results indicate that the
performance of existing AL strategies is inconsistent, surpassing the baseline
of random example selection in some cases but not in others. We highlight some
notable differences between the classification and generation scenarios, and
analyze the selection behaviors of existing AL strategies. Our findings
motivate exploring novel approaches for applying AL to generation tasks. |
---|---|
DOI: | 10.48550/arxiv.2305.15040 |