Active learning with semi-automatic annotation for extractive speech summarization

Bibliographic Details
Published in:	ACM Transactions on Speech and Language Processing, 2012-02, Vol. 8 (4), p. 1-25
Main Authors:	Zhang, Justin Jian; Fung, Pascale
Format:	Article
Language:	English
Subjects:	Criteria
Online Access:	Full text

Description
Summary:	We propose using active learning for extractive speech summarization in order to reduce human effort in generating reference summaries. Active learning chooses a selective set of samples to be labeled. We propose a combination of informativeness and representativeness criteria for selection. We further propose a semi-automatic method to generate reference summaries for presentation speech by using Relaxed Dynamic Time Warping (RDTW) alignment between presentation speech and its accompanying slides. Our summarization results show that the amount of labeled data needed for a given summarization accuracy can be reduced by more than 23% compared to random sampling.
ISSN:	1550-4875 1550-4883
DOI:	10.1145/2093153.2093155
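
Illustrative sketch:	The summary mentions selecting samples by a combination of informativeness and representativeness but does not give the formulas. The Python sketch below shows one common way such a combination can be set up; it is not the authors' method. Everything in it is an assumption made for illustration: binary entropy as the informativeness measure, mean cosine similarity to the rest of the unlabeled pool as the representativeness measure, the linear weight `lam`, and all function names and toy values.

```python
import math


def uncertainty(p):
    """Binary entropy (in bits) of a predicted summary-sentence probability p.

    Assumed informativeness measure: highest when the classifier is least sure.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def representativeness(i, vectors):
    """Mean cosine similarity of sample i to every other unlabeled sample.

    Assumed representativeness measure: high for samples in dense regions.
    """
    sims = [cosine(vectors[i], v) for j, v in enumerate(vectors) if j != i]
    return sum(sims) / len(sims) if sims else 0.0


def select_batch(probs, vectors, k, lam=0.5):
    """Return the indices of the k samples with the highest combined score.

    lam is a hypothetical trade-off weight: 1.0 uses uncertainty only,
    0.0 uses representativeness only.
    """
    scores = [
        lam * uncertainty(p) + (1 - lam) * representativeness(i, vectors)
        for i, p in enumerate(probs)
    ]
    ranked = sorted(range(len(probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy pool of four unlabeled sentences (made-up probabilities and features).
probs = [0.95, 0.50, 0.55, 0.10]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.1]]
print(select_batch(probs, vectors, k=2))  # → [1, 3]
```

In this toy pool, sentence 2 is nearly as uncertain as sentence 1 but is an outlier in feature space, so the combined score prefers sentence 3, which lies close to most of the pool. With `lam=1.0` the selection falls back to pure uncertainty sampling and returns sentences 1 and 2 instead.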