Bi-LSTM-attention Based on ACNN Model for Disfluency Detection

Bibliographic details
Published in: Journal of Physics: Conference Series, 2022-07, Vol. 2303 (1), p. 012018
Authors: Tian, Xin; Fang, Bei; He, Juhou; He, Xiuqing
Format: Article
Language: English
Online access: Full text
Abstract
Disfluencies are self-corrections in spontaneous speech, including filled pauses, repetitions, repairs, and false starts. The task of disfluency detection is to identify these disfluency phenomena in spoken language and make the transcript consistent with written text. Recent research has applied machine learning and deep learning approaches to disfluency recognition and classification. Long-distance dependency is also one of the core issues in disfluency detection. To solve this problem, most existing approaches combine many hand-crafted features with words as input; however, hand-crafting model features costs a great deal of time and effort. Some studies have reduced the dependence on hand-crafted features, but they ignore dependencies across long sentences. In this article, we treat disfluency detection as a sequence-labeling problem and apply Bi-LSTM and attention mechanisms to it. In particular, building on the rough-copy dependencies captured by the auto-correlational neural network (ACNN), we improve the ACNN model to handle long-term dependencies, so that dependencies between words are captured better. In other words, our method can not only find rough-copy relationships in sentences without additional hand-crafted features, but also capture dependency relationships across long sentences. Experiments on the commonly used English Switchboard test set show that our approach achieves good performance compared with previous models that use text only, without other hand-crafted features.
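
To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch of a Bi-LSTM sequence tagger with self-attention for disfluency detection. It illustrates the general approach only, not the authors' model: it omits the ACNN component entirely, and the class name, hyperparameters, and binary fluent/disfluent tag set are assumptions, not details taken from the paper.

```python
# Minimal sketch: Bi-LSTM + self-attention sequence labeling for
# disfluency detection. Illustrative only; the paper's actual model
# additionally builds on an ACNN, which is not reproduced here.
import torch
import torch.nn as nn


class BiLSTMAttentionTagger(nn.Module):  # hypothetical name
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_tags=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional LSTM supplies left and right context per token.
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2,
                            batch_first=True, bidirectional=True)
        # Self-attention lets each token attend to distant tokens,
        # approximating the long-distance dependencies the paper targets.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=1,
                                          batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_tags)

    def forward(self, token_ids):
        x = self.embed(token_ids)      # (batch, seq_len, embed_dim)
        h, _ = self.lstm(x)            # (batch, seq_len, hidden_dim)
        ctx, _ = self.attn(h, h, h)    # self-attention over the sequence
        return self.classifier(ctx)    # per-token tag logits


# Toy usage: tag each token as fluent (0) or disfluent (1).
model = BiLSTMAttentionTagger(vocab_size=10000)
tokens = torch.randint(0, 10000, (1, 12))  # one sentence of 12 token ids
logits = model(tokens)
print(logits.shape)                        # torch.Size([1, 12, 2])
```

Framing the task as sequence labeling, as the abstract does, means every token receives a tag, so disfluent spans can be removed afterwards to yield fluent, written-style text.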
ISSN: 1742-6588 (print); 1742-6596 (online)
DOI: 10.1088/1742-6596/2303/1/012018