DNN-Based Voice Activity Detection with Multi-Task Learning

Bibliographic Details
Published in: IEICE Transactions on Information and Systems, 2016/02/01, Vol. E99.D(2), pp. 550-553
Main authors: KANG, Tae Gyoon; KIM, Nam Soo
Format: Article
Language: English
Description
Summary: Recently, notable improvements in the voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN), which learns the mapping between noisy speech features and the corresponding voice activity status through its deep hidden structure, has been one of the most popular. In this letter, we propose a novel approach that enhances the robustness of the DNN in mismatched noise conditions with a multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is trained jointly with the conventional VAD task. The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm.
ISSN: 0916-8532, 1745-1361
DOI: 10.1587/transinf.2015EDL8168
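
Illustration: The joint training described in the summary can be pictured as one network with shared hidden layers and two output heads, one predicting voice activity and one estimating enhanced (clean) speech features. The following is only a minimal sketch of that idea, not the authors' implementation. The layer sizes, the single tanh hidden layer, the squared-error enhancement loss, the weight `lam`, and all function names are assumptions made for illustration; this record does not state the letter's actual architecture, features, or loss weighting.

```python
import numpy as np


def init_params(n_in, n_hidden, rng):
    """Randomly initialise one shared hidden layer and two task heads (assumed sizes)."""
    scale = 0.1
    return {
        "W_shared": rng.normal(0.0, scale, (n_in, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        "W_vad": rng.normal(0.0, scale, (n_hidden, 1)),     # VAD head
        "b_vad": np.zeros(1),
        "W_enh": rng.normal(0.0, scale, (n_hidden, n_in)),  # enhancement head
        "b_enh": np.zeros(n_in),
    }


def forward(params, noisy):
    """Map noisy features to (speech probability, enhanced feature estimate)."""
    hidden = np.tanh(noisy @ params["W_shared"] + params["b_shared"])
    vad_logit = (hidden @ params["W_vad"] + params["b_vad"])[:, 0]
    vad_prob = 1.0 / (1.0 + np.exp(-vad_logit))
    enhanced = hidden @ params["W_enh"] + params["b_enh"]
    return vad_prob, enhanced


def mtl_loss(params, noisy, clean, labels, lam=0.5):
    """Joint loss: VAD cross-entropy plus lam times the enhancement squared error.

    lam is a hypothetical task weight; lam = 0 reduces to single-task VAD training.
    """
    vad_prob, enhanced = forward(params, noisy)
    eps = 1e-12  # avoids log(0)
    bce = -np.mean(
        labels * np.log(vad_prob + eps)
        + (1.0 - labels) * np.log(1.0 - vad_prob + eps)
    )
    mse = np.mean((enhanced - clean) ** 2)
    return bce + lam * mse, bce, mse


# Toy data: 8 frames of 20-dimensional features (random, for illustration only).
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 20))
noisy = clean + 0.3 * rng.normal(size=(8, 20))
labels = rng.integers(0, 2, size=8).astype(float)  # 1 = speech, 0 = non-speech

params = init_params(n_in=20, n_hidden=16, rng=rng)
total, bce, mse = mtl_loss(params, noisy, clean, labels, lam=0.5)
print(f"total={total:.4f}  vad_bce={bce:.4f}  enhancement_mse={mse:.4f}")
```

In a sketch like this, both heads back-propagate into the same shared weights during training, so the enhancement target acts as an auxiliary signal for the hidden representation; at test time only the VAD head would be needed.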