Meta-SE: A Meta-Learning Framework for Few-Shot Speech Enhancement

Bibliographic details
Published in: IEEE Access 2021, Vol. 9, p. 46068-46078
Main authors: Zhou, Weili; Lu, Mingliang; Ji, Ruijie
Format: Article
Language: English
Description
Abstract: Separating target speech from a noisy signal is important for many real-world applications. Recently, deep neural networks (DNNs) have been widely used in speech enhancement (SE) and have yielded notable performance improvements. However, current deep models require a large amount of training data to perform well, and it remains challenging to build an effective deep speech enhancement model from only a few training samples. Meta-learning has become a research focus in few-shot learning because it can use prior meta-knowledge to handle new tasks quickly from few samples, but very few works apply meta-learning to few-shot speech enhancement. In this paper, we propose Meta-SE, a generic meta-learning framework that uses a U-Net as the meta-learner, to tackle the few-shot speech enhancement problem. Meta-SE is trained and optimized over varying speech enhancement tasks to acquire meta-knowledge, with the aim of generalizing quickly and well to new, unseen noises from few training samples. The experimental results show that the proposed method not only outperforms state-of-the-art DNN-SE models under few-shot conditions, but also learns a more general and flexible model for task adaptation.
ISSN:2169-3536
DOI:10.1109/ACCESS.2021.3066609