Metric Learning for User-defined Keyword Spotting
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | The goal of this work is to detect new spoken terms defined by users. While
most previous works address Keyword Spotting (KWS) as a closed-set
classification problem, this limits their transferability to unseen terms. The
ability to define custom keywords has advantages in terms of user experience.
In this paper, we propose a metric learning-based training strategy for
user-defined keyword spotting. In particular, we make the following
contributions: (1) we construct a large-scale keyword dataset with an existing
speech corpus and propose a filtering method to remove data that degrade model
training; (2) we propose a metric learning-based two-stage training strategy,
and demonstrate that the proposed method improves the performance on the
user-defined keyword spotting task by enriching the keyword representations; (3) to
facilitate fair comparison in the user-defined KWS field, we propose a
unified evaluation protocol and metrics.
Our proposed system does not require incremental training on the
user-defined keywords, and outperforms previous works by a significant margin
on the Google Speech Commands dataset under both the proposed and the
existing metrics. |
---|---|
DOI: | 10.48550/arxiv.2211.00439 |
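
The abstract describes a system that handles user-defined keywords without incremental training. One common way such a system can work is to compare embeddings by similarity against a few enrolled examples. The sketch below illustrates that general idea only and is not the authors' implementation: the embeddings are hand-written toy vectors standing in for the output of a trained encoder, and the function names (`enroll`, `detect`), the keywords, and the `0.7` threshold are all hypothetical choices made for this example.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def enroll(examples: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """Build one prototype per keyword by averaging its normalized embeddings.

    No model weights are updated here: enrolling a new keyword only stores
    a vector (hypothetical design, assumed for this sketch).
    """
    prototypes: Dict[str, List[float]] = {}
    for keyword, vectors in examples.items():
        normed = [l2_normalize(v) for v in vectors]
        dim = len(normed[0])
        mean = [sum(v[i] for v in normed) / len(normed) for i in range(dim)]
        prototypes[keyword] = l2_normalize(mean)
    return prototypes


def detect(
    query: Vector,
    prototypes: Dict[str, List[float]],
    threshold: float = 0.7,  # assumed value; would be tuned on held-out data
) -> Optional[str]:
    """Return the best-matching enrolled keyword, or None if no score
    reaches the threshold (the query is treated as a non-keyword)."""
    best_keyword, best_score = None, -1.0
    for keyword, proto in prototypes.items():
        score = cosine(query, proto)
        if score > best_score:
            best_keyword, best_score = keyword, score
    return best_keyword if best_score >= threshold else None


# Toy 3-dimensional embeddings; a real encoder would produce these from audio.
prototypes = enroll({
    "lumos": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    "nox": [[0.0, 1.0, 0.1]],
})
print(detect([1.0, 0.05, 0.05], prototypes))
print(detect([0.0, 0.0, 1.0], prototypes))
```

In this sketch, adding a keyword means storing one more prototype, which is why no retraining step appears. How well such a scheme separates unseen keywords depends entirely on the embedding space, which is what a metric learning objective is meant to shape.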