EURO: ESPnet Unsupervised ASR Open-source Toolkit
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | This paper describes the ESPnet Unsupervised ASR Open-source Toolkit (EURO),
an end-to-end open-source toolkit for unsupervised automatic speech recognition
(UASR). EURO adopts the state-of-the-art UASR learning method introduced by
Wav2vec-U, originally implemented in FAIRSEQ, which leverages self-supervised
speech representations and adversarial training. In addition to wav2vec2, EURO
extends the functionality and promotes reproducibility for UASR tasks by
integrating S3PRL and k2, resulting in flexible frontends from 27
self-supervised models and various graph-based decoding strategies. EURO is
implemented in ESPnet and follows its unified pipeline to provide UASR recipes
with a complete setup. This improves the pipeline's efficiency and allows EURO
to be easily applied to existing datasets in ESPnet. Extensive experiments on
three mainstream self-supervised models demonstrate the toolkit's effectiveness
and achieve state-of-the-art UASR performance on TIMIT and LibriSpeech
datasets. EURO will be publicly available at https://github.com/espnet/espnet,
aiming to promote this exciting and emerging research area based on UASR
through open-source activity. |
---|---|
DOI: | 10.48550/arxiv.2211.17196 |