Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers

Automatic transcription of meetings requires handling of overlapped speech, which calls for continuous speech separation (CSS) systems. The uPIT criterion was proposed for utterance-level separation with neural networks and introduces the constraint that the total number of speakers must not exceed...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2021-09
Hauptverfasser: Thilo von Neumann, Kinoshita, Keisuke, Boeddeker, Christoph, Delcroix, Marc, Haeb-Umbach, Reinhold
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!