BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Packet loss is a common and unavoidable problem in voice over Internet Protocol
(VoIP) systems. To address it, we propose a band-split packet loss
concealment network (BS-PLCNet). Specifically, we split the full-band signal
into a wide band (0-8 kHz) and a high band (8-24 kHz). The wide-band signal is
processed by a gated convolutional recurrent network (GCRN), while the
high-band counterpart is processed by a simple GRU network. To ensure high
speech quality and automatic speech recognition (ASR) compatibility, a multi-task
learning (MTL) framework including fundamental frequency (f0) prediction and
linguistic awareness, along with multi-discriminators, is used. The proposed approach
tied for 1st place in the ICASSP 2024 PLC Challenge. |
---|---|
DOI: | 10.48550/arxiv.2401.03687 |
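
The band split described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 48 kHz sampling rate (implied by the 24 kHz upper band edge), a hypothetical 20 ms frame of 960 samples, and a split performed directly on FFT bins. The names `split_bands` and `merge_bands` are illustrative only, and the record does not say how the paper actually performs the split.

```python
import numpy as np

SAMPLE_RATE = 48_000  # assumption: 24 kHz upper edge implies 48 kHz sampling
FRAME_LEN = 960       # assumption: hypothetical 20 ms analysis frame
SPLIT_HZ = 8_000      # band edge taken from the summary


def split_bands(frame):
    """Return (wide_band_bins, high_band_bins) of one frame's spectrum."""
    spectrum = np.fft.rfft(frame, n=FRAME_LEN)
    freqs = np.fft.rfftfreq(FRAME_LEN, d=1.0 / SAMPLE_RATE)
    wide_mask = freqs < SPLIT_HZ  # bins below 8 kHz form the wide band
    return spectrum[wide_mask], spectrum[~wide_mask]


def merge_bands(wide, high):
    """Concatenate both bands and return the time-domain frame."""
    return np.fft.irfft(np.concatenate([wide, high]), n=FRAME_LEN)


rng = np.random.default_rng(0)
frame = rng.standard_normal(FRAME_LEN)
wide, high = split_bands(frame)
restored = merge_bands(wide, high)
```

In a system of the kind the summary describes, each band would be passed to its own network before merging; here the bands are merged unchanged, so `restored` matches `frame` up to floating-point error.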