Leveraging Non-Causal Knowledge via Cross-Network Knowledge Distillation for Real-Time Speech Enhancement
Saved in:
Published in: IEEE signal processing letters, 2024, Vol. 31, pp. 1129-1133
Main authors: , , ,
Format: Article
Language: English
Keywords:
Online access: Order full text
Summary: To improve real-time speech enhancement (SE) while maintaining efficiency, researchers have adopted knowledge distillation (KD). However, when the same network type as the real-time SE student model is used as a teacher model, the performance of the teacher model can be unsatisfactory, thereby limiting the effectiveness of KD. To overcome this limitation, we propose cross-network non-causal knowledge distillation (CNNC-Distill). CNNC-Distill enables knowledge transfer between networks of different types, allowing the use of a teacher model with a different network type than the real-time SE student model. To maximize the KD effect, a non-real-time SE model unconstrained by causality conditions is adopted as the teacher model. CNNC-Distill transfers the non-causal knowledge of the non-real-time SE teacher model to a real-time SE student model using feature and output distillation. We also introduce a time-domain network, RT-SENet, used as the real-time SE student model. Results on the Valentini dataset show the efficiency of RT-SENet and the significant performance improvement achieved by CNNC-Distill.
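The abstract describes distillation at two levels: intermediate features and final outputs of the non-causal teacher guide the real-time student. A minimal sketch of such a combined objective is shown below; the function name, the use of plain MSE for every term, and the weights `alpha`/`beta` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def cnnc_distill_loss(student_feats, teacher_feats,
                      student_out, teacher_out, clean_target,
                      alpha=1.0, beta=1.0):
    """Hypothetical combined objective for cross-network KD:
    an enhancement loss against the clean target, plus feature-level
    and output-level distillation terms from the non-causal teacher.
    All terms are MSE here for simplicity; the paper's exact losses
    and weighting may differ."""
    # Standard SE loss: student output vs. clean speech target.
    se_loss = np.mean((student_out - clean_target) ** 2)
    # Feature distillation: match each student feature map to the
    # corresponding teacher feature map (assumes matched shapes; in
    # practice a projection layer usually aligns dimensions).
    feat_loss = np.mean([np.mean((s - t) ** 2)
                         for s, t in zip(student_feats, teacher_feats)])
    # Output distillation: student output vs. teacher output.
    out_loss = np.mean((student_out - teacher_out) ** 2)
    return se_loss + alpha * feat_loss + beta * out_loss
```

When student and teacher agree and the student matches the clean target, the loss is zero; any mismatch in features or outputs contributes a positive penalty.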
ISSN: 1070-9908, 1558-2361
DOI: 10.1109/LSP.2024.3388956