A Single-Input/Binaural-Output Antiphasic Speech Enhancement Method for Speech Intelligibility Improvement
Published in: | IEEE Signal Processing Letters, 2021, Vol. 28, pp. 1445-1449 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Improving the intelligibility of a speech signal of interest from single-microphone observations corrupted by additive noise has long been a challenging problem. Motivated by findings from psychoacoustics, we propose in this work a deep-learning-based method that renders the noise and the desired speech in the perceptual space such that the perception of the desired speech is least affected by the noise. Specifically, we adopt a temporal convolutional network (TCN) based structure to map the single-channel noisy observations into two binaural signals, one for the left ear and the other for the right ear. The TCN is trained so that the desired speech and the noise are perceived to come from opposite directions when the listener listens to the binaural signals. This antiphasic binaural presentation enables the listener to better distinguish the desired speech from the noise, which improves speech intelligibility. The modified rhyme test is used for evaluation, and the results support the superiority of the proposed method for speech intelligibility improvement. |
---|---|
ISSN: | 1070-9908, 1558-2361 |
DOI: | 10.1109/LSP.2021.3095016 |
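
The abstract describes an antiphasic binaural presentation but does not give the exact target signals. As a minimal sketch of the general idea, the example below builds a left/right pair in which the speech is in phase at both ears while the noise is phase-inverted between them, a classical antiphasic configuration from psychoacoustics. This is an assumption for illustration, not the paper's confirmed training target. The function name `antiphasic_pair` and the synthetic signals are hypothetical, and the construction uses separately known speech and noise, which is only possible when preparing training data, not at inference time from a single noisy channel.

```python
import numpy as np


def antiphasic_pair(speech, noise):
    """Build (left, right) ear signals: speech in phase, noise phase-inverted.

    Assumed configuration for illustration only; the record does not
    state the exact binaural targets used in the paper.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    left = speech + noise   # left ear: speech plus noise
    right = speech - noise  # right ear: same speech, sign-flipped noise
    return left, right


# Synthetic stand-ins: a tone for the "speech" and white noise for the "noise".
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220.0 * t)
noise = np.random.default_rng(0).standard_normal(t.size)

left, right = antiphasic_pair(speech, noise)

# The sum of the two ear signals keeps only the in-phase (speech) part,
# and their difference keeps only the antiphasic (noise) part.
recovered_speech = 0.5 * (left + right)
recovered_noise = 0.5 * (left - right)
```

In this sketch the two components differ in their interaural phase relationship, which is the kind of cue that the abstract says helps the listener separate the desired speech from the noise. A network such as the TCN mentioned above would have to produce a comparable pair from the noisy mixture alone.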