Single-channel speech enhancement based on joint constrained dictionary learning

Bibliographic details
Published in: EURASIP Journal on Audio, Speech, and Music Processing, 2021-07, Vol. 2021 (1), p. 1-14, Article 29
Main authors: Sun, Linhui, Bu, Yunyi, Li, Pingan, Wu, Zihao
Format: Article
Language: English
Description
Abstract: To improve speech enhancement performance in complex noise environments, a joint constrained dictionary learning method for single-channel speech enhancement is proposed, which solves the “cross projection” problem of signals in the joint dictionary. In this method, the new optimization function constrains the sparse representation of the noisy signal in the joint dictionary, controls the projection error of the speech and noise signals on their corresponding sub-dictionaries, and minimizes both the cross projection error and the correlation between the sub-dictionaries. In addition, adjustment factors are introduced to balance the weights of the constraint terms, yielding a more discriminative joint dictionary. When the method is applied to single-channel speech enhancement, more of the speech component of the noisy signal is projected onto the clean speech sub-dictionary of the joint dictionary, without being affected by the noise sub-dictionary, which improves the quality and intelligibility of the enhanced speech. The experimental results verify that our algorithm outperforms the speech enhancement algorithm based on discriminative dictionary learning under white noise and colored noise environments in terms of time-domain waveform, spectrogram, global signal-to-noise ratio, subjective evaluation of speech quality, and logarithmic spectrum distance.
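The constraint terms named in the abstract can be sketched numerically. The following is a minimal illustration only, not the paper's actual objective: the exact formula is not given in this record, so the squared Frobenius-norm form of each term, the function names `joint_objective_terms` and `joint_objective`, and the weights `alpha`, `beta`, `gamma` (standing in for the "adjustment factors") are all assumptions. The sparsity constraint is assumed to be enforced by whatever sparse coder produces the coefficient matrices and is not part of the sum.

```python
import numpy as np


def fro2(M):
    """Squared Frobenius norm of a matrix."""
    return float(np.sum(np.asarray(M, dtype=float) ** 2))


def joint_objective_terms(S, N, Ds, Dn, A_s, A_n):
    """Evaluate illustrative constraint terms for a joint dictionary D = [Ds, Dn].

    S, N     : clean-speech and noise training features, shape (F, T)
    Ds, Dn   : speech and noise sub-dictionaries, shapes (F, Ks) and (F, Kn)
    A_s, A_n : codes of S and N over the joint dictionary, shape (Ks + Kn, T)
    All term definitions here are assumptions made for illustration.
    """
    Ks = Ds.shape[1]
    D = np.hstack([Ds, Dn])
    # Split each code into the rows that use Ds and the rows that use Dn.
    A_ss, A_sn = A_s[:Ks], A_s[Ks:]
    A_ns, A_nn = A_n[:Ks], A_n[Ks:]
    return {
        # Fit of the noisy mixture S + N in the joint dictionary.
        "joint_fit": fro2((S + N) - D @ (A_s + A_n)),
        # Each source should be explained by its own sub-dictionary.
        "self_projection": fro2(S - Ds @ A_ss) + fro2(N - Dn @ A_nn),
        # Energy a source puts on the other source's sub-dictionary.
        "cross_projection": fro2(Dn @ A_sn) + fro2(Ds @ A_ns),
        # Correlation between the two sub-dictionaries.
        "coherence": fro2(Ds.T @ Dn),
    }


def joint_objective(terms, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of the terms; alpha, beta, gamma are hypothetical weights."""
    return (terms["joint_fit"]
            + alpha * terms["self_projection"]
            + beta * terms["cross_projection"]
            + gamma * terms["coherence"])


# Toy example: orthogonal sub-dictionaries and perfectly separated codes.
Ds, Dn = np.eye(4)[:, :2], np.eye(4)[:, 2:]
A_s = np.vstack([np.array([[1.0, 2.0], [3.0, 4.0]]), np.zeros((2, 2))])
A_n = np.vstack([np.zeros((2, 2)), np.array([[5.0, 6.0], [7.0, 8.0]])])
S, N = Ds @ A_s[:2], Dn @ A_n[2:]
clean = joint_objective_terms(S, N, Ds, Dn, A_s, A_n)

# Let one speech coefficient leak onto a noise atom ("cross projection").
A_leaky = A_s.copy()
A_leaky[2, 0] = 1.0
leaky = joint_objective_terms(S, N, Ds, Dn, A_leaky, A_n)
```

In the toy example every term is zero when speech is coded only on `Ds` and noise only on `Dn`. Letting a speech coefficient leak onto a noise atom raises the cross-projection term, which is the effect the abstract says the joint constraints are meant to suppress.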
ISSN: 1687-4722
1687-4714
DOI: 10.1186/s13636-021-00218-3