Exemplar-based voice conversion using joint nonnegative matrix factorization
Exemplar-based sparse representation is a nonparametric framework for voice conversion. In this framework, a target spectrum is generated as a weighted linear combination of a set of basis spectra, namely exemplars, extracted from the training data. This framework adopts coupled source-target dictio...
Gespeichert in:
Veröffentlicht in: | Multimedia tools and applications 2015-11, Vol.74 (22), p.9943-9958 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Exemplar-based sparse representation is a nonparametric framework for voice conversion. In this framework, a target spectrum is generated as a weighted linear combination of a set of basis spectra, namely exemplars, extracted from the training data. This framework adopts coupled source-target dictionaries consisting of acoustically aligned source-target exemplars, and assumes they can share the same activation matrix. At runtime, a source spectrogram is factorized as a product of the source dictionary and the common activation matrix, which is applied to the target dictionary to generate the target spectrogram. In practice, either low-resolution mel-scale filter bank energies or high-resolution spectra are adopted in the source dictionary. Low-resolution features are flexible in capturing the temporal information without increasing the computational cost and the memory occupation significantly, while high-resolution spectra contain significant spectral details. In this paper, we propose a joint nonnegative matrix factorization technique to find the common activation matrix using low- and high-resolution features at the same time. In this way, the common activation matrix is able to benefit from low- and high-resolution features directly. We conducted experiments on the VOICES database to evaluate the performance of the proposed method. Both objective and subjective evaluations confirmed the effectiveness of the proposed methods. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-014-2180-2 |