Speaker Anonymization Using X-vector and Neural Waveform Models
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | The social media revolution has produced a plethora of web services to which
users can easily upload and share multimedia documents. Despite the popularity
and convenience of such services, the sharing of such inherently personal data,
including speech data, raises obvious security and privacy concerns. In
particular, a user's speech data may be acquired and used with speech synthesis
systems to produce high-quality speech utterances which reflect the same user's
speaker identity. These utterances may then be used to attack speaker
verification systems. One way to mitigate these concerns is to conceal
speaker identities before speech data is shared. For this
purpose, we present a new approach to speaker anonymization. The idea is to
extract linguistic and speaker identity features from an utterance and then to
use these with neural acoustic and waveform models to synthesize anonymized
speech. The original speaker identity, in the form of timbre, is suppressed and
replaced with that of an anonymous pseudo identity. The approach exploits
state-of-the-art x-vector speaker representations. These are used to derive
anonymized pseudo speaker identities through the combination of multiple,
random speaker x-vectors. Experimental results show that the proposed approach
is effective in concealing speaker identities. It increases the equal error
rate of a speaker verification system while maintaining high-quality
anonymized speech. |
---|---|
DOI: | 10.48550/arxiv.1905.13561 |
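
The summary says that pseudo speaker identities are derived by combining multiple random speaker x-vectors. As a rough illustration only, the sketch below averages a random subset of vectors from a pool and L2-normalizes the result. Averaging is an assumed choice of "combination"; the function names, the pool size, the 16-dimensional toy vectors, and the normalization step are all hypothetical and not taken from the record. Real x-vectors would come from a trained speaker embedding extractor.

```python
import math
import random


def pseudo_xvector(pool, n_select, rng):
    """Combine n_select randomly chosen x-vectors from pool into one pseudo x-vector.

    Assumption: "combination" is implemented as an element-wise mean
    followed by L2 normalization. Other combination rules are possible.
    """
    if not 0 < n_select <= len(pool):
        raise ValueError("n_select must be between 1 and len(pool)")
    chosen = rng.sample(pool, n_select)  # random subset, without replacement
    dim = len(chosen[0])
    mean = [sum(vec[i] for vec in chosen) / n_select for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-in for a pool of speaker x-vectors (hypothetical: 50 speakers, 16 dims).
rng = random.Random(0)
pool = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(50)]

# The "original" speaker is simply the first pool entry in this toy setup.
original = pool[0]
pseudo = pseudo_xvector(pool, n_select=10, rng=rng)

print("similarity between original and pseudo identity:",
      round(cosine_similarity(original, pseudo), 3))
```

In a full system the pseudo x-vector would replace the original speaker's x-vector as the speaker input to the acoustic and waveform models, while the linguistic features extracted from the utterance stay unchanged.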