CUPVC: A Constraint-based Unsupervised Prosody Transfer for Improving Telephone Banking Services
Published in: | IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023-01, Vol.31, p.1-12 |
---|---|
Format: | Article |
Language: | English |
Abstract: | Low efficiency in telephone banking services reduces customer satisfaction, so some recent studies have applied voice conversion models to improve these services. Building such a model raises three major challenges, because practical telephone banking services require natural, high-quality conversations: the lack of parallel speech data, the difficulty of generating natural speech, and the difficulty of modeling long speech. To tackle these challenges, we propose a novel, theoretically grounded unsupervised prosody transfer model for improving customer satisfaction in telephone conversations. Our model consists of a solo-encoding disentanglement module and a forge module. (i) The disentanglement module uses three unique constraints to reduce manual feature engineering and training costs and to decompose extremely long speech without parallel data. (ii) The forge module converts the source prosody to the target prosody and guarantees correct fine-grained alignments, thereby generating natural speech. Extensive experiments on large-scale telephone recordings from XWbank in China suggest that our model achieves promising outcomes. We also open-source our code and datasets on GitHub. |
---|---|
ISSN: | 2329-9290 2329-9304 |
DOI: | 10.1109/TASLP.2023.3293042 |