Gaussian Flow Bridges for Audio Domain Transfer with Unpaired Data
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Audio domain transfer is the process of modifying audio signals to match
characteristics of a different domain, while retaining the original content.
This paper investigates the potential of Gaussian Flow Bridges, an emerging
approach in generative modeling, for this problem. The presented framework
addresses the transport problem across different distributions of audio signals
through the implementation of a series of two deterministic probability flows.
The proposed framework facilitates manipulation of the target distribution
properties through a continuous control variable, which defines a certain
aspect of the target domain. Notably, this approach does not rely on paired
examples for training. To address identified challenges in keeping the
speech content consistent, we recommend a training strategy that incorporates
chunk-based minibatch Optimal Transport couplings of data samples and noise.
Comparing our unsupervised method with established baselines, we find
competitive performance in tasks of reverberation and distortion manipulation.
Despite encountering limitations, the intriguing results obtained in this study
underscore potential for further exploration. |
---|---|
DOI: | 10.48550/arxiv.2405.19497 |
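
The summary mentions chunk-based minibatch Optimal Transport couplings of data samples and noise. The following is a minimal sketch of that general idea, not the authors' implementation: a signal is split into fixed-length chunks, and each chunk is paired with one Gaussian noise chunk so that the total squared distance over the minibatch is minimal. The function names (`split_into_chunks`, `minibatch_ot_pairing`), the chunk length, the squared Euclidean cost, and the brute-force search are all assumptions made for illustration. A practical version would likely use an assignment solver such as `scipy.optimize.linear_sum_assignment` instead of enumerating permutations.

```python
import itertools
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_into_chunks(signal, chunk_len):
    """Split a signal into non-overlapping chunks, dropping any remainder.

    Hypothetical helper: the chunking scheme is an assumption.
    """
    return [signal[i:i + chunk_len]
            for i in range(0, len(signal) - chunk_len + 1, chunk_len)]


def minibatch_ot_pairing(data_chunks, noise_chunks):
    """Pair data_chunks[i] with noise_chunks[perm[i]] at minimal total cost.

    With uniform weights and equal batch sizes, the optimal transport plan
    is a permutation, so an exhaustive search over permutations is exact.
    It costs O(n!) and is only usable for very small illustrative batches.
    """
    n = len(data_chunks)
    if n != len(noise_chunks):
        raise ValueError("data and noise minibatches must have equal size")
    cost = [[squared_distance(d, z) for z in noise_chunks] for d in data_chunks]
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Toy data: a random "signal" stands in for audio (assumption).
rng = random.Random(0)
signal = [rng.uniform(-1.0, 1.0) for _ in range(24)]
data_chunks = split_into_chunks(signal, chunk_len=4)  # 6 chunks of 4 samples
noise_chunks = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in data_chunks]

pairing, ot_cost = minibatch_ot_pairing(data_chunks, noise_chunks)

# Cost of the naive pairing (chunk i with noise i), for comparison.
identity_cost = sum(squared_distance(d, z)
                    for d, z in zip(data_chunks, noise_chunks))
```

Here `pairing[i]` gives the index of the noise chunk assigned to data chunk `i`, and `ot_cost` is never larger than `identity_cost`, since the naive pairing is one of the candidates searched.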