H-Voice: Fake voice histograms (Imitation+DeepVoice)

Bibliographic details
Main author: Ballesteros L, Dora Maria
Format: Dataset
Language: eng
Description
Summary: This dataset consists of 6672 histograms of original voice recordings and fake voice recordings obtained by Imitation [1, 2] and Deep Voice [3]. The histograms can be used to train a machine learning system to classify original and fake voice recordings obtained with the Imitation and Deep Voice algorithms.

Note: corrupted images have been fixed.

The dataset is organized into the following directories:
Training_fake: 2088 histograms of fake voice recordings (2016 with Imitation and 72 with Deep Voice)
Training_original: 2020 histograms of original voice recordings
Validation_fake: 864 histograms of fake voice recordings (all with Imitation)
Validation_original: 864 histograms of original voice recordings
External_test1: 760 histograms (380 original + 380 fake with Imitation)
External_test2: 76 histograms (4 original + 72 fake with Deep Voice)

References:
[1] D.M. Ballesteros L, J.M. Moreno A. Highly transparent steganography model of speech signals using Efficient Wavelet Masking. Expert Systems with Applications 39 (10), 2012, 9141-9149. https://doi.org/10.1016/j.eswa.2012.02.066
[2] D.M. Ballesteros L, J.M. Moreno A. On the ability of adaptation of speech signals and data hiding. Expert Systems with Applications 39 (16), 2012, 12574-12579. https://doi.org/10.1016/j.eswa.2012.05.027
[3] S.O. Arik, M. Chrzanowski, A. Coates, G. Diamos, A. Gibiansky, Y. Kang, X. Li, J. Miller, A. Ng, J. Raiman, S. Sengupta, M. Shoeybi. Deep Voice: Real-time Neural Text-to-Speech. 2017. https://arxiv.org/abs/1702.07825
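
The directory counts given in the summary can be cross-checked against a downloaded copy of the dataset. The following is a minimal sketch, not part of the dataset itself. It assumes that the six directories sit directly under a local root folder and contain only histogram files; the helper names `count_files` and `find_mismatches` are hypothetical, and the file format of the histograms is not assumed.

```python
from pathlib import Path

# Histogram counts per directory, copied from the dataset summary.
EXPECTED_COUNTS = {
    "Training_fake": 2088,
    "Training_original": 2020,
    "Validation_fake": 864,
    "Validation_original": 864,
    "External_test1": 760,
    "External_test2": 76,
}


def count_files(root):
    """Count regular files (recursively) in each expected directory under root.

    Assumption: every directory is a direct child of `root`. A missing
    directory is reported with a count of 0.
    """
    root = Path(root)
    counts = {}
    for name in EXPECTED_COUNTS:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            counts[name] = 0
    return counts


def find_mismatches(counts):
    """Return {directory: (found, expected)} for every count that differs."""
    return {
        name: (counts.get(name, 0), expected)
        for name, expected in EXPECTED_COUNTS.items()
        if counts.get(name, 0) != expected
    }


# The per-directory figures add up to the total stated in the summary.
assert sum(EXPECTED_COUNTS.values()) == 6672
```

Calling `find_mismatches(count_files(root))` with the path of the extracted dataset returns an empty dictionary when every directory holds the stated number of files. Extra files in a directory (for example a README) would show up as a mismatch, so the result is a hint rather than proof of corruption.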
DOI: 10.17632/k47yd3m28w