Voice Signal Processing System

Bibliographic details
Main authors:	André Ferreira Gil, Matthias Eichner, Felix Schaeffler
Format:	Patent
Language:	eng
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	Audio data is classified by voice descriptor (i.e. a desired style or voice quality). Non-semantic speech features (e.g. pitch, tone, volume, facial expression, gestures, body language) are extracted from vocal samples, a reduced representation of each vocal sample is generated, and a trained model is then applied to predict the voice descriptor (e.g. warm, professional, serious, conversational, direct, clear, friendly, natural, sensual, trustworthy, formal; see fig. 8). A second trained model may be applied to an intermediate representation based on a vocal profile (e.g. vocal tract, muscular tension or phonetic features).
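
The three stages named in the summary (feature extraction, reduced representation, trained model) can be illustrated with a toy sketch. Nothing below comes from the patent itself: the function names, the choice of RMS volume and a zero-crossing pitch estimate as features, the per-sample mean as the reduced representation, and the nearest-centroid classifier are all assumptions made for illustration, and the synthetic tones and their "warm" and "direct" labels are invented training data.

```python
import math
from statistics import mean


def frame_features(samples, rate, frame_len=400):
    """Per-frame features: RMS volume and a crude zero-crossing pitch estimate (Hz)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        # A periodic signal crosses zero twice per cycle.
        pitch_hz = crossings * rate / (2 * frame_len)
        feats.append((rms, pitch_hz))
    return feats


def reduce_features(feats):
    """Reduced representation (assumed): the mean of each feature over all frames."""
    return tuple(mean(column) for column in zip(*feats))


def train_centroids(labelled):
    """'Training' (assumed): one mean vector per descriptor from (vector, label) pairs."""
    groups = {}
    for vector, label in labelled:
        groups.setdefault(label, []).append(vector)
    return {label: tuple(mean(column) for column in zip(*vectors))
            for label, vectors in groups.items()}


def predict_descriptor(centroids, vector):
    """Return the descriptor whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


def tone(freq_hz, amplitude, rate=8000, seconds=0.5):
    """Synthetic stand-in for a vocal sample: a plain sine tone."""
    n = int(rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


RATE = 8000
# Invented training data: the labels are arbitrary and only exercise the pipeline.
training = [
    (reduce_features(frame_features(tone(120, 0.3), RATE)), "warm"),
    (reduce_features(frame_features(tone(300, 0.8), RATE)), "direct"),
]
centroids = train_centroids(training)

query = reduce_features(frame_features(tone(130, 0.35), RATE))
predicted = predict_descriptor(centroids, query)
print(predicted)
```

The two features sit on very different scales here, so the pitch estimate dominates the distance; a practical system would normalise the features and use a learned model, neither of which the summary specifies.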