Voice Signal Processing System
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Audio data is classified using voice descriptors (i.e. a desired style or voice quality) by extracting non-semantic speech features (e.g. pitch, tone, volume, facial expression, gestures, body language) from vocal samples, generating a reduced representation of each vocal sample, and then applying a trained model to predict the voice descriptor (e.g. warm, professional, serious, conversational, direct, clear, friendly, natural, sensual, trustworthy, formal; see fig. 8). A second trained model may be applied to an intermediate representation based on a vocal profile (e.g. vocal tract, muscular tension or phonetic features). |
---|
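
The three steps named in the abstract (feature extraction, reduced representation, trained model) can be illustrated with a minimal Python sketch. Nothing below is taken from the patent itself: the function names, the two features (RMS volume and a zero-crossing pitch estimate), the frame averaging, the nearest-centroid classifier, and the synthetic sine-wave "voices" with their labels are all assumptions chosen only to keep the example self-contained.

```python
import math

def frame_features(frame, sample_rate):
    """Two non-semantic features for one frame: RMS volume and a rough pitch (Hz)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    # Count sign changes; a pure tone crosses zero about twice per cycle.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    pitch_hz = crossings * sample_rate / (2 * len(frame))
    return (rms, pitch_hz)

def reduced_representation(samples, sample_rate, frame_len=400):
    """Average the per-frame features into one short vector (assumed reduction step).

    Assumes the sample holds at least one full frame.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    feats = [frame_features(f, sample_rate) for f in frames]
    return tuple(sum(f[k] for f in feats) / len(feats) for k in range(2))

def train_centroids(labelled):
    """'Trained model' stand-in: one mean vector per voice descriptor."""
    sums = {}
    for vec, label in labelled:
        total, count = sums.get(label, ([0.0] * len(vec), 0))
        sums[label] = ([t + v for t, v in zip(total, vec)], count + 1)
    return {label: tuple(t / c for t in total) for label, (total, c) in sums.items()}

def predict(centroids, vec):
    """Return the descriptor whose centroid is closest (unscaled Euclidean distance)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], vec))
    return min(centroids, key=sq_dist)

def tone(freq_hz, amplitude, sample_rate=8000, seconds=0.5):
    """Synthetic stand-in for a vocal sample: a plain sine wave."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

RATE = 8000
# Hypothetical training data: the descriptor labels are invented for this toy.
training = [
    (reduced_representation(tone(110, 0.3), RATE), "warm"),
    (reduced_representation(tone(130, 0.3), RATE), "warm"),
    (reduced_representation(tone(200, 0.8), RATE), "direct"),
    (reduced_representation(tone(240, 0.8), RATE), "direct"),
]
model = train_centroids(training)

print(predict(model, reduced_representation(tone(125, 0.35), RATE)))
print(predict(model, reduced_representation(tone(210, 0.7), RATE)))
```

In this toy the distance is dominated by the pitch feature because the features are not normalised; a practical system would presumably scale them and use far richer features and models than the ones assumed here.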