CUT AND PASTE SPOOFING DETECTION USING DYNAMIC TIME WARPING

Bibliographic details
Main authors:	GARCIA GOMAR, MARTA; VILLALBA LOPEZ, JESUS ANTONIO; VARELA REDONDO, SARA; LLEIDA SOLANO, EDUARDO; ORTEGA GIMENEZ, ALFONSO
Format:	Patent
Language:	English; French
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Abstract:	The invention relates to a method for comparing voice utterances, the method comprising the steps: extracting a plurality of features (201) from a first voice utterance of a given text sample and extracting a plurality of features (201) from a second voice utterance of said given text sample, wherein each feature is extracted as a function of time, and wherein each feature of the second voice utterance corresponds to a feature of the first voice utterance; applying dynamic time warping (202) to one or more time-dependent characteristics of the first and/or second voice utterance, e.g. by minimizing one or more distance measures, wherein a distance measure is a measure of the difference between a time-dependent characteristic of the first voice utterance and a corresponding time-dependent characteristic of the second voice utterance, and wherein a time-dependent characteristic of a voice utterance is a time-dependent characteristic of either a single feature or a combination of two or more features; calculating a total distance measure (203), wherein the total distance measure is a measure of the difference between the first voice utterance of the given text sample and the second voice utterance of said given text sample, wherein the total distance measure is calculated based on one or more pairs of said time-dependent characteristics, and wherein a pair of time-dependent characteristics is composed either of a time-dependent characteristic of the first or second voice utterance and a dynamically time warped (202) time-dependent characteristic of the second or first voice utterance, respectively, or of a dynamically time warped (202) time-dependent characteristic of the first voice utterance and a dynamically time warped (202) time-dependent characteristic of the second voice utterance.
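
The three steps named in the abstract (feature extraction, dynamic time warping, total distance) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the features have already been extracted as plain lists of numbers over time, uses the absolute difference as the local distance, warps each feature separately, and averages the path-normalized costs into one total. The names `dtw_path` and `total_distance` and the feature labels `pitch` and `energy` are hypothetical and do not come from the record.

```python
from math import inf


def dtw_path(a, b):
    """Return (cost, path) for the minimum-cost alignment of a and b.

    a and b are lists of numbers, each one time-dependent characteristic.
    The local distance is the absolute difference (an assumption).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def total_distance(utt1, utt2):
    """Average the path-normalized DTW cost over all shared features.

    utt1 and utt2 map a feature name to its values over time; every
    feature of utt1 must have a corresponding feature in utt2.
    """
    per_feature = []
    for name in utt1:
        cost, path = dtw_path(utt1[name], utt2[name])
        per_feature.append(cost / len(path))
    return sum(per_feature) / len(per_feature)


# Toy data: the second utterance is a time-stretched copy of the first.
first = {"pitch": [1, 2, 3, 4], "energy": [0, 1, 0]}
second = {"pitch": [1, 1, 2, 3, 3, 4], "energy": [0, 0, 1, 1, 0]}
print(total_distance(first, second))
```

Dividing each cost by the path length keeps long and short utterances comparable. How a total distance would then be turned into a spoofing decision, for example by a threshold, is not stated in the abstract and is left out here.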