METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE INPUT PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM

The disclosure relates to a method and device for detecting an audio adversarial attack with respect to a voice input (VI) processed by an automatic speech recognition system (ASR). The method includes: obtaining (12) a transcript (T) resulting from the processing, by the automatic speech recognitio...

Detailed description

Bibliographic details
Main authors:	GAUTIER, Eric ; NADEAU, Pascal ; DELAUNAY, Christophe ; GILBERTON, Philippe
Format:	Patent
Language:	eng ; fre
Subjects:	ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION
Online access:	Order full text

creator	GAUTIER, Eric ; NADEAU, Pascal ; DELAUNAY, Christophe ; GILBERTON, Philippe
description	The disclosure relates to a method and device for detecting an audio adversarial attack with respect to a voice input (VI) processed by an automatic speech recognition system (ASR). The method includes: obtaining (12) a transcript (T) resulting from the processing, by the automatic speech recognition system, of an input audio signal of a voice input; converting (13) the transcript (T) into a synthesized audio signal (SAS); extracting (15, 15'), from the input audio signal and from the synthesized audio signal, acoustic features and converting them into sequences of feature vectors (sFV1, sFV2); computing (17) a dynamic time warping distance (D) between the sequences of converted feature vectors; and delivering (18) a piece of data representative of a detection of an audio adversarial attack, as a function of a result of a comparison between the dynamic time warping distance and a predetermined threshold.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_WO2022083969A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>WO2022083969A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_WO2022083969A13</originalsourceid><addsrcrecordid>eNqNjcFKA0EMhvfiQdR3CHi1UFuQ9phO0t1gZ7Jkslt6KkXWk2ihvquv4wyKZ08_JN___dfNV2TvlGCrBsTOwSW1gAlwIFFAGtkymuAO0B3DM-zFOzDOfWHBCwKjSmCQ1A8OvWngnJlgc_jRuEZ0CVAKHGozaJvERRPkQ3aODxDUqlAT1XHisfjqNRYjW3W2hrEmDWUUE_09Z8ZIuNkxBDSTQkcmGeJtc_V6ertMd79509xv2UM3m84fx-lyPr1M79Pnca-L-WIxXy3XT2t8XP6P-gYNllTu</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE INPUT PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM</title><source>esp@cenet</source><creator>GAUTIER, Eric ; NADEAU, Pascal ; DELAUNAY, Christophe ; GILBERTON, Philippe</creator><creatorcontrib>GAUTIER, Eric ; NADEAU, Pascal ; DELAUNAY, Christophe ; GILBERTON, Philippe</creatorcontrib><description>The disclosure relates to a method and device for detecting an audio adversarial attack with respect to a voice input (VI) processed by an automatic speech recognition system (ASR). 
The method includes: obtaining (12) a transcript (T) resulting from the processing, by the automatic speech recognition system, of an input audio signal of a voice input; converting (13) the transcript (T) into a synthesized audio signal (SAS); extracting (15, 15'), from the input audio signal and from the synthesized audio signal, acoustic features and converting them into sequences of feature vectors (sFV1, sFV2); computing (17) a dynamic time warping distance (D) between the sequences of converted features vectors; and delivering (18) a piece of data representative of a detection of an audio adversarial attack, as a function of a result of a comparison between the dynamic time warping distance and a predetermined threshold. La divulgation concerne un procédé et un dispositif de détection d'attaque audio antagoniste par rapport à une entrée vocale (VI) traitée par un système de reconnaissance automatique de la parole (ASR). Le procédé consiste à : obtenir (12) une transcription (T) obtenue du traitement, par le système de reconnaissance automatique de la parole, d'un signal audio d'entrée d'une entrée vocale ; convertir (13) la transcription (T) en un signal audio synthétisé (SAS) ; extraire (15, 15'), du signal audio d'entrée et du signal audio synthétisé, des caractéristiques acoustiques et les convertir en séquences de vecteurs de caractéristiques (sFV1, sFV2) ; calculer (17) une distance d'alignement temporel dynamique (D) entre les séquences de vecteurs de caractéristiques converties ; et délivrer (18) un élément de données indiquant une détection d'une attaque audio antagoniste, en fonction d'un résultat d'une comparaison entre la distance d'alignement temporel dynamique et un seuil prédéfini.</description><language>eng ; fre</language><subject>ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH 
RECOGNITION</subject><creationdate>2022</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20220428&DB=EPODOC&CC=WO&NR=2022083969A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25543,76293</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20220428&DB=EPODOC&CC=WO&NR=2022083969A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>GAUTIER, Eric</creatorcontrib><creatorcontrib>NADEAU, Pascal</creatorcontrib><creatorcontrib>DELAUNAY, Christophe</creatorcontrib><creatorcontrib>GILBERTON, Philippe</creatorcontrib><title>METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE INPUT PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM</title><description>The disclosure relates to a method and device for detecting an audio adversarial attack with respect to a voice input (VI) processed by an automatic speech recognition system (ASR). 
The method includes: obtaining (12) a transcript (T) resulting from the processing, by the automatic speech recognition system, of an input audio signal of a voice input; converting (13) the transcript (T) into a synthesized audio signal (SAS); extracting (15, 15'), from the input audio signal and from the synthesized audio signal, acoustic features and converting them into sequences of feature vectors (sFV1, sFV2); computing (17) a dynamic time warping distance (D) between the sequences of converted features vectors; and delivering (18) a piece of data representative of a detection of an audio adversarial attack, as a function of a result of a comparison between the dynamic time warping distance and a predetermined threshold. La divulgation concerne un procédé et un dispositif de détection d'attaque audio antagoniste par rapport à une entrée vocale (VI) traitée par un système de reconnaissance automatique de la parole (ASR). Le procédé consiste à : obtenir (12) une transcription (T) obtenue du traitement, par le système de reconnaissance automatique de la parole, d'un signal audio d'entrée d'une entrée vocale ; convertir (13) la transcription (T) en un signal audio synthétisé (SAS) ; extraire (15, 15'), du signal audio d'entrée et du signal audio synthétisé, des caractéristiques acoustiques et les convertir en séquences de vecteurs de caractéristiques (sFV1, sFV2) ; calculer (17) une distance d'alignement temporel dynamique (D) entre les séquences de vecteurs de caractéristiques converties ; et délivrer (18) un élément de données indiquant une détection d'une attaque audio antagoniste, en fonction d'un résultat d'une comparaison entre la distance d'alignement temporel dynamique et un seuil prédéfini.</description><subject>ACOUSTICS</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH 
RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2022</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNjcFKA0EMhvfiQdR3CHi1UFuQ9phO0t1gZ7Jkslt6KkXWk2ihvquv4wyKZ08_JN___dfNV2TvlGCrBsTOwSW1gAlwIFFAGtkymuAO0B3DM-zFOzDOfWHBCwKjSmCQ1A8OvWngnJlgc_jRuEZ0CVAKHGozaJvERRPkQ3aODxDUqlAT1XHisfjqNRYjW3W2hrEmDWUUE_09Z8ZIuNkxBDSTQkcmGeJtc_V6ertMd79509xv2UM3m84fx-lyPr1M79Pnca-L-WIxXy3XT2t8XP6P-gYNllTu</recordid><startdate>20220428</startdate><enddate>20220428</enddate><creator>GAUTIER, Eric</creator><creator>NADEAU, Pascal</creator><creator>DELAUNAY, Christophe</creator><creator>GILBERTON, Philippe</creator><scope>EVB</scope></search><sort><creationdate>20220428</creationdate><title>METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE INPUT PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM</title><author>GAUTIER, Eric ; NADEAU, Pascal ; DELAUNAY, Christophe ; GILBERTON, Philippe</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_WO2022083969A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; fre</language><creationdate>2022</creationdate><topic>ACOUSTICS</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>GAUTIER, Eric</creatorcontrib><creatorcontrib>NADEAU, Pascal</creatorcontrib><creatorcontrib>DELAUNAY, Christophe</creatorcontrib><creatorcontrib>GILBERTON, Philippe</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>GAUTIER, Eric</au><au>NADEAU, Pascal</au><au>DELAUNAY, 
Christophe</au><au>GILBERTON, Philippe</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE INPUT PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM</title><date>2022-04-28</date><risdate>2022</risdate><abstract>The disclosure relates to a method and device for detecting an audio adversarial attack with respect to a voice input (VI) processed by an automatic speech recognition system (ASR). The method includes: obtaining (12) a transcript (T) resulting from the processing, by the automatic speech recognition system, of an input audio signal of a voice input; converting (13) the transcript (T) into a synthesized audio signal (SAS); extracting (15, 15'), from the input audio signal and from the synthesized audio signal, acoustic features and converting them into sequences of feature vectors (sFV1, sFV2); computing (17) a dynamic time warping distance (D) between the sequences of converted features vectors; and delivering (18) a piece of data representative of a detection of an audio adversarial attack, as a function of a result of a comparison between the dynamic time warping distance and a predetermined threshold. La divulgation concerne un procédé et un dispositif de détection d'attaque audio antagoniste par rapport à une entrée vocale (VI) traitée par un système de reconnaissance automatique de la parole (ASR). 
Le procédé consiste à : obtenir (12) une transcription (T) obtenue du traitement, par le système de reconnaissance automatique de la parole, d'un signal audio d'entrée d'une entrée vocale ; convertir (13) la transcription (T) en un signal audio synthétisé (SAS) ; extraire (15, 15'), du signal audio d'entrée et du signal audio synthétisé, des caractéristiques acoustiques et les convertir en séquences de vecteurs de caractéristiques (sFV1, sFV2) ; calculer (17) une distance d'alignement temporel dynamique (D) entre les séquences de vecteurs de caractéristiques converties ; et délivrer (18) un élément de données indiquant une détection d'une attaque audio antagoniste, en fonction d'un résultat d'une comparaison entre la distance d'alignement temporel dynamique et un seuil prédéfini.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
language	eng ; fre
recordid	cdi_epo_espacenet_WO2022083969A1
source	esp@cenet
subjects	ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION
title	METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE INPUT PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T10%3A23%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=GAUTIER,%20Eric&rft.date=2022-04-28&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EWO2022083969A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
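
The last two steps named in the description field, computing (17) a dynamic time warping distance (D) between the feature-vector sequences (sFV1, sFV2) and comparing it with a predetermined threshold (18), can be sketched in Python as follows. This is only an illustrative sketch, not the implementation claimed in the patent: the record does not specify the local cost, the feature type, the threshold value, or the direction of the comparison. The Euclidean local cost, the "distance above threshold means attack" rule, and the names `dtw_distance` and `is_adversarial` are all assumptions. The speech recognition, speech synthesis and feature extraction steps (12, 13, 15, 15') are left out.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dtw_distance(seq_a: Sequence[Vector], seq_b: Sequence[Vector]) -> float:
    """Dynamic time warping distance between two sequences of feature vectors.

    Assumption: Euclidean distance is used as the local cost between frames;
    the catalog record does not say which local cost the patent uses.
    """
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the minimal accumulated cost of aligning
    # seq_a[:i] with seq_b[:j]; the border is infinite except cost[0][0].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b frame j is reused for a new seq_a frame
                cost[i][j - 1],      # seq_a frame i is reused for a new seq_b frame
                cost[i - 1][j - 1],  # both sequences advance by one frame
            )
    return cost[n][m]


def is_adversarial(sfv1: Sequence[Vector], sfv2: Sequence[Vector],
                   threshold: float) -> bool:
    """Flag the input when the DTW distance exceeds the threshold.

    Assumption: a large distance between the input audio (sFV1) and the audio
    synthesized from its transcript (sFV2) is treated as a sign of an attack.
    The threshold is hypothetical and would have to be tuned on real data.
    """
    return dtw_distance(sfv1, sfv2) > threshold


if __name__ == "__main__":
    # Toy one-dimensional "features"; real inputs would be acoustic feature frames.
    original = [[0.0], [1.0], [2.0]]
    stretched = [[0.0], [1.0], [1.0], [2.0]]  # same content, different timing
    print(dtw_distance(original, stretched))
    print(is_adversarial(original, stretched, threshold=0.5))
```

Dynamic time warping is a natural fit for this comparison because the spoken input and the synthesized audio will generally differ in speaking rate and length, and the warping path absorbs those timing differences before the accumulated distance is compared with the threshold.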