VOICE COMMAND SCRUBBING

Bibliographic details
Main author:	PARKINSON, Christopher Iain
Format:	Patent
Language:	English; French
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	The invention is directed to an audio scrubbing system that removes recognized voice commands from audio data and replaces them with environment audio data. Specifically, as a user captures video and audio data via a head-mounted display (HMD), the audio data captured by the HMD may be processed by an audio scrubbing module to identify voice commands in the audio data that are used for controlling the HMD. When a voice command is identified in the audio data, timestamps corresponding to the voice command may be determined. Filler audio data may then be generated to imitate the environment by processing at least a portion of the audio data with a neural network of a machine learning model. The filler audio data may then be used to replace the audio data corresponding to the identified voice commands, thereby scrubbing the voice commands from the audio data.
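
The replacement step in the summary can be illustrated with a rough sketch. It is not the patented method: the summary generates filler audio with a neural network, whereas this example swaps in a much cruder stand-in that loops the ambient audio immediately preceding each command. The function names (`make_filler`, `scrub_commands`), the `context_s` parameter, and the assumption that a recognizer has already supplied start and end timestamps in seconds are all hypothetical.

```python
def make_filler(context, length):
    """Stand-in for the neural-network filler generator (assumption):
    loop the nearby ambient samples until `length` samples are produced."""
    if not context:
        return [0.0] * length  # no ambient context available: fall back to silence
    return [context[i % len(context)] for i in range(length)]


def scrub_commands(samples, sample_rate, command_spans, context_s=0.5):
    """Return a copy of `samples` with each (start_s, end_s) span replaced
    by filler audio. The output has the same length as the input, so later
    timestamps stay valid after earlier spans have been scrubbed."""
    out = list(samples)
    ctx_len = round(context_s * sample_rate)
    for start_s, end_s in command_spans:
        # Convert timestamps to sample indices, clamped to the recording.
        start = max(0, round(start_s * sample_rate))
        end = min(len(out), round(end_s * sample_rate))
        if start >= end:
            continue  # empty or out-of-range span: nothing to scrub
        # Prefer ambient audio just before the command; else use audio after it.
        context = out[max(0, start - ctx_len):start] or out[end:end + ctx_len]
        out[start:end] = make_filler(context, end - start)
    return out


# Usage: 3 s of toy "audio" at 10 samples per second, with a command
# (value 9.0) spoken between 1.0 s and 2.0 s.
ambient = [1.0, 2.0] * 5
recording = ambient + [9.0] * 10 + ambient
scrubbed = scrub_commands(recording, 10, [(1.0, 2.0)], context_s=0.4)
```

Keeping the output the same length as the input means the scrubbed audio stays aligned with the video captured alongside it, which is why the command is replaced rather than cut out.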