SPEECH ACTIVITY DETECTION USING DUAL SENSORY BASED LEARNING

Detailed description

Bibliographic details
Main author:	SIRCAR, Shiladitya
Format:	Patent
Language:	eng ; fre ; ger
Subjects:	ACOUSTICS ; ELECTRIC COMMUNICATION TECHNIQUE ; ELECTRICITY ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION ; TELEPHONIC COMMUNICATION

Description
Abstract:	A dual sensory input speech detection method includes receiving, at a first time, a first video image input and a first audio input of a conference participant of a video conference; communicating the first video image input to the video conference; identifying the first video image input as a first facial image of the conference participant; determining, based on the first facial image, that the first video image input indicates the conference participant is in a speaking state; identifying the first audio input as a first speech sound; determining, while the conference participant is in the speaking state, that the first speech sound originates from the conference participant; and communicating the first audio input to an audio output for the video conference.
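
The steps listed in the abstract can be read as a simple gate on the audio path. The following Python sketch is only an illustration of that reading, not the patented implementation: the names `GateDecision`, `gate_inputs`, `is_face`, `looks_speaking` and `is_speech` are hypothetical, the three classifiers are stubs standing in for whatever learned models the patent uses, and withholding audio when the checks fail is an assumption, since the abstract does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class GateDecision:
    send_video: bool  # video image input is communicated to the conference
    send_audio: bool  # audio input is communicated to the conference audio output


def gate_inputs(
    video_frame: Any,
    audio_chunk: Any,
    is_face: Callable[[Any], bool],
    looks_speaking: Callable[[Any], bool],
    is_speech: Callable[[Any], bool],
) -> GateDecision:
    """Decide what to forward for one video frame and audio chunk received together."""
    # The abstract lists communicating the video input without a stated condition.
    send_video = True
    # Visual check: the frame is a facial image and it indicates a speaking state.
    speaking_state = bool(is_face(video_frame) and looks_speaking(video_frame))
    # Audio check: the chunk is a speech sound. Attributing it to the participant
    # only while the speaking state holds is the reading assumed here.
    send_audio = bool(speaking_state and is_speech(audio_chunk))
    return GateDecision(send_video, send_audio)


# Stub classifiers over toy dict inputs (hypothetical, for illustration only).
def stub_is_face(frame: dict) -> bool:
    return frame.get("face", False)


def stub_looks_speaking(frame: dict) -> bool:
    return frame.get("mouth_moving", False)


def stub_is_speech(chunk: dict) -> bool:
    return chunk.get("speech", False)


talking = gate_inputs(
    {"face": True, "mouth_moving": True}, {"speech": True},
    stub_is_face, stub_looks_speaking, stub_is_speech,
)
silent_face = gate_inputs(
    {"face": True, "mouth_moving": False}, {"speech": True},
    stub_is_face, stub_looks_speaking, stub_is_speech,
)
```

In this sketch `talking` forwards both streams, while `silent_face` forwards video only: speech is heard, but the facial image does not indicate a speaking state, so the sound is not attributed to the participant.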