Positional audio metadata generation

Bibliographic details
Main authors: Stuan, Øivind; Nielsen, Johan Ludvig
Format: Patent
Language: English

Description
Abstract: At a video conference endpoint including a camera, a microphone array, and one or more microphone assemblies, the video conference endpoint may divide a video output of the camera into one or more tracking sectors and detect a head position for each participant in the video output. The video conference endpoint may determine within which tracking sector each detected head position is located. The video conference endpoint may determine active sound source positions of the actively speaking participants based on sound detected or captured by the microphone array and the microphone assemblies, and may determine within which tracking sector the active sound source positions are located. For each tracking sector that contains an active sound source position, the video conference endpoint may update the positional audio metadata for that tracking sector based on the active sound source positions and the detected head positions located in that tracking sector.
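
The abstract describes a per-sector update flow: split the camera view into tracking sectors, locate detected head positions and active sound sources within those sectors, and refresh a sector's positional audio metadata only where an active talker is present. Below is a minimal sketch of that flow in Python; the sector geometry, the function names (divide_into_sectors, update_positional_metadata), and the metadata fields are illustrative assumptions, not the patented implementation.

```python
"""Sketch of the per-sector positional audio metadata update described above.

All names and data shapes here are illustrative assumptions.
"""
from dataclasses import dataclass, field


@dataclass
class TrackingSector:
    """A horizontal slice of the camera's field of view, in degrees of azimuth."""
    start_deg: float
    end_deg: float
    metadata: dict = field(default_factory=dict)

    def contains(self, azimuth_deg: float) -> bool:
        return self.start_deg <= azimuth_deg < self.end_deg


def divide_into_sectors(fov_deg: float, num_sectors: int) -> list[TrackingSector]:
    """Divide the camera field of view into equal-width tracking sectors."""
    width = fov_deg / num_sectors
    return [TrackingSector(i * width, (i + 1) * width) for i in range(num_sectors)]


def update_positional_metadata(
    sectors: list[TrackingSector],
    head_positions: list[float],   # azimuths of heads detected in the video output
    sound_sources: list[float],    # azimuths of active talkers from the microphones
) -> None:
    """Refresh metadata only for sectors that contain an active sound source,
    using both the sound-source positions and the head positions in that sector."""
    for sector in sectors:
        heads_in_sector = [h for h in head_positions if sector.contains(h)]
        sources_in_sector = [s for s in sound_sources if sector.contains(s)]
        if not sources_in_sector:
            continue  # sectors without an active talker are left untouched
        # Illustrative refinement: nudge each audio estimate toward the nearest
        # detected head so the playback position matches what the camera sees.
        refined = [
            min(heads_in_sector, key=lambda h: abs(h - s)) if heads_in_sector else s
            for s in sources_in_sector
        ]
        sector.metadata = {
            "active_source_azimuths_deg": refined,
            "head_count": len(heads_in_sector),
        }


if __name__ == "__main__":
    sectors = divide_into_sectors(fov_deg=90.0, num_sectors=3)
    update_positional_metadata(
        sectors,
        head_positions=[12.0, 40.0, 75.0],  # three participants detected in video
        sound_sources=[38.0],               # one of them is currently speaking
    )
    for s in sectors:
        print(f"[{s.start_deg:.0f}-{s.end_deg:.0f} deg] {s.metadata}")
```

Working purely in azimuth angles keeps the sketch short; an actual endpoint would presumably map between camera pixel coordinates and microphone-array bearings before making the per-sector comparison.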