Spatial Audio Upscaling Using Machine Learning

Bibliographic Details
Main authors:	Atkins, Joshua D; Souden, Mehrez; Nawfal, Ismael H; Delikaris Manias, Symeon
Format:	Patent
Language:	English
Subjects:	ACOUSTICS; ELECTRIC COMMUNICATION TECHNIQUE; ELECTRICITY; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION; STEREOPHONIC SYSTEMS

Description
Abstract:	A sound scene is represented as first-order Ambisonics (FOA) audio. A processor formats each signal of the FOA audio into a stream of audio frames, provides the formatted FOA audio to a machine learning model that converts it to a target higher-order Ambisonics (HOA) format, and obtains output audio of the sound scene in that HOA format from the model. The output audio in the HOA format may then be rendered according to a playback audio format of choice. Other aspects are also described and claimed.
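
The data flow in the abstract (frame each FOA signal, pass the frames to a model, collect HOA output) can be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not disclose the frame length, the hop size, the channel ordering, or the model itself, so `frame_signal`, `format_foa`, `placeholder_model`, and `upscale` are hypothetical names, and the "model" here is a stand-in that copies the four FOA channels and zero-fills the higher-order channels. The only fact relied on is that a full 3D Ambisonics signal of order N has (N + 1)^2 channels.

```python
from typing import Callable, List, Sequence

Frame = List[float]            # samples of one channel within one frame
MultiFrame = List[Frame]       # one frame across all channels


def ambisonic_channel_count(order: int) -> int:
    """Channel count of a full 3D Ambisonics signal of the given order."""
    return (order + 1) ** 2


def frame_signal(signal: Sequence[float], frame_len: int, hop: int) -> List[Frame]:
    """Split one channel into frames, zero-padding the final frame."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    for start in range(0, len(signal), hop):
        chunk = list(signal[start:start + frame_len])
        chunk += [0.0] * (frame_len - len(chunk))
        frames.append(chunk)
    return frames


def format_foa(foa: Sequence[Sequence[float]], frame_len: int, hop: int) -> List[MultiFrame]:
    """Format each of the 4 FOA signals into a stream of frames."""
    if len(foa) != ambisonic_channel_count(1):
        raise ValueError("FOA audio must have 4 channels")
    if len({len(ch) for ch in foa}) != 1:
        raise ValueError("all FOA channels must have the same length")
    per_channel = [frame_signal(ch, frame_len, hop) for ch in foa]
    # Regroup so that each stream element holds one frame per channel.
    return [list(group) for group in zip(*per_channel)]


def placeholder_model(foa_frame: MultiFrame, target_order: int) -> MultiFrame:
    """Stand-in for the trained model (assumption): copy FOA, zero-fill the rest."""
    frame_len = len(foa_frame[0])
    extra = ambisonic_channel_count(target_order) - len(foa_frame)
    return [list(ch) for ch in foa_frame] + [[0.0] * frame_len for _ in range(extra)]


def upscale(
    foa: Sequence[Sequence[float]],
    target_order: int,
    frame_len: int,
    hop: int,
    model: Callable[[MultiFrame, int], MultiFrame] = placeholder_model,
) -> List[MultiFrame]:
    """Frame the FOA audio, run the model per frame, return HOA frames."""
    return [model(frame, target_order) for frame in format_foa(foa, frame_len, hop)]
```

For example, calling `upscale(foa, target_order=2, frame_len=4, hop=4)` on four 8-sample channels yields two frames of nine channels each. A real system would replace `placeholder_model` with a trained network and then pass the HOA frames to a renderer for the chosen playback format.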