SYSTEMS AND METHODS FOR CROSS-MODAL RETRIEVAL BASED ON A SOUND MODALITY AND A NON-SOUND MODALITY

Bibliographic details
Main authors: Wu, Ho-Hsiang; Nieto, Oriol; Salamon, Justin Jonathan
Format: Patent
Language: English
Description
Summary: Systems and methods for cross-modal retrieval are provided. According to one aspect, a method for cross-modal retrieval includes obtaining a query describing a sound using a query modality other than a sound modality; encoding the query to obtain a query embedding using a query encoder network for the query modality and a query projection network, wherein the query projection network includes a self-attention layer, and wherein the query embedding is in a joint embedding space for the query modality and the sound modality; and providing a response including an audio sample based on the query embedding, wherein the audio sample includes the sound.
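
As an illustration only, the flow described in the summary (encode a non-sound query, pass it through a projection step that includes self-attention, then pick an audio sample by similarity in a joint embedding space) might be sketched as below. This is a toy under stated assumptions and not the patented implementation: the names `token_features`, `self_attention`, `project`, `embed_query`, `retrieve`, and `AUDIO_LIBRARY` are hypothetical, a text query is assumed as the query modality, the "encoder" is a fixed pseudo-random lookup, the attention layer uses identity weights instead of learned ones, and the audio embeddings are faked by embedding captions because no trained audio encoder is available here.

```python
import math
import random

DIM = 8  # toy size of the joint embedding space (assumption)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def token_features(text):
    """Hypothetical stand-in for a trained query encoder network:
    maps each token to a fixed pseudo-random feature vector."""
    feats = []
    for tok in text.lower().split():
        rng = random.Random(tok)  # string seed gives a repeatable vector
        feats.append([rng.gauss(0.0, 1.0) for _ in range(DIM)])
    if not feats:
        raise ValueError("query must contain at least one token")
    return feats


def self_attention(feats):
    """Single-head scaled dot-product self-attention with identity
    query/key/value weights (a trained layer would learn these)."""
    scale = math.sqrt(DIM)
    attended = []
    for q in feats:
        weights = softmax([dot(q, k) / scale for k in feats])
        attended.append([sum(w * v[i] for w, v in zip(weights, feats))
                         for i in range(DIM)])
    return attended


def project(feats):
    """Toy query projection: self-attention, mean pooling over tokens,
    then L2 normalization into the joint embedding space."""
    attended = self_attention(feats)
    pooled = [sum(row[i] for row in attended) / len(attended)
              for i in range(DIM)]
    norm = math.sqrt(dot(pooled, pooled))
    return [x / norm for x in pooled]


def embed_query(text):
    return project(token_features(text))


# Assumption: audio embeddings are precomputed in the joint space.
# Here they are faked from captions, purely so the example runs.
AUDIO_LIBRARY = {
    "dog_bark.wav": embed_query("dog barking"),
    "rain_roof.wav": embed_query("rain falling on a roof"),
    "door_slam.wav": embed_query("door slamming shut"),
}


def retrieve(text):
    """Return the audio sample whose embedding has the highest cosine
    similarity to the query (all vectors are unit length)."""
    query = embed_query(text)
    return max(AUDIO_LIBRARY, key=lambda name: dot(query, AUDIO_LIBRARY[name]))
```

Calling `retrieve("dog barking")` selects the entry whose stand-in embedding was built from the same caption. In a real system the query and audio branches would be separately trained networks aligned in the joint space, which this toy does not attempt.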