Conditional Object-Centric Learning with Slot Attention for Video and Other Sequential Data

Bibliographic details
Main authors: Dosovitskiy, Alexey; Mahendran, Aravindh; Aghdam, Sara Sabour Rouh; Jonschkowski, Rico; Heigold, Georg; Kipf, Thomas; Greff, Klaus; Stone, Austin Charles; Elsayed, Gamaleldin
Format: Patent
Language: English
Description
Summary: A method includes obtaining first feature vectors and second feature vectors representing contents of a first image frame and a second image frame, respectively, of an input video. The method may also include generating, based on the first feature vectors, first slot vectors, where each slot vector represents attributes of a corresponding entity as represented in the first image frame, and generating, based on the first slot vectors, predicted slot vectors including a corresponding predicted slot vector that represents a transition of the attributes of the corresponding entity from the first to the second image frame. The method may additionally include generating, based on the predicted slot vectors and the second feature vectors, second slot vectors including a corresponding slot vector that represents the attributes of the corresponding entity as represented in the second image frame, and determining an output based on the predicted slot vectors or the second slot vectors.
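
The sequence of steps in the abstract (slots from the first frame, predicted slots for the transition, then second-frame slots from the predicted slots and the second frame's features) can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names `correct`, `predict`, and `process_video`, the linear `transition` matrix, and the random feature vectors are all hypothetical. The attention step is a stripped-down form of slot attention with no learned projections, recurrent update, or MLP.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def correct(slots, features, n_iters=2):
    """Refine slot vectors against one frame's feature vectors.

    Assumption: simplified slot attention. Each slot is replaced by an
    attention-weighted mean of the features; nothing here is learned.
    """
    dim = slots.shape[-1]
    for _ in range(n_iters):
        logits = features @ slots.T / np.sqrt(dim)   # (n_features, n_slots)
        attn = softmax(logits, axis=1)               # slots compete for each feature
        weights = attn / (attn.sum(axis=0, keepdims=True) + 1e-8)
        slots = weights.T @ features                 # (n_slots, dim)
    return slots

def predict(slots, transition):
    """Hypothetical predictor: one linear transition shared by all slots."""
    return slots @ transition

def process_video(frame_features, init_slots, transition):
    # First slot vectors, generated from the first frame's features.
    slots = correct(init_slots, frame_features[0])
    outputs = [slots]
    for features in frame_features[1:]:
        predicted = predict(slots, transition)       # predicted slot vectors
        slots = correct(predicted, features)         # slot vectors for the next frame
        outputs.append(slots)
    return np.stack(outputs)

rng = np.random.default_rng(0)
frames = rng.normal(size=(3, 16, 8))   # 3 frames, 16 feature vectors each, dim 8
init = rng.normal(size=(4, 8))         # 4 slots, one per candidate entity
all_slots = process_video(frames, init, np.eye(8))
print(all_slots.shape)  # (3, 4, 8)
```

In this sketch the output is simply the per-frame slot vectors. The abstract leaves the output open, stating only that it is determined from the predicted slot vectors or the second slot vectors.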