DYNAMIC TEMPORAL FUSION FOR VIDEO RECOGNITION

Bibliographic details
Main authors: YUN, Sungrack; LEE, Juntae
Format: Patent
Language: English
Description
Summary: Systems and techniques are described herein for performing dynamic temporal fusion for video classification, such as recognition, detection, and/or other forms of classification. For example, a computing device can generate, via a first network, frame-level features obtained from a set of input frames. The computing device can generate, via a first multi-scale temporal feature fusion engine, first local temporal context features from a first neighboring sub-sequence of the set of input frames. The computing device can generate, via a second multi-scale temporal feature fusion engine, second local temporal context features from a second neighboring sub-sequence of the set of input frames. The computing device can further classify the set of input frames based on the first local temporal context features and the second local temporal context features.
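
The data flow in the summary (frame-level features, two fusion engines over neighboring sub-sequences, then classification) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names, the fixed scaling that stands in for the first network, mean pooling over sliding windows as the "multi-scale" fusion, the split of the frames into two halves, and the linear classifier weights.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vectors(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized, non-empty vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def frame_level_features(frames: Sequence[Vector]) -> List[Vector]:
    """Stand-in for the 'first network': each frame is simply scaled by 0.5.

    Assumption: a real system would use a learned backbone here.
    """
    return [[0.5 * value for value in frame] for frame in frames]


def multi_scale_fusion(features: Sequence[Vector], scales: Sequence[int]) -> Vector:
    """Hypothetical fusion engine for one sub-sequence.

    For each scale (window width), mean-pool every sliding window, average
    the window summaries, then average across scales to obtain a single
    local temporal context vector.
    """
    per_scale = []
    for scale in scales:
        width = min(scale, len(features))
        windows = [
            mean_vectors(features[start:start + width])
            for start in range(len(features) - width + 1)
        ]
        per_scale.append(mean_vectors(windows))
    return mean_vectors(per_scale)


def classify(frames: Sequence[Vector],
             class_weights: Sequence[Vector],
             scales: Sequence[int] = (1, 2)) -> int:
    """Classify a clip of at least two frames; returns the best class index."""
    features = frame_level_features(frames)
    middle = len(features) // 2
    # Two neighboring sub-sequences: first half and second half (assumption).
    first_context = multi_scale_fusion(features[:middle], scales)
    second_context = multi_scale_fusion(features[middle:], scales)
    combined = first_context + second_context  # list concatenation
    # Hypothetical linear classifier: one weight vector per class.
    scores = [sum(w * x for w, x in zip(weights, combined))
              for weights in class_weights]
    return scores.index(max(scores))


if __name__ == "__main__":
    clip = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    weights = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
    print(classify(clip, weights))
```

Because the two context vectors are concatenated rather than averaged, the classifier in this sketch can tell a clip apart from its time-reversed version, which is one plausible reason to keep the sub-sequences separate until the final step.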