DYNAMIC TEMPORAL FUSION FOR VIDEO RECOGNITION
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | Systems and techniques are described herein for performing dynamic temporal fusion for video classification, such as recognition, detection, and/or other forms of classification. For example, a computing device can generate, via a first network, frame-level features obtained from a set of input frames. The computing device can generate, via a first multi-scale temporal feature fusion engine, first local temporal context features from a first neighboring sub-sequence of the set of input frames. The computing device can generate, via a second multi-scale temporal feature fusion engine, second local temporal context features from a second neighboring sub-sequence of the set of input frames. The computing device can further classify the set of input frames based on the first local temporal context features and the second local temporal context features. |
---|
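
The data flow in the summary (frame-level features, two fusion engines over neighboring sub-sequences, then classification) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the hand-made frame features standing in for the first network, the moving-average pooling standing in for a multi-scale temporal feature fusion engine, the half-and-half split into sub-sequences, and the linear scorer.

```python
from statistics import fmean


def frame_level_features(frames):
    """Stand-in for the 'first network': one small feature vector per frame."""
    # Assumption: each frame is a flat list of pixel intensities.
    return [[fmean(frame), max(frame)] for frame in frames]


def fuse_multi_scale(features, scales=(1, 2)):
    """Hypothetical multi-scale temporal fusion over one sub-sequence.

    For each scale s, average every window of s consecutive frame features,
    then average those window means over time. The per-scale results are
    concatenated into one local temporal context vector.
    The sub-sequence must contain at least max(scales) frames.
    """
    dim = len(features[0])
    context = []
    for s in scales:
        windows = [features[i:i + s] for i in range(len(features) - s + 1)]
        window_means = [
            [fmean(f[d] for f in w) for d in range(dim)] for w in windows
        ]
        context.extend(fmean(m[d] for m in window_means) for d in range(dim))
    return context


def classify(frames, weights, split=None):
    """Score a clip from two local temporal context vectors (linear scorer)."""
    feats = frame_level_features(frames)
    split = len(feats) // 2 if split is None else split
    first = fuse_multi_scale(feats[:split])   # first neighboring sub-sequence
    second = fuse_multi_scale(feats[split:])  # second neighboring sub-sequence
    joint = first + second
    scores = [sum(w * x for w, x in zip(row, joint)) for row in weights]
    return scores.index(max(scores)), joint


# Toy clip: four frames of two "pixels" each; two classes with made-up weights.
frames = [[0.0, 0.2], [0.1, 0.3], [0.8, 1.0], [0.9, 0.7]]
weights = [[1.0] * 4 + [0.0] * 4,   # class 0 attends to the first sub-sequence
           [0.0] * 4 + [1.0] * 4]   # class 1 attends to the second sub-sequence
label, joint = classify(frames, weights)
```

In this sketch each sub-sequence gets its own fusion call, so the classifier sees the two local temporal contexts as separate parts of the joint vector rather than as one clip-wide average. A learned system would presumably replace both stand-ins with trained networks.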