Exemplar-based action recognition in video

Bibliographic details
Main authors: Willems, Geert; Becker, Jan Hendrik; Tuytelaars, Tinne; Van Gool, Luc
Format: Conference proceedings
Language: English
Description
Abstract: In this work, we present a method for action localization and recognition using an exemplar-based approach. It starts from local, dense, yet scale-invariant spatio-temporal features. The most discriminative visual words are selected and used to cast bounding box hypotheses, which are then verified and further grouped into the final detections. To the best of our knowledge, we are the first to extend the exemplar-based approach using local features into the spatio-temporal domain. This allows us to avoid the problems that typically plague sliding window-based approaches, in particular the exhaustive search over spatial coordinates, time, and spatial as well as temporal scales. We report state-of-the-art results on challenging datasets, extracted from real movies, for both classification and localization. © 2009. The copyright of this document resides with its authors.
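
To make the voting pipeline described in the abstract concrete, below is a minimal, hypothetical Python sketch of how test features matched to discriminative visual words could cast space-time bounding-box hypotheses that are then grouped into detections. All names, data structures, and the greedy overlap-based grouping heuristic are illustrative assumptions, not the authors' implementation; in particular, the grouping step stands in for the paper's verification and grouping stages.

# Hypothetical illustration of exemplar-based voting: each test feature matched
# to a discriminative visual word casts one space-time bounding-box hypothesis,
# and overlapping hypotheses are greedily merged into final detections.
from dataclasses import dataclass
from typing import List


@dataclass
class WordMatch:
    # Position and scales of the matched feature in the test video.
    x: float
    y: float
    t: float
    scale: float       # spatial scale of the test feature
    tscale: float      # temporal scale of the test feature
    # Exemplar action box stored relative to the exemplar feature,
    # normalised by that feature's spatial/temporal scales (assumed layout).
    dx: float
    dy: float
    dt: float
    w: float
    h: float
    dur: float
    weight: float      # discriminativity weight of the visual word


@dataclass
class Hypothesis:
    x: float
    y: float
    t: float
    w: float
    h: float
    dur: float
    score: float


def cast_hypotheses(matches: List[WordMatch]) -> List[Hypothesis]:
    """Each matched word votes for one space-time bounding box."""
    return [
        Hypothesis(
            x=m.x + m.dx * m.scale,
            y=m.y + m.dy * m.scale,
            t=m.t + m.dt * m.tscale,
            w=m.w * m.scale,
            h=m.h * m.scale,
            dur=m.dur * m.tscale,
            score=m.weight,
        )
        for m in matches
    ]


def _overlap_1d(center_a, extent_a, center_b, extent_b):
    # Length of the overlap between two centred intervals.
    lo = max(center_a - extent_a / 2, center_b - extent_b / 2)
    hi = min(center_a + extent_a / 2, center_b + extent_b / 2)
    return max(0.0, hi - lo)


def iou(a: Hypothesis, b: Hypothesis) -> float:
    """Volume intersection-over-union of two space-time boxes."""
    inter = (_overlap_1d(a.x, a.w, b.x, b.w)
             * _overlap_1d(a.y, a.h, b.y, b.h)
             * _overlap_1d(a.t, a.dur, b.t, b.dur))
    union = a.w * a.h * a.dur + b.w * b.h * b.dur - inter
    return inter / union if union > 0 else 0.0


def group_hypotheses(hyps: List[Hypothesis], min_iou: float = 0.5) -> List[Hypothesis]:
    """Greedily merge overlapping hypotheses; each detection accumulates its supporting votes."""
    detections: List[Hypothesis] = []
    for h in sorted(hyps, key=lambda v: v.score, reverse=True):
        for d in detections:
            if iou(h, d) >= min_iou:
                d.score += h.score   # an overlapping vote reinforces the detection
                break
        else:
            detections.append(Hypothesis(h.x, h.y, h.t, h.w, h.h, h.dur, h.score))
    return detections

Because every vote is anchored to a matched feature, the search is driven by the data rather than by exhaustively sliding a window over positions, times, and scales, which is the efficiency argument made in the abstract.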