Improved use of descriptors for early recognition of actions in video

Bibliographic details
Published in: Multimedia tools and applications, 2023, Vol. 82 (2), p. 2617-2633
Main authors: Saremi, Mehrin; Yaghmaee, Farzin
Format: Article
Language: English
Online access: Full text
Description
Summary: Action recognition is a popular research topic in the computer vision community. A new trend in this field, called early action recognition, seeks to recognise an action from as few frames as possible. Visual bag-of-words methods, which rely on local descriptors and visual words, are among the tools used in both offline and early action recognition. In this paper, we propose an improvement to bag-of-words approaches by means of what we call patterns, i.e. co-occurrences of visual words. We compare our method with basic bag-of-words. Experiments on benchmark datasets suggest that our method achieves better accuracy than simple bag-of-words. Our method also performs better than some state-of-the-art methods at some observation ratios. Furthermore, some methods proposed in the literature require segments or video partitions as their working unit. Our method, however, is more granular and can update its prediction as soon as a new descriptor arrives.
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-022-13316-x
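
The summary describes the idea only at a high level, so the following is a minimal sketch of what an incrementally updated bag-of-words with co-occurrence patterns could look like, not the authors' implementation. Several choices here are assumptions made for illustration: a pattern is taken to be a pair of distinct visual words observed in the same frame, descriptors are assigned to the nearest codebook entry by Euclidean distance, and classification is a nearest-prototype rule. The names `EarlyPatternBoW`, `nearest_word`, and `encode`, and the labels "wave" and "clap", are hypothetical.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))


class EarlyPatternBoW:
    """Histogram of visual words plus word-pair patterns, updated per descriptor.

    Assumption: a pattern is a pair of distinct words seen in the same frame.
    """

    def __init__(self, codebook, prototypes=None):
        self.codebook = codebook
        self.prototypes = prototypes or {}  # label -> normalised feature dict
        self.counts = Counter()             # keys: ("w", i) or ("p", i, j)
        self._frame = None
        self._seen = set()                  # words seen in the current frame

    def update(self, descriptor, frame):
        """Consume one descriptor and return the current prediction."""
        if frame != self._frame:
            self._frame, self._seen = frame, set()
        w = nearest_word(descriptor, self.codebook)
        self.counts[("w", w)] += 1
        if w not in self._seen:
            # First time this word appears in the frame: count one pattern
            # for every other word already present in the same frame.
            for other in self._seen:
                self.counts[("p", min(w, other), max(w, other))] += 1
            self._seen.add(w)
        return self.predict()

    def feature(self):
        """L1-normalised sparse feature over words and patterns."""
        total = sum(self.counts.values())
        return {k: v / total for k, v in self.counts.items()} if total else {}

    def predict(self):
        """Label of the nearest prototype, or None if none were supplied."""
        if not self.prototypes:
            return None
        f = self.feature()

        def dist(proto):
            keys = set(f) | set(proto)
            return math.sqrt(sum((f.get(k, 0.0) - proto.get(k, 0.0)) ** 2
                                 for k in keys))

        return min(self.prototypes, key=lambda c: dist(self.prototypes[c]))


def encode(stream, codebook):
    """Feature of a full (descriptor, frame) stream, usable as a prototype."""
    model = EarlyPatternBoW(codebook)
    for descriptor, frame in stream:
        model.update(descriptor, frame)
    return model.feature()


# Toy usage with made-up 2-D descriptors and a two-word codebook.
codebook = [(0.0, 0.0), (10.0, 10.0)]
prototypes = {
    "wave": encode([((0, 1), 0), ((1, 0), 1)], codebook),   # word 0 only
    "clap": encode([((0, 1), 0), ((9, 10), 0)], codebook),  # words 0 and 1 co-occur
}
model = EarlyPatternBoW(codebook, prototypes)
first = model.update((0.5, 0.5), frame=0)
second = model.update((10.0, 9.0), frame=0)
```

In this toy run the prediction is recomputed after every single descriptor, which mirrors the granularity claimed in the summary; the pair-counting rule and the classifier are placeholders that a real system would replace with the definitions given in the article.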