Recognizing object manipulation activities using depth and visual cues
Published in: | Journal of visual communication and image representation 2014-05, Vol.25 (4), p.719-726 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • An algorithm that parses a 3D point cloud representing the human body into torso and arms. • A technique to locate the held object and incorporate its size into a recognizer. • A temporal smoothing scheme to improve object and activity recognition.
We propose a framework, consisting of several algorithms, for recognizing human activities that involve manipulating objects. The framework identifies the objects being manipulated and models the high-level tasks being performed accordingly. Realistic settings for such tasks pose several problems for computer vision, including sporadic occlusion by subjects, non-frontal poses, and objects with few local features. We show how size and segmentation information derived from depth data can address these challenges using simple and fast techniques. In particular, we show how to find the manipulating hand robustly and without supervision, detect and recognize objects, and use temporal information to fill in the gaps between sporadically detected objects, all through careful inclusion of depth cues. We evaluate our approach on a challenging dataset of 12 kitchen tasks, performed by 2 subjects, that involve 24 objects. The entire framework yields 82%/84% precision (74%/83% recall) for task/object recognition. Our techniques significantly outperform the state of the art in activity and object recognition. |
---|---|
ISSN: | 1047-3203 1095-9076 |
DOI: | 10.1016/j.jvcir.2013.03.015 |