MECCANO: A multimodal egocentric dataset for humans behavior understanding in the industrial-like domain



Bibliographic Details
Published in: Computer Vision and Image Understanding, October 2023, Vol. 235, p. 103764, Article 103764
Authors: Ragusa, Francesco; Furnari, Antonino; Farinella, Giovanni Maria
Format: Article
Language: English
Online access: Full text
Abstract: Wearable cameras make it possible to acquire images and videos from the user's perspective, and these data can be processed to understand human behavior. Although human behavior analysis has been thoroughly investigated in third-person vision, it is still understudied in egocentric settings, and in particular in industrial scenarios. To encourage research in this field, we present MECCANO, a multimodal dataset of egocentric videos to study human behavior understanding in industrial-like settings. The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset. The dataset has been explicitly labeled for fundamental tasks in the context of human behavior understanding from a first-person view, such as recognizing and anticipating human–object interactions. With the MECCANO dataset, we explored six different tasks: (1) Action Recognition, (2) Active Objects Detection and Recognition, (3) Egocentric Human–Objects Interaction Detection, (4) Egocentric Gaze Estimation, (5) Action Anticipation and (6) Next-Active Objects Detection. We propose a benchmark aimed at studying human behavior in the considered industrial-like scenario, which demonstrates that the investigated tasks and the considered scenario are challenging for state-of-the-art algorithms. To support research in this field, we publicly release the dataset at https://iplab.dmi.unict.it/MECCANO/.
Highlights:
• MECCANO is a new egocentric multimodal dataset related to the industrial-like domain.
• We explore the human–object interaction (HOI) definition for the egocentric paradigm (EHOI).
• We study the Next-Active Objects Detection task from the egocentric perspective.
• We show how the proposed MECCANO dataset can be used to study six different tasks.
• We show the limits of current state-of-the-art approaches in industrial settings.
ISSN: 1077-3142, 1090-235X
DOI: 10.1016/j.cviu.2023.103764
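The abstract describes synchronized RGB, depth and gaze streams annotated with temporally bounded actions. The sketch below illustrates, in Python, how such multimodal action segments might be parsed and paired per frame for a task like Action Recognition. It is a minimal, hypothetical illustration: the CSV schema, field names and directory layout are assumptions made here for clarity and do not reflect the official annotation format, which should be checked at https://iplab.dmi.unict.it/MECCANO/.

```python
# Hypothetical sketch: pairing MECCANO-style multimodal streams (RGB, depth, gaze)
# for an action-recognition segment. The CSV columns and the per-frame directory
# layout below are assumptions for illustration, not the dataset's documented format.
import csv
import io
from dataclasses import dataclass


@dataclass
class ActionSegment:
    video_id: str
    action_label: str
    start_frame: int
    end_frame: int


def parse_action_annotations(csv_text: str) -> list[ActionSegment]:
    """Parse a minimal, hypothetical action-annotation CSV into segments."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        ActionSegment(
            video_id=row["video_id"],
            action_label=row["action"],
            start_frame=int(row["start_frame"]),
            end_frame=int(row["end_frame"]),
        )
        for row in reader
    ]


def frame_paths(segment: ActionSegment, modality: str) -> list[str]:
    """Build per-frame paths for one modality ('rgb', 'depth' or 'gaze');
    the directory layout used here is assumed, not the official one."""
    return [
        f"{segment.video_id}/{modality}/frame_{idx:06d}.png"
        for idx in range(segment.start_frame, segment.end_frame + 1)
    ]


# Tiny inline example so the sketch runs without the dataset on disk.
example_csv = """video_id,action,start_frame,end_frame
0001,take_screwdriver,120,168
0001,tighten_bolt,169,240
"""

if __name__ == "__main__":
    for seg in parse_action_annotations(example_csv):
        rgb = frame_paths(seg, "rgb")
        depth = frame_paths(seg, "depth")
        print(seg.action_label, len(rgb), "RGB frames,", len(depth), "depth frames")
```

The per-frame pairing mirrors the abstract's point that the modalities are acquired simultaneously with a custom headset, so corresponding frame indices can in principle be aligned across streams; how alignment is actually encoded in the released annotations is not specified here.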