Egocentric Meets Top-View

Bibliographic details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019-06, Vol. 41 (6), p. 1353-1366
Main authors: Ardeshir, Shervin; Borji, Ali
Format: Article
Language: English
Description
Summary: Thanks to the availability and increasing popularity of wearable devices such as GoPro cameras, smart phones, and glasses, we have access to a plethora of videos captured from a first-person perspective. Surveillance cameras and Unmanned Aerial Vehicles (UAVs) also offer tremendous amounts of video data recorded from top and oblique viewpoints. Egocentric and surveillance vision have been studied extensively but separately in the computer vision community. The relationship between these two domains, however, remains unexplored. In this study, we make the first attempt in this direction by addressing two basic yet challenging questions. First, given a set of egocentric videos and a top-view surveillance video, does the top-view video contain all or some of the egocentric viewers? In other words, have these videos been shot in the same environment at the same time? Second, if so, can we identify the egocentric viewers in the top-view video? These problems can become extremely challenging when the videos are not temporally aligned. Each view, egocentric or top, is modeled by a graph, and the assignment and time delays are computed iteratively using the spectral graph matching framework. We evaluate our method in terms of ranking and assigning egocentric viewers to identities present in the top-view video over a dataset of 50 top-view and 188 egocentric videos captured under different conditions. We also evaluate the capability of our proposed approaches in terms of temporal alignment. The experiments and results demonstrate the capability of the proposed approaches in terms of jointly addressing the temporal alignment and assignment tasks.
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2018.2832121