k-Reciprocal Harmonious Attention Network for Video-Based Person Re-Identification

Bibliographic Details
Published in: IEEE Access, 2019, Vol. 7, pp. 22457-22470
Authors: Su, Xinxing; Qu, Xiaoye; Zou, Zhikang; Zhou, Pan; Wei, Wei; Wen, Shiping; Hu, Menglan
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Summary: Video-based person re-identification aims to retrieve video sequences of the same person across a multi-camera system. In this paper, we propose a k-reciprocal harmonious attention network (KHAN) to jointly learn discriminative spatiotemporal features and similarity metrics. In KHAN, the harmonious attention module adaptively calibrates the response at each spatial position and each channel by explicitly inspecting position-wise and channel-wise interactions over feature maps. In addition, the k-reciprocal attention module attends to key features among all frame-level features using a discriminative feature selection algorithm, so that useful temporal information within contextualized key features can be assimilated into a more robust clip-level representation. Compared with commonly used local-context-based approaches, the proposed KHAN captures long-range dependencies across different spatial regions and visual patterns while incorporating informative context at each time step in a non-parametric manner. Extensive experiments on three public benchmark datasets show that our proposed approach outperforms state-of-the-art methods.
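The k-reciprocal attention module described in the summary builds on the standard k-reciprocal neighbor criterion: two items are k-reciprocal neighbors only if each appears in the other's top-k nearest-neighbor list, which filters out one-sided (and often spurious) matches. The sketch below illustrates only this underlying criterion on frame-level feature vectors; the function names and the toy data are illustrative, and KHAN's actual selection algorithm is not reproduced here.

```python
import numpy as np

def top_k_neighbors(dist, i, k):
    """Indices of the k nearest neighbors of item i (excluding i itself)."""
    order = np.argsort(dist[i])
    return [j for j in order if j != i][:k]

def k_reciprocal_neighbors(dist, i, k):
    """Items j such that j is in i's top-k AND i is in j's top-k."""
    candidates = top_k_neighbors(dist, i, k)
    return [j for j in candidates if i in top_k_neighbors(dist, j, k)]

# Toy example: four frame-level features; frames 0/1 and 2/3 form two tight pairs.
feats = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
dist = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)

print(k_reciprocal_neighbors(dist, 0, k=1))  # frame 1 is 0's mutual nearest neighbor
```

The mutual-agreement test is what makes the selection "discriminative": a frame that merely lies near many others, without being ranked highly by them in return, is excluded.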
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2898269