k-Reciprocal Harmonious Attention Network for Video-Based Person Re-Identification
Published in: IEEE Access, 2019, Vol. 7, pp. 22457-22470
Main authors: , , , , , ,
Format: Article
Language: eng
Subjects:
Online access: Full text
Abstract: Video-based person re-identification aims to retrieve video sequences of the same person across a multi-camera system. In this paper, we propose a k-reciprocal harmonious attention network (KHAN) to jointly learn discriminative spatiotemporal features and similarity metrics. In KHAN, the harmonious attention module adaptively calibrates the response at each spatial position and each channel by explicitly modeling position-wise and channel-wise interactions over the feature maps. In addition, the k-reciprocal attention module attends to key features among all frame-level features via a discriminative feature selection algorithm, so that useful temporal information within the contextualized key features is assimilated into a more robust clip-level representation. Compared with commonly used local-context-based approaches, the proposed KHAN captures long-range dependencies between different spatial regions and visual patterns while incorporating informative context at each time step in a non-parametric manner. Extensive experiments on three public benchmark datasets show that the proposed approach outperforms state-of-the-art methods.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2898269
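
To make the two attention modules described in the abstract more concrete, the sketch below gives one plausible PyTorch-style reading: a harmonious attention block that rescales a frame-level feature map along spatial positions and channels, and a simple non-parametric k-reciprocal rule for selecting key frame features before averaging them into a clip-level descriptor. All module names, tensor shapes, and the specific reciprocal-neighbour rule (`HarmoniousAttention`, `k_reciprocal_clip_feature`, `k=3`) are illustrative assumptions; the paper's actual discriminative feature selection algorithm is not reproduced here.

```python
# Minimal sketch of the two attention ideas from the abstract (assumed design,
# not the authors' reference implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class HarmoniousAttention(nn.Module):
    """Recalibrates a frame-level feature map along channels and spatial positions."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel-wise attention: squeeze spatial dims, then excite channels.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        # Position-wise attention: 1x1 conv giving one weight per spatial location.
        self.spatial_conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, _, _ = x.shape
        channel_w = self.channel_fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        spatial_w = torch.sigmoid(self.spatial_conv(x))      # (B, 1, H, W)
        return x * channel_w * spatial_w


def k_reciprocal_clip_feature(frame_feats, k=3):
    """Aggregate frame-level features (T, D) into one clip-level feature (D,).

    A frame is kept as a 'key' frame if every one of its k nearest-neighbour
    frames also lists it among their own k nearest neighbours (k-reciprocal),
    a non-parametric way to drop outlier frames before averaging.
    """
    t = frame_feats.size(0)
    k = min(k, t - 1)
    normed = F.normalize(frame_feats, dim=1)
    sim = normed @ normed.t()                                # cosine similarities
    sim.fill_diagonal_(-float("inf"))                        # ignore self-matches
    knn = sim.topk(k, dim=1).indices                         # (T, k) neighbours per frame
    keep = torch.zeros(t, dtype=torch.bool)
    for i in range(t):
        keep[i] = all(i in knn[j] for j in knn[i])
    if not keep.any():                                       # fall back to plain averaging
        keep[:] = True
    return frame_feats[keep].mean(dim=0)
```

Under these assumptions, each frame map would pass through `HarmoniousAttention` before spatial pooling, and a clip of T pooled frame features would then be summarised as `k_reciprocal_clip_feature(feats)`.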