Learning Identity-Consistent Feature for Cross-Modality Person Re-Identification via Pixel and Feature Alignment

Bibliographic Details
Published in: Mobile Information Systems, 2022-10, Vol. 2022, p. 1-9
Authors: Chan, Sixian; Du, Feng; Lei, Yanjing; Lai, Zhounian; Mao, Jiafa; Li, Chao
Format: Article
Language: English
Online access: Full text
Abstract: RGB-IR cross-modality person re-identification (ReID) can be seen as a multicamera retrieval problem that aims to match pedestrian images captured by visible and infrared cameras. Most existing methods focus on reducing modality differences through feature representation learning. However, they ignore the large difference between the two modalities in pixel space. Unlike these methods, in this paper we propose the pixel and feature alignment network (PFANet), which reduces modality differences in pixel space while aligning features in feature space. Our model contains three components: a feature extractor, a generator, and a joint discriminator. As in previous methods, the generator and the joint discriminator are used to generate high-quality cross-modality images; however, we make substantial improvements to the feature extraction module. First, we fuse batch normalization and global attention (BNG), which attends to channel information while modeling interactions between channels and spatial positions. Second, to alleviate the modality difference in feature space, we propose the modal mitigation module (MMM). By jointly training the entire model, we not only mitigate cross-modality and intramodality variations but also learn identity-consistent features. Finally, extensive experimental results show that our model outperforms competing methods. On the SYSU-MM01 dataset, our model achieves a rank-1 accuracy of 40.83% and an mAP of 39.84%.
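Illustrative note: the abstract does not specify how the BNG module is implemented. The sketch below is only one possible PyTorch interpretation, assuming a CBAM-style combination of batch normalization with channel and spatial attention; the class name BNGBlock, the reduction factor, and all layer choices are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn


class BNGBlock(nn.Module):
    """Hypothetical sketch of a batch-norm + global-attention (BNG) style block.

    Not the authors' implementation; it only illustrates combining batch
    normalization with channel attention and spatial attention so that both
    channel information and channel-space interactions are modeled.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        # Channel attention: global average pooling followed by a small bottleneck MLP.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: a 7x7 convolution over pooled channel statistics.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.bn(x)
        x = x * self.channel_mlp(x)                    # re-weight channels
        avg_map = x.mean(dim=1, keepdim=True)          # per-position channel mean
        max_map, _ = x.max(dim=1, keepdim=True)        # per-position channel max
        x = x * self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x


if __name__ == "__main__":
    feats = torch.randn(4, 256, 24, 12)   # dummy backbone feature map
    out = BNGBlock(256)(feats)
    print(out.shape)                       # torch.Size([4, 256, 24, 12])
```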
ISSN: 1574-017X, 1875-905X
DOI: 10.1155/2022/4131322