A triple-path global–local feature complementary network for visible-infrared person re-identification

Bibliographic details
Published in: Signal, Image and Video Processing, 2024-02, Vol. 18 (1), p. 911-921
Main authors: Guo, Jiangtao; Ye, Yanfang; Du, Haishun; Hao, Xinxin
Format: Article
Language: English
Description
Abstract: Cross-modality visible-infrared person re-identification (VI-ReID) aims to match visible and infrared images of pedestrians captured by different cameras. Most existing VI-ReID methods learn global features of pedestrians from the original image subspace. However, these methods are susceptible to background clutter and do not explicitly handle the discrepancy between the two modalities. In addition, some local-based person re-identification methods extract the local features of pedestrians by slicing pedestrian feature maps. However, most of them simply concatenate these local features to obtain the final local features of pedestrians, ignoring the relative importance of each local feature. To this end, we propose a triple-path global–local feature complementary network (TGLFC-Net). Specifically, we introduce intermediate modality images to weaken the impact of modality discrepancy and thus obtain robust global features of pedestrians. Moreover, we design a local comprehensive discriminative feature mining module, which improves the network’s capability of mining the local comprehensive discriminative features of pedestrians by performing dynamic weighted fusion of local features. Since the final representations of pedestrians incorporate both the robust global features and the local comprehensive discriminative features, they have stronger robustness and discriminative capability. We also design a weighted regularization center triplet loss, which can not only eliminate the negative impact of anomalous triplets, but also reduce the computational complexity of the network. Experimental results on the RegDB and SYSU-MM01 datasets demonstrate that TGLFC-Net can achieve a satisfactory VI-ReID performance. In particular, it achieves 92.36% Rank-1 accuracy and 80.32% mAP on the RegDB dataset.
ISSN: 1863-1703, 1863-1711
DOI: 10.1007/s11760-023-02789-4
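
The abstract contrasts plain concatenation of local features with a dynamic weighted fusion, but it does not give the formulation. The following is only a minimal sketch of one common way such a fusion could work, not the authors' module: each local (part) feature receives a score, the scores are normalized with a softmax, and the fused feature is the weighted sum. The function names, the dot-product scoring, and the `scoring_vector` parameter are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_local_features(local_features, scoring_vector):
    """Fuse part-level feature vectors by a softmax-weighted sum.

    local_features: list of equal-length feature vectors, one per body part.
    scoring_vector: hypothetical learned vector (an assumption, not from the
        paper); the dot product with each part feature gives that part's score.
    Returns (fused_feature, weights).
    """
    scores = [
        sum(f * w for f, w in zip(feature, scoring_vector))
        for feature in local_features
    ]
    weights = softmax(scores)
    dim = len(scoring_vector)
    fused = [
        sum(weights[p] * local_features[p][d] for p in range(len(local_features)))
        for d in range(dim)
    ]
    return fused, weights


if __name__ == "__main__":
    # Three toy part features of dimension 2 (made-up values).
    parts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    fused, weights = fuse_local_features(parts, scoring_vector=[10.0, 0.0])
    print("weights:", weights)
    print("fused:", fused)
```

With an all-zero scoring vector every part receives the same weight and the fusion reduces to a plain average; a non-zero vector lets some parts dominate, which is the behavior the abstract attributes to weighting local features by importance.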