Intra-Inter Domain Similarity for Unsupervised Person Re-Identification

Bibliographic details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024-03, Vol. 46 (3), p. 1711-1726
Main authors:	Xuan, Shiyu; Zhang, Shiliang
Format:	Article
Language:	English
Subjects:	Accuracy; Adaptation models; Cameras; Computation; Convolutional neural networks; Datasets; Domain generalization; Feature extraction; Labels; Person re-identification; Robustness; Similarity; Task analysis; Training; Unsupervised learning

Description
Summary:	Most unsupervised person re-identification (ReID) methods produce pseudo-labels by measuring feature similarity without considering the domain discrepancy among cameras, which degrades the accuracy of pseudo-labels computed across cameras. This paper addresses this challenge by decomposing the similarity computation into two stages: intra-domain and inter-domain. The intra-domain similarity directly leverages CNN features learned within each camera, and thus generates pseudo-labels on different cameras to train the ReID model in a multi-branch network. The inter-domain similarity treats the classification scores of each sample on different cameras as a new feature vector. This new feature effectively alleviates the domain discrepancy among cameras and generates more reliable pseudo-labels. We further propose Instance and Camera Style Normalization (ICSN) to enhance robustness to domain discrepancy. ICSN alleviates intra-camera variations by adaptively learning a combination of instance and batch normalization. ICSN also boosts robustness to inter-camera variations through TNorm, which converts the original style of features into target styles. The proposed method achieves competitive performance on multiple datasets under fully unsupervised, intra-camera supervised, and domain generalization settings; e.g., it achieves a rank-1 accuracy of 64.4% on the MSMT17 dataset, outperforming recent unsupervised methods by more than 20%.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2022.3163451
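
The inter-domain similarity described in the summary can be illustrated with a minimal sketch. It assumes each camera has its own classifier over that camera's pseudo-identities, and that a sample's new feature is the concatenation of its softmax scores from all camera classifiers. The function names, the example logits, and the use of cosine similarity to compare two score vectors are assumptions made for illustration; the record does not state which similarity measure the paper uses.

```python
import math


def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_feature(per_camera_logits):
    """Concatenate the softmax scores from every camera's classifier.

    per_camera_logits: one list of logits per camera (hypothetical input).
    """
    feature = []
    for logits in per_camera_logits:
        feature.extend(softmax(logits))
    return feature


def cosine_similarity(a, b):
    """Cosine similarity, used here only as an illustrative choice."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical logits: camera 1 has 3 pseudo-identities, camera 2 has 2.
sample_a = [[4.0, 0.5, 0.1], [0.2, 3.0]]
sample_b = [[3.5, 0.3, 0.4], [0.1, 2.5]]  # scored like sample_a
sample_c = [[0.1, 0.2, 4.0], [3.0, 0.3]]  # scored differently

feat_a = score_feature(sample_a)
feat_b = score_feature(sample_b)
feat_c = score_feature(sample_c)

sim_ab = cosine_similarity(feat_a, feat_b)
sim_ac = cosine_similarity(feat_a, feat_c)
print(f"sim(a, b) = {sim_ab:.3f}, sim(a, c) = {sim_ac:.3f}")
```

In this sketch, samples that the per-camera classifiers score alike receive a higher similarity than samples scored differently, regardless of which camera captured them. This is the intuition behind using classification scores, rather than raw CNN features, for cross-camera pseudo-labels.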