Attribute-Image Person Re-identification via Modal-Consistent Metric Learning

Bibliographic details
Published in: International journal of computer vision 2023-11, Vol. 131 (11), p. 2959-2976
Main authors: Zhu, Jianqing; Liu, Liu; Zhan, Yibing; Zhu, Xiaobin; Zeng, Huanqiang; Tao, Dacheng
Format: Article
Language: English
Description
Abstract: Attribute-image person re-identification (AIPR) is a cross-modal retrieval task that searches for images of persons who match a list of attributes. Because of the large modal gap between attributes and images, current AIPR methods generally depend on cross-modal feature alignment, but they pay too little attention to similarity metric jitters among varying modal configurations (i.e., attribute probe vs. image gallery, image probe vs. attribute gallery, image probe vs. image gallery, and attribute probe vs. attribute gallery). In this paper, we propose a modal-consistent metric learning (MCML) method that stably measures comprehensive similarities between attributes and images. MCML differs from previous methods in two significant ways. First, MCML provides a complete multi-modal triplet (CMMT) loss function that pulls the farthest positive pair as close together as possible while pushing the nearest negative pair as far apart as possible, regardless of their modalities. Second, MCML develops a modal-consistent matching regularization (MCMR) to reduce the diversity of matching matrices and guide consistent matching behaviors across varying modal configurations. MCML thus integrates the CMMT loss function and MCMR and requires no complex cross-modal feature alignment. Theoretically, we derive a generalization bound that establishes the stability of the MCML model by applying on-average stability. Experimentally, extensive results on the PETA and Market-1501 datasets show that the proposed MCML is superior to state-of-the-art approaches.
ISSN: 0920-5691, 1573-1405
DOI: 10.1007/s11263-023-01841-7
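
Illustrative sketch (not taken from the article): the abstract describes the CMMT loss only in words, so the following is a minimal hard-mining triplet loss written under stated assumptions rather than the authors' formulation. It assumes attribute and image embeddings already live in one shared space and are pooled into a single batch, uses Euclidean distance, mines the farthest positive and nearest negative per anchor, and uses a hypothetical margin of 0.3. The function name `hardest_triplet_loss` and all parameters are invented for illustration; the MCMR regularizer is not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hardest_triplet_loss(embeddings, labels, margin=0.3):
    """Hard-mining triplet loss over a pooled, modality-agnostic batch.

    embeddings: list of float vectors; attribute and image embeddings
        are mixed together (assumption: one shared embedding space).
    labels: identity label of each embedding.
    margin: hypothetical margin value, not taken from the article.
    """
    losses = []
    for i, anchor in enumerate(embeddings):
        # Distances to all other samples of the same identity.
        pos = [euclidean(anchor, e)
               for j, e in enumerate(embeddings)
               if j != i and labels[j] == labels[i]]
        # Distances to all samples of a different identity.
        neg = [euclidean(anchor, e)
               for j, e in enumerate(embeddings)
               if labels[j] != labels[i]]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        # Farthest positive vs. nearest negative, whatever their modality.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0
```

Because the mining step never checks which modality a sample comes from, the same code covers all four probe/gallery configurations listed in the abstract; that is the only property of the CMMT loss this sketch tries to reflect.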