Text-based person search by non-saliency enhancing and dynamic label smoothing

Bibliographic details
Published in: Neural computing & applications 2024-07, Vol.36 (21), p.13327-13339
Main authors: Pang, Yonghua; Zhang, Canlong; Li, Zhixin; Wei, Chunrong; Wang, Zhiwen
Format: Article
Language: English
Description
Summary: Current text-based person re-identification (re-ID) models tend to learn salient features of images and text. This makes them prone to failure when identifying persons with very similar dress, because images with observable but indescribable differences may share an identical textual description. To address this problem, we propose a re-ID model based on saliency masking that learns non-salient but highly discriminative features, which work together with the salient features to provide more robust pedestrian identification. To further improve performance, we propose a cross-modal projection matching loss with dynamic label smoothing (CMPM-DS) to train the model; CMPM-DS adaptively adjusts the smoothing degree of the true distribution. We conduct extensive ablation and comparison experiments on two popular re-ID benchmarks to demonstrate the efficiency of our model and loss function. Our model achieves state-of-the-art (SOTA) results, improving the existing best R@1 by 0.33% on CUHK-PEDES and 4.45% on RSTPReID.
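The idea of smoothing the true distribution inside a cross-modal projection matching (CMPM) loss can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how CMPM-DS adapts the smoothing degree, so the smoothing factor alpha is simply passed in by the caller, and all function names here are hypothetical. The loss body follows the commonly used CMPM formulation (a KL divergence between the softmax over scalar projections and the normalized identity-match distribution), with uniform label smoothing swapped in for the target.

```python
import math

def normalize(v):
    # Scale a feature vector to unit L2 norm.
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def softmax(row):
    # Numerically stable softmax over one row of logits.
    m = max(row)
    exps = [math.exp(a - m) for a in row]
    total = sum(exps)
    return [e / total for e in exps]

def smoothed_targets(labels, i, alpha):
    # True matching distribution for sample i, mixed with a uniform
    # distribution. alpha is the smoothing degree (assumption: supplied by
    # the caller; the paper's adaptive rule is not given in the abstract).
    n = len(labels)
    matches = [1.0 if labels[j] == labels[i] else 0.0 for j in range(n)]
    total = sum(matches)
    return [(1.0 - alpha) * m / total + alpha / n for m in matches]

def cmpm_one_direction(queries, keys, labels, alpha, eps=1e-8):
    # KL(p || q) averaged over the batch, where p is the softmax over the
    # projections of each query onto the unit-normalized keys and q is the
    # smoothed true distribution.
    keys_n = [normalize(k) for k in keys]
    loss = 0.0
    for i, query in enumerate(queries):
        logits = [sum(a * b for a, b in zip(query, k)) for k in keys_n]
        p = softmax(logits)
        q = smoothed_targets(labels, i, alpha)
        loss += sum(pj * math.log(pj / (qj + eps))
                    for pj, qj in zip(p, q) if pj > 0.0)
    return loss / len(queries)

def cmpm_smoothed(image_feats, text_feats, labels, alpha):
    # Sum of the image-to-text and text-to-image directions.
    return (cmpm_one_direction(image_feats, text_feats, labels, alpha)
            + cmpm_one_direction(text_feats, image_feats, labels, alpha))

if __name__ == "__main__":
    images = [[5.0, 0.0], [0.0, 5.0]]   # toy image features
    texts = [[1.0, 0.0], [0.0, 1.0]]    # toy text features
    ids = [0, 1]                        # person identities
    print(cmpm_smoothed(images, texts, ids, alpha=0.1))
```

With alpha set to 0 the target reduces to the unsmoothed matching distribution; a dynamic variant would recompute alpha during training according to whatever rule the paper defines.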
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-024-09691-1