Text-based person search by non-saliency enhancing and dynamic label smoothing
The current text-based person re-identification (re-ID) models tend to learn salient features of image and text, which however is prone to failure in identifying persons with very similar dress, because their image contents with observable but indescribable difference may have identical textual desc...
Gespeichert in:
Veröffentlicht in: | Neural computing & applications 2024-07, Vol.36 (21), p.13327-13339 |
---|---|
Hauptverfasser: | , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The current text-based person re-identification (re-ID) models tend to learn salient features of image and text, which however is prone to failure in identifying persons with very similar dress, because their image contents with observable but indescribable difference may have identical textual description. To address this problem, we propose a re-ID model based on saliency masking to learn non-salient but highly discriminative features, which can work together with the salient features to provide more robust pedestrian identification. To further improve the performance of the model, a cross-modal projection matching loss with dynamic label smoothing (named CMPM-DS) is proposed to train our model, and our CMPM-DS can adaptively adjust the smoothing degree of the true distribution. We conduct extensive ablation and comparison experiments on two popular re-ID benchmarks to demonstrate the efficiency of our model and loss function, and our model achieves SOTA, improving the existing best R@1 by 0.33% on CUHK-PEDE and 4.45% on RSTPReID. |
---|---|
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-024-09691-1 |