Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Visible-infrared (V-I) person re-identification (ReID) seeks to retrieve images of the same individual captured over a distributed network of RGB (visible, V) and infrared (I) sensors. Several V-I ReID approaches directly integrate both V and I modalities to discriminate persons within a shared representation space. However, given the significant gap between the data distributions of the V and I modalities, cross-modal V-I ReID remains challenging. Some recent approaches improve generalization by leveraging intermediate spaces that can bridge the V and I modalities, yet effective methods are required to select or generate data for such informative domains. In this paper, the Adaptive Generation of Privileged Intermediate Information (AGPI^2) training approach is introduced to adapt and generate a virtual domain that bridges discriminative information between the V and I modalities. The key motivation behind AGPI^2 is to enhance the training of a deep V-I ReID backbone by generating privileged images that provide additional information. These privileged images capture shared discriminative features that are not easily accessible within the original V or I modality alone. Toward this goal, a non-linear generative module is trained with an adversarial objective to translate V images into an intermediate space with a smaller domain shift with respect to the I domain. Meanwhile, the embedding module within AGPI^2 aims to produce similar features for both V and generated images, encouraging the extraction of features that are common to all modalities. In addition, AGPI^2 employs adversarial objectives for adapting the intermediate images, which play a crucial role in creating a non-modality-specific space that addresses the large domain shift between the V and I domains. Experimental results on challenging V-I ReID datasets indicate that AGPI^2 increases matching accuracy without requiring extra computational resources during inference.
DOI: 10.48550/arxiv.2307.03240
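
The abstract describes the training setup at a high level: a generative module is adversarially adapted so that translated V images land in an intermediate domain closer to I, while the embedding module is trained to produce similar features for V images and their generated counterparts. The snippet below is a minimal PyTorch-style sketch of that idea only, assuming V and I batches that share identity labels; the module names (IntermediateGenerator, DomainDiscriminator), network sizes, detaching strategy, and loss weighting are illustrative assumptions, not the authors' implementation (see the DOI above for the paper).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class IntermediateGenerator(nn.Module):
    """Non-linear module translating visible (V) images toward an intermediate domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)


class DomainDiscriminator(nn.Module):
    """Scores how much an image resembles a real infrared (I) image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.net(x)


def training_step(v_imgs, i_imgs, labels, G, D, E, classifier, opt_g, opt_d, opt_e):
    """One simplified step. Assumes v_imgs and i_imgs carry the same identity labels,
    and that opt_e optimizes both the embedding backbone E (e.g. a ReID CNN with its
    head removed) and the identity classifier."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(i_imgs.size(0), 1)
    fake = torch.zeros(v_imgs.size(0), 1)

    # 1) Discriminator learns to separate real I images from generated intermediates.
    z_imgs = G(v_imgs).detach()
    d_loss = bce(D(i_imgs), real) + bce(D(z_imgs), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Generator is pushed adversarially so its outputs look like the I domain,
    #    i.e. the intermediate images reduce the domain shift w.r.t. infrared.
    z_imgs = G(v_imgs)
    g_loss = bce(D(z_imgs), torch.ones(v_imgs.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # 3) Embedding module: identity loss on V, intermediate, and I images, plus an
    #    alignment term pulling V and intermediate features together so the backbone
    #    keeps modality-shared, discriminative information.
    feats = [E(x) for x in (v_imgs, G(v_imgs).detach(), i_imgs)]
    id_loss = sum(F.cross_entropy(classifier(f), labels) for f in feats)
    align_loss = F.mse_loss(feats[0], feats[1])
    e_loss = id_loss + align_loss
    opt_e.zero_grad(); e_loss.backward(); opt_e.step()

    return d_loss.item(), g_loss.item(), e_loss.item()
```

Detaching the generated images before the embedding losses keeps the adversarial adaptation and the feature-alignment objectives separate in this sketch; the actual AGPI^2 training may couple the generator and embedding modules differently.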