Tran-GCN: A Transformer-Enhanced Graph Convolutional Network for Person Re-Identification in Monitoring Videos
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Person Re-Identification (Re-ID) has gained popularity in computer vision, enabling cross-camera pedestrian recognition. Although the development of deep learning has provided a robust technical foundation for person Re-ID research, most existing person Re-ID methods overlook the potential relationships among local person features and fail to adequately address the impact of pedestrian pose variations and the occlusion of local body parts. We therefore propose a Transformer-enhanced Graph Convolutional Network (Tran-GCN) model to improve person Re-ID performance in monitoring videos. The model comprises four key components: (1) a Pose Estimation Learning branch estimates pedestrian pose information and inherent skeletal structure data, extracting pedestrian key point information; (2) a Transformer Learning branch learns the global dependencies between fine-grained, semantically meaningful local person features; (3) a Convolution Learning branch uses the basic ResNet architecture to extract fine-grained local person features; (4) a Graph Convolutional Module (GCM) fuses the local feature information, global feature information, and body information for more effective person identification. Quantitative and qualitative experiments on three datasets (Market-1501, DukeMTMC-ReID, and MSMT17) demonstrate that Tran-GCN captures discriminative person features in monitoring videos more accurately, significantly improving identification accuracy.
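The abstract names four components; the following PyTorch sketch shows one plausible way such a four-branch design could be wired together. Everything here is an assumption for illustration: the class names (TranGCNSketch, SimpleGCNLayer), the feature dimension, the stand-in pose embedding (the paper uses a dedicated pose-estimation branch), and the graph/adjacency construction are not taken from the paper.

```python
# Illustrative sketch only; module names, dimensions, and the fusion scheme are
# assumptions and do not reproduce the paper's actual implementation.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: aggregate node features through a
    normalized adjacency matrix, then apply a linear projection."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (B, N, in_dim), adj: (B, N, N) row-normalized adjacency
        return torch.relu(self.proj(adj @ x))

class TranGCNSketch(nn.Module):
    def __init__(self, num_ids=751, num_joints=17, dim=256):
        super().__init__()
        # (3) Convolution branch: ResNet backbone for fine-grained local features.
        backbone = resnet50(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])   # -> (B, 2048, h, w)
        self.local_proj = nn.Conv2d(2048, dim, kernel_size=1)
        # (2) Transformer branch: global dependencies among local part tokens.
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=2)
        # (1) Pose branch stand-in: assume keypoint coordinates come from an
        # external pose estimator and embed each joint as a graph node.
        self.pose_embed = nn.Linear(2, dim)
        # (4) GCM: fuse local, global, and body information over a graph.
        self.gcn = SimpleGCNLayer(dim, dim)
        self.classifier = nn.Linear(dim, num_ids)

    def forward(self, images, keypoints, adj):
        # images: (B, 3, H, W); keypoints: (B, J, 2) normalized joint coords;
        # adj: (B, N, N) adjacency over the N = (h*w parts + J joints) nodes.
        fmap = self.local_proj(self.cnn(images))           # (B, dim, h, w)
        parts = fmap.flatten(2).transpose(1, 2)            # (B, h*w, dim) local tokens
        global_feats = self.transformer(parts)             # (B, h*w, dim)
        pose_feats = self.pose_embed(keypoints)            # (B, J, dim)
        nodes = torch.cat([global_feats, pose_feats], 1)   # graph nodes
        fused = self.gcn(nodes, adj).mean(dim=1)           # (B, dim) pooled descriptor
        return self.classifier(fused)

# Usage: 256x128 inputs give an 8x4 feature map, so N = 32 + 17 = 49 nodes;
# num_ids=751 matches the Market-1501 training identity count.
model = TranGCNSketch()
imgs = torch.randn(2, 3, 256, 128)
kpts = torch.rand(2, 17, 2)
adj = torch.softmax(torch.randn(2, 49, 49), dim=-1)  # placeholder normalized adjacency
logits = model(imgs, kpts, adj)                      # (2, 751) identity logits
```

Treating both the CNN part tokens and the pose keypoints as nodes of a single graph lets the GCM mix appearance and body-structure cues in one fusion step, which is the role the abstract assigns to the GCM.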
DOI: 10.48550/arxiv.2409.09391