Video person re-identification based on convolutional neural network and Transformer

To solve the problem that person features are extracted poorly when only a convolutional neural network is used in video person re-identification, a network model, ResTNet (ResNet and Transformer Network), based on a convolutional neural network and a Transformer was proposed. The ResNet50 network was used...

Detailed description

Bibliographic details
Published in:	河南理工大学学报. 自然科学版 2023-01, Vol.42 (6), p.149
Main authors:	Zhao, Yanru; Niu, Dongjie; Sun, Donghong; Yang, Huimeng
Format:	Article
Language:	chi (Chinese)
Subjects:	Artificial neural networks; Datasets; Feature extraction; Feature maps; Neural networks; Pedestrians; Transformers
Online access:	Full text

Description
Summary:	To solve the problem that person features are extracted poorly when only a convolutional neural network is used in video person re-identification, a network model, ResTNet (ResNet and Transformer Network), based on a convolutional neural network and a Transformer was proposed. In ResTNet, the ResNet50 network was used to obtain local features, and the output of its middle layer was fed to the Transformer as prior knowledge. In the Transformer branch, the size of the feature map was progressively reduced, the receptive field was expanded, and the relationships among local features were fully explored to generate the global features of pedestrians, while the shifted-window method reduced the model's computation. Rank-1 and mAP on the large-scale MARS dataset reached 86.8% and 80.3%, respectively, which were 3.8% and 3.3% higher than the benchmark. Meanwhile, excellent performance was also achieved on the two small-scale datasets. In this paper, not only the Transformer model was successfully applied to the field of video p
ISSN:	1673-9787
DOI:	10.16186/j.cnki.1673-9787.2021120013
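
Note on the reported metrics: the summary quotes Rank-1 and mAP without defining them. The following is a minimal sketch of how these two retrieval metrics are commonly computed from a query-by-gallery distance matrix. It is not the authors' evaluation code. The function name `rank1_and_map` and the toy distances and identity labels are hypothetical, and the sketch omits the filtering of same-camera gallery samples that re-identification benchmark protocols usually apply.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a query-by-gallery distance matrix.

    dist[q][g] is the distance between query q and gallery item g;
    a smaller distance means a more similar pair.
    """
    rank1_hits = 0
    average_precisions = []
    for q, row in enumerate(dist):
        # Gallery indices ordered from most to least similar.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # no true match in the gallery: skip this query
        # Rank-1: is the single closest gallery item a correct match?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each correct match.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not average_precisions:
        raise ValueError("no query has a matching gallery item")
    n = len(average_precisions)
    return rank1_hits / n, sum(average_precisions) / n


# Toy data (hypothetical): 2 queries, 3 gallery items.
dist = [[0.1, 0.5, 0.9],
        [0.8, 0.2, 0.3]]
query_ids = [1, 2]
gallery_ids = [1, 2, 1]
rank1, mean_ap = rank1_and_map(dist, query_ids, gallery_ids)
```

Rank-1 only checks the top-ranked gallery item, whereas mAP rewards ranking every correct match early, which is why the two figures are reported together.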