Video-based Person re-identification with parallel correction and fusion of pedestrian area features

Deep learning has provided powerful support for person re-identification (person re-id) over the years, and superior performance has been achieved by state-of-the-art. While under practical application scenarios such as public monitoring, the cameras' resolutions are usually 720p, the captured...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Mathematical biosciences and engineering : MBE 2023-01, Vol.20 (2), p.3504-3527
Hauptverfasser:	She, Liang, You, Meiyue, Wang, Jianyuan, Zeng, Yangyan
Format:	Artikel
Sprache:	eng
Schlagworte:	alignment deep learning feature fusion person re-identification pixel attention
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Deep learning has provided powerful support for person re-identification (person re-id) over the years, and superior performance has been achieved by state-of-the-art. While under practical application scenarios such as public monitoring, the cameras' resolutions are usually 720p, the captured pedestrian areas tend to be closer to 128×64 small pixel size. Research on person re-id at 128×64 small pixel size is limited by less effective pixel information. The frame image qualities are degraded and inter-frame information complementation requires a more careful selection of beneficial frames. Meanwhile, there are various large differences in person images, such as misalignment and image noise, which are harder to distinguish from person information at the small size, and eliminating a specific sub-variance is still not robust enough. The Person Feature Correction and Fusion Network (FCFNet) proposed in this paper introduces three sub-modules, which strive to extract discriminate video-level features from the perspectives of "using complementary valid information between frames" and "correcting large variances of person features". The inter-frame attention mechanism is introduced through frame quality assessment, guiding informative features to dominate the fusion process and generating a preliminary frame quality score to filter low-quality frames. Two other feature correction modules are fitted to optimize the model's ability to perceive information from small-sized images. The experiments on four benchmark datasets confirm the effectiveness of FCFNet.
ISSN:	1551-0018 1551-0018
DOI:	10.3934/mbe.2023164