SCGTracker: Spatio-temporal correlation and graph neural networks for multiple object tracking

Multiple Object Tracking is an important vitally important fundamental task in computer vision. Visual tracking becomes challenging when objects move in groups and are obscured from each other. There are two mainstream solution strategies for these group models. One is to transform the data associat...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Pattern recognition 2024-05, Vol.149, p.110249, Article 110249
Hauptverfasser: Zhang, Yajuan, Liang, Yongquan, Leng, Jiaxu, Wang, Zhihui
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Multiple Object Tracking is an important vitally important fundamental task in computer vision. Visual tracking becomes challenging when objects move in groups and are obscured from each other. There are two mainstream solution strategies for these group models. One is to transform the data association problem into a graph matching problem for solving, while the other is to apply the social power model as an advanced constraint for group tracking. In the former case, the solving difficulty geometric growth as the number of tracked objects increases, and the computing efficiency for real-time tracking demand cannot be met. The latter strategy tends to set up fixed-size groups or offline training rules, resulting in a lack of flexibility that limits their scenario generalization. According to the shortcomings of existing methods, this paper proposes a novel multiple object tracking method with spatio-temporal correlation and graph neural networks. Firstly, the relational features of the historical trajectories are extracted through the spatio-temporal relationship learning module, which models the spatio-temporal correlations of the objects and dynamically constructs the group structure online. Then, the graph neural network is combined with appearance and motion information, and the similarity between each detection and tracklet is used as a weight in node feature aggregation to make powerful distinctions between node features. Meanwhile, the spatio-temporal correlation method is also used to solve target loss issues caused by occlusion. Even collocated with linearly assigned data association method, good tracking results are still achieved, with a low computational complexity. Experiments on three challenging public datasets, namely MOT16, MOT17, and MOT20, validated the accuracy and efficiency of the proposed tracking method. •Spatio-temporal correlation is modeled based on historical trajectories.•Undetected objects can be tracked with spatio-temporal correlation constraints.•Appearance and motion information are used to select graph nodes for aggregation.•Secondary matching enhances target association between detections and tracklets.
ISSN:0031-3203
1873-5142
DOI:10.1016/j.patcog.2023.110249