Track initialization and re-identification for 3D multi-view multi-object tracking

Detailed description

Bibliographic details
Published in:	Information Fusion, 2024-11, Vol. 111, p. 102496, Article 102496
Main authors:	Ma, Linh Van; Nguyen, Tran Thien Dat; Vo, Ba-Ngu; Jang, Hyunsung; Jeon, Moongu
Format:	Article
Language:	English
Subjects:	Adaptive birth; Generalized labeled multi-Bernoulli; Multi-object visual tracking; Multi-sensor; Multi-view; Occlusion handling; Re-identification
Online access:	Full text

Description
Abstract:	We propose a 3D multi-object tracking (MOT) solution using only 2D detections from monocular cameras, which automatically initiates/terminates tracks and resolves track appearance–reappearance and occlusions. Moreover, the approach does not require detector retraining when cameras are reconfigured; only the camera matrices of the reconfigured cameras need to be updated. Our approach is based on a Bayesian multi-object formulation that integrates track initiation/termination, re-identification, occlusion handling, and data association into a single Bayes filtering recursion. However, the exact filter that provides all these functionalities is numerically intractable due to the exponentially growing number of terms in the (multi-object) filtering density, while existing approximations trade off some of these functionalities for speed. To this end, we develop a more efficient approximation suitable for online MOT by incorporating object features and kinematics into the measurement model, which improves data association and subsequently reduces the number of terms. Specifically, we exploit the 2D detections and extracted features from multiple cameras to provide a better approximation of the multi-object filtering density and thereby realize the track initiation/termination and re-identification functionalities. Further, incorporating a tractable geometric occlusion model based on 2D projections of 3D objects on the camera planes realizes the occlusion handling functionality of the filter. Evaluation of the proposed solution on challenging datasets demonstrates significant improvements and robustness when camera configurations change on the fly, compared to existing multi-view MOT solutions.
Highlights:
•Novel 3D multi-object tracking models with re-identification features.
•A filter that performs 3D tracking by fusing multi-view 2D camera detections.
•Our method automatically initializes/terminates tracks, re-identifies objects, and handles occlusion.
•An efficient filter with linear complexity in the number of detections.
•Extensive experiments to evaluate the performance on challenging benchmark datasets.
ISSN:	1566-2535 1872-6305
DOI:	10.1016/j.inffus.2024.102496
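
Illustrative note (not part of the catalog record or the article): the abstract states that reconfiguring a camera only requires updating its camera matrix. A minimal sketch of why this can work, assuming the standard pinhole model in which a 3x4 camera matrix maps a 3D point to a 2D image point. The function name `project`, the matrices, and the test point are hypothetical values chosen for illustration; this is not the authors' implementation.

```python
def project(P, X):
    """Project a 3D point X = (x, y, z) to image coordinates using a
    3x4 camera matrix P (standard pinhole model, homogeneous coordinates)."""
    xh = (X[0], X[1], X[2], 1.0)  # homogeneous 3D point
    # Multiply each row of P by the homogeneous point.
    u, v, w = (sum(p * x for p, x in zip(row, xh)) for row in P)
    if w == 0:
        raise ValueError("point lies on the camera's principal plane")
    return (u / w, v / w)


# Hypothetical camera: focal length 1000 px, principal point (640, 360),
# located at the world origin and looking along +z.
P_before = [
    [1000.0, 0.0, 640.0, 0.0],
    [0.0, 1000.0, 360.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Same camera after a hypothetical reconfiguration (moved 1 m along +x).
# Only the last column of the matrix changes.
P_after = [
    [1000.0, 0.0, 640.0, -1000.0],
    [0.0, 1000.0, 360.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

X = (0.5, -0.2, 4.0)  # one 3D object position, unchanged by the camera move

print(project(P_before, X))  # (765.0, 310.0)
print(project(P_after, X))   # (515.0, 310.0)
```

The 3D state stays the same while the expected 2D detection changes with the matrix, which is consistent with the abstract's claim that the detector itself need not be retrained.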