End-to-end pedestrian trajectory prediction via Efficient Multi-modal Predictors

Bibliographic details
Published in:	Computer vision and image understanding 2024-11, Vol.248, p.104107, Article 104107
Main authors:	Wu, Qi; Zhou, Sanping; Wang, Le; Shi, Liushuai; Dong, Yonghao; Hua, Gang
Format:	Article
Language:	English
Subjects:	Multimodal prediction; Parallel predictors; Pedestrian trajectory prediction
Online access:	Full text

Description
Summary:	Pedestrian trajectory prediction plays a key role in understanding human behavior and guiding autonomous driving. It is a difficult task because human motion is inherently multi-modal. Recent advances have mainly focused on modeling this multi-modality, using either implicit generative models or explicit pre-defined anchors. However, the former is limited by the sampling problem, while the latter imposes a strong prior on the data, and both require extra tricks to achieve better performance. To address these issues, we propose a simple yet effective framework called Efficient Multi-modal Predictors (EMP), which casts off the generative paradigm and predicts multi-modal trajectories end to end. This is achieved by combining a set of parallel predictors with a sparse selector based on model error. During training, the set of parallel predictors converges into disjoint subsets, each specializing in one mode, which yields multi-modal prediction without human priors and mitigates the problems of both approaches above. Experiments on the SDD, ETH-UCY, and NBA datasets show that EMP achieves state-of-the-art performance with the highest inference speed. Additionally, we show that replacing the multi-modal modules of state-of-the-art methods with EMP lets them outperform their baselines, which further validates the versatility of EMP. Moreover, we formally prove that EMP can alleviate mode collapse and has a low test error bound.
• We proposed EMP, an end-to-end multi-modal framework achieving accurate prediction with parallel predictors.
• EMP’s posterior policy eliminates the need for implicit distributions and anchors.
• EMP achieved state-of-the-art performance on benchmarks and improved performance on four typical models.
ISSN:	1077-3142
DOI:	10.1016/j.cviu.2024.104107
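
Illustrative note: the summary describes combining parallel predictors with a sparse selector based on model error, but gives no formula for either. The following is a minimal sketch of one plausible reading, in which only the predictors with the lowest error against the ground truth contribute to the training loss. The names `average_displacement_error`, `sparse_select`, and `top_m`, and the choice of average displacement error as the error measure, are assumptions made for illustration and are not taken from the article.

```python
import math


def average_displacement_error(pred, target):
    """Mean Euclidean distance between predicted and true (x, y) points."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(target)


def sparse_select(predictions, target, top_m=1):
    """Pick the top_m predictors with the lowest error (assumed selector).

    Returns (loss, selected), where `selected` lists predictor indices in
    order of increasing error and `loss` averages only their errors, so
    the remaining predictors would receive no gradient for this sample.
    """
    errors = [average_displacement_error(p, target) for p in predictions]
    ranked = sorted(range(len(errors)), key=errors.__getitem__)
    selected = ranked[:top_m]
    loss = sum(errors[i] for i in selected) / len(selected)
    return loss, selected


# Toy example: three parallel predictors, one two-step future trajectory.
target = [(1.0, 0.0), (2.0, 0.0)]
predictions = [
    [(1.0, 1.0), (2.0, 1.0)],  # predictor 0: shifted by 1 in y
    [(1.0, 0.0), (2.0, 0.0)],  # predictor 1: matches the target
    [(0.0, 0.0), (0.0, 0.0)],  # predictor 2: stays at the origin
]
loss, selected = sparse_select(predictions, target)
print(loss, selected)
```

Under this assumed scheme, each training sample updates only the predictors that already fit it best, which is one way a set of predictors could drift toward specializing in different modes. Whether EMP uses this exact rule is not stated in the record.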