MonStereo: When Monocular and Stereo Meet at the Tail of 3D Human Localization
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Monocular and stereo visions are cost-effective solutions for 3D human
localization in the context of self-driving cars or social robots. However,
they are usually developed independently and have their respective strengths
and limitations. We propose a novel unified learning framework that leverages
the strengths of both monocular and stereo cues for 3D human localization. Our
method jointly (i) associates humans in left-right images, (ii) deals with
occluded and distant cases in stereo settings by relying on the robustness of
monocular cues, and (iii) tackles the intrinsic ambiguity of monocular
perspective projection by exploiting prior knowledge of the human height
distribution. We specifically evaluate outliers as well as challenging
instances, such as occluded and far-away pedestrians, by analyzing the entire
error distribution and by estimating calibrated confidence intervals. Finally,
we critically review the official KITTI 3D metrics and propose a practical 3D
localization metric tailored for humans. |
DOI: | 10.48550/arxiv.2008.10913 |
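
The summary refers to the intrinsic ambiguity of monocular perspective projection and to stereo cues. The sketch below illustrates both with the textbook pinhole camera relations only; it is not the method of the paper. The function names, the focal length, the baseline, and the pixel measurements are all hypothetical values chosen for illustration.

```python
# Illustrative pinhole-camera sketch (NOT the MonStereo method).
# All names and numeric values here are assumptions made for this example.

def monocular_distance(focal_px: float, height_m: float, height_px: float) -> float:
    """Distance implied by an assumed real height and an observed pixel height."""
    return focal_px * height_m / height_px


def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from the horizontal disparity between rectified left and right images."""
    return focal_px * baseline_m / disparity_px


FOCAL_PX = 720.0       # assumed focal length in pixels
BASELINE_M = 0.54      # assumed stereo baseline in metres
BOX_HEIGHT_PX = 60.0   # assumed pixel height of a detected person

if __name__ == "__main__":
    # Monocular: the same pixel height maps to different distances,
    # depending on how tall the person is assumed to be.
    for assumed_height_m in (1.60, 1.70, 1.80):
        d = monocular_distance(FOCAL_PX, assumed_height_m, BOX_HEIGHT_PX)
        print(f"assumed height {assumed_height_m:.2f} m -> distance {d:.2f} m")

    # Stereo: depth follows from disparity, provided the person was
    # correctly associated across the left and right images.
    z = stereo_depth(FOCAL_PX, BASELINE_M, disparity_px=19.44)
    print(f"stereo depth: {z:.2f} m")
```

With these assumed values, varying the assumed height from 1.60 m to 1.80 m shifts the monocular distance by about 2.4 m, which is the kind of ambiguity a height prior can bound but not remove. The stereo relation does not depend on the person's height, but it does depend on a reliable left-right association and a measurable disparity.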