MVLidarNet: Real-Time Multi-Class Scene Understanding for Autonomous Driving Using Multiple Views
Main authors: , , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Autonomous driving requires the inference of actionable information such as detecting and classifying objects, and determining the drivable space. To this end, we present Multi-View LidarNet (MVLidarNet), a two-stage deep neural network for multi-class object detection and drivable space segmentation using multiple views of a single LiDAR point cloud. The first stage processes the point cloud projected onto a perspective view in order to semantically segment the scene. The second stage then processes the point cloud (along with semantic labels from the first stage) projected onto a bird's eye view, to detect and classify objects. Both stages use an encoder-decoder architecture. We show that our multi-view, multi-stage, multi-class approach is able to detect and classify objects while simultaneously determining the drivable space using a single LiDAR scan as input, in challenging scenes with more than one hundred vehicles and pedestrians at a time. The system operates efficiently at 150 fps on an embedded GPU designed for a self-driving car, including a postprocessing step to maintain identities over time. We show results on both KITTI and a much larger internal dataset, thus demonstrating the method's ability to scale by an order of magnitude.
DOI: 10.48550/arxiv.2006.05518
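
The abstract describes a two-stage data flow: a perspective-view (range image) network segments the scene, its per-point class scores are reprojected into a bird's eye view, and those semantics are concatenated with the LiDAR BEV input for object detection. The sketch below renders that flow in PyTorch as a minimal, hypothetical example; the backbone layers, channel counts, grid sizes, and the index-based `reproject_semantics` helper are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the two-stage pipeline described in the abstract.
# All names, channel counts, and grid sizes are illustrative assumptions;
# none are taken from the MVLidarNet paper or any released code.
import torch
import torch.nn as nn


class EncoderDecoder(nn.Module):
    """Small stand-in for the encoder-decoder backbone both stages use."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_ch, 4, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def reproject_semantics(sem, pixel_idx, cell_idx, bev_h, bev_w):
    """Carry stage-1 class scores from the range image into the BEV grid.

    sem:       (1, C, H, W) semantic scores in the perspective view
    pixel_idx: (N,) flat index of each LiDAR point's range-image pixel
    cell_idx:  (N,) flat index of each point's BEV cell
    """
    _, c, h, w = sem.shape
    scores = sem.reshape(c, h * w)[:, pixel_idx]   # (C, N) per-point scores
    bev = torch.zeros(c, bev_h * bev_w)
    bev.index_add_(1, cell_idx, scores)            # points in one cell accumulate
    return bev.reshape(1, c, bev_h, bev_w)


if __name__ == "__main__":
    n_classes = 7                                            # assumed class count
    stage1 = EncoderDecoder(in_ch=2, out_ch=n_classes)       # perspective segmentation
    stage2 = EncoderDecoder(in_ch=2 + n_classes, out_ch=8)   # BEV detection head (8 assumed box/class channels)

    persp = torch.randn(1, 2, 64, 2048)   # height + intensity range image (assumed layout)
    bev_in = torch.randn(1, 2, 128, 128)  # height + intensity BEV rasterization (assumed)
    n_pts = 10_000                        # fake point-to-pixel / point-to-cell correspondences
    pixel_idx = torch.randint(0, 64 * 2048, (n_pts,))
    cell_idx = torch.randint(0, 128 * 128, (n_pts,))

    sem = stage1(persp)                                            # stage 1: segment the scene
    sem_bev = reproject_semantics(sem, pixel_idx, cell_idx, 128, 128)
    det = stage2(torch.cat([bev_in, sem_bev], dim=1))              # stage 2: detect objects
    print(sem.shape, det.shape)           # (1, 7, 64, 2048) and (1, 8, 128, 128)
```

The design point worth noting is that stage 2 consumes stage 1's semantics as extra input channels rather than re-extracting features from raw points, which is consistent with the single-scan, real-time operation the abstract reports.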