DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Semantic segmentation is an effective way to perform scene understanding.
Recently, segmentation in 3D Bird's Eye View (BEV) space has become popular as
it is directly used by the drive policy. However, there is limited work on BEV
segmentation for surround-view fisheye cameras, commonly used in commercial
vehicles. As this task has no real-world public dataset and existing synthetic
datasets do not handle amodal regions due to occlusion, we create a synthetic
dataset using the Cognata simulator comprising diverse road types, weather, and
lighting conditions. We generalize the BEV segmentation to work with any camera
model; this is useful for mixing diverse cameras. We implement a baseline by
applying cylindrical rectification on the fisheye images and using a standard
LSS-based BEV segmentation model. We demonstrate that we can achieve better
performance without undistortion, which has the adverse effects of increased
runtime due to pre-processing, reduced field-of-view, and resampling artifacts.
Further, we introduce a distortion-aware learnable BEV pooling strategy that is
more effective for the fisheye cameras. We extend the model with an occlusion
reasoning module, which is critical for estimating in BEV space. Qualitative
performance of DaF-BEVSeg is showcased in the video at
https://streamable.com/ge4v51. |
---|---|
DOI: | 10.48550/arxiv.2404.06352 |
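
The summary states that the BEV segmentation is generalized to work with any camera model, so that fisheye images need not be undistorted first. As a rough illustration of what a camera-model-agnostic interface could look like, the sketch below pairs a projection function with its inverse for an equidistant fisheye model (r = f · θ). The equidistant model, the function names `project_equidistant` and `unproject_equidistant`, and the intrinsics used are all assumptions made for this example; the record does not say which model or API the authors use.

```python
import math


def project_equidistant(point, f, cx, cy):
    """Project a 3D point (camera frame, z forward) to a pixel.

    Assumes the equidistant fisheye model r = f * theta, where theta is
    the angle between the viewing ray and the optical axis.
    """
    x, y, z = point
    theta = math.atan2(math.hypot(x, y), z)  # angle from the optical axis
    phi = math.atan2(y, x)                   # azimuth around the axis
    r = f * theta                            # radial distance in pixels
    return cx + r * math.cos(phi), cy + r * math.sin(phi)


def unproject_equidistant(pixel, f, cx, cy):
    """Return the unit viewing ray for a pixel under the same model."""
    u, v = pixel
    dx, dy = u - cx, v - cy
    theta = math.hypot(dx, dy) / f           # invert r = f * theta
    phi = math.atan2(dy, dx)
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))


if __name__ == "__main__":
    # Hypothetical intrinsics, chosen only for illustration.
    f, cx, cy = 300.0, 640.0, 480.0
    # A point slightly behind the image plane: a pinhole model cannot
    # project it, but the equidistant model can (theta > 90 degrees).
    pixel = project_equidistant((1.0, 0.0, -0.1), f, cx, cy)
    print(pixel, unproject_equidistant(pixel, f, cx, cy))
```

A lift-splat style pipeline only needs the per-pixel viewing ray to place image features along depth, so swapping in a different `unproject_*` function is, in principle, enough to support another camera type without rectifying the image.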