The revenge of BiSeNet: Efficient Multi-Task Image Segmentation
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Recent advancements in image segmentation have focused on enhancing the
efficiency of the models to meet the demands of real-time applications,
especially on edge devices. However, existing research has primarily
concentrated on single-task settings, especially on semantic segmentation,
leading to redundant efforts and specialized architectures for different tasks.
To address this limitation, we propose a novel architecture for efficient
multi-task image segmentation, capable of handling various segmentation tasks
without sacrificing efficiency or accuracy. We introduce BiSeNetFormer, which
leverages the efficiency of two-stream semantic segmentation architectures and
extends them into a mask classification framework. Our approach maintains
the efficient spatial and context paths to capture detailed and semantic
information, respectively, while leveraging an efficient transformer-based
segmentation head that computes the binary masks and class probabilities. By
seamlessly supporting multiple tasks, namely semantic and panoptic
segmentation, BiSeNetFormer offers a versatile solution for multi-task
segmentation. We evaluate our approach on two popular datasets, Cityscapes and
ADE20K, demonstrating impressive inference speeds while maintaining competitive
accuracy compared to state-of-the-art architectures. Our results indicate that
BiSeNetFormer represents a significant advancement towards fast, efficient,
multi-task segmentation networks, bridging the gap between model efficiency and
task adaptability. |
---|---|
DOI: | 10.48550/arxiv.2404.09570 |
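
The summary describes a segmentation head that outputs binary masks and class probabilities. As a rough illustration only, the sketch below shows one common way such outputs can be merged into a semantic label map in mask classification frameworks. It is not taken from the paper: the function name `semantic_from_masks`, the tensor shapes, and the trailing "no object" class are assumptions made for this example.

```python
import numpy as np


def semantic_from_masks(class_logits, mask_logits):
    """Combine per-query class scores and binary masks into a label map.

    Assumed shapes (hypothetical, for illustration):
      class_logits: (N, K + 1), N queries, K classes plus one "no object" slot
      mask_logits:  (N, H, W), one binary mask logit map per query
    Returns an (H, W) array of class indices in [0, K).
    """
    # Softmax over the class dimension (shifted for numerical stability).
    shifted = class_logits - class_logits.max(axis=1, keepdims=True)
    class_probs = np.exp(shifted)
    class_probs /= class_probs.sum(axis=1, keepdims=True)

    # Sigmoid turns each mask logit into a per-pixel membership probability.
    mask_probs = 1.0 / (1.0 + np.exp(-mask_logits))

    # Drop the "no object" column, then sum over queries:
    # score[k, h, w] = sum_n class_probs[n, k] * mask_probs[n, h, w]
    scores = np.einsum("nk,nhw->khw", class_probs[:, :-1], mask_probs)

    # Each pixel takes the class with the highest accumulated score.
    return scores.argmax(axis=0)


if __name__ == "__main__":
    # Toy input: two queries, two classes, a 2x4 image.
    class_logits = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
    mask_logits = np.full((2, 2, 4), -10.0)
    mask_logits[0, :, :2] = 10.0  # query 0 covers the left half
    mask_logits[1, :, 2:] = 10.0  # query 1 covers the right half
    print(semantic_from_masks(class_logits, mask_logits))
```

Because the same pair of outputs (masks and class probabilities) can also be grouped per query instead of summed per class, a head of this kind can in principle serve both semantic and panoptic segmentation, which is the multi-task property the summary emphasizes.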