SOTR: Segmenting Objects with Transformers
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Most recent transformer-based models show impressive performance on
vision tasks, even better than Convolutional Neural Networks (CNNs). In this
work, we present a novel, flexible, and effective transformer-based model for
high-quality instance segmentation. The proposed method, Segmenting Objects
with TRansformers (SOTR), simplifies the segmentation pipeline, building on an
alternative CNN backbone appended with two parallel subtasks: (1) predicting
per-instance categories via the transformer and (2) dynamically generating
segmentation masks with a multi-level upsampling module. SOTR can effectively
extract lower-level feature representations and capture long-range context
dependencies via the Feature Pyramid Network (FPN) and the twin transformer,
respectively. Meanwhile, compared with the original transformer, the proposed
twin transformer is time- and resource-efficient since only row and column
attention are involved in encoding pixels. Moreover, SOTR can easily be
combined with various CNN backbones and transformer model variants, yielding
considerable improvements in segmentation accuracy and training convergence.
Extensive experiments show that SOTR performs well on the MS COCO dataset and
surpasses state-of-the-art instance segmentation approaches. We hope our simple
but strong framework can serve as a preferred baseline for instance-level
recognition. Our code is available at https://github.com/easton-cau/SOTR. |
DOI: | 10.48550/arxiv.2108.06747 |
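
The efficiency claim in the abstract rests on the twin transformer replacing full self-attention over all H×W pixels with separate attention along rows and columns. Below is a minimal sketch of that row/column attention idea, not the authors' implementation: the `TwinAttention` name, the module sizes, and the use of PyTorch's `nn.MultiheadAttention` are assumptions chosen for illustration.

```python
# Sketch of row/column ("twin") attention over a 2D feature map.
# Assumes PyTorch >= 1.9 for the batch_first option of nn.MultiheadAttention.
import torch
import torch.nn as nn


class TwinAttention(nn.Module):
    """Row attention followed by column attention over a (B, C, H, W) map."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W), e.g. one FPN level.
        b, c, h, w = x.shape

        # Row attention: every row becomes a sequence of W tokens of dim C.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c).permute(0, 3, 1, 2)

        # Column attention: every column becomes a sequence of H tokens.
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)


if __name__ == "__main__":
    attn = TwinAttention(dim=256)
    feat = torch.randn(2, 256, 32, 32)   # toy stand-in for an FPN feature map
    print(attn(feat).shape)              # -> torch.Size([2, 256, 32, 32])
```

Attending within each row and each column costs roughly O(H·W·(H+W)) instead of the O((H·W)²) of full self-attention over the flattened feature map, which is where the claimed time and memory savings come from.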