SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection

Vision transformers are known to be more computationally and data-intensive than CNN models. Transformer models such as ViT require all the input image tokens to learn the relationships among them. However, many of these tokens are not informative and may contain irrelevant information such as...

Bibliographic details
Main authors: Ataiefard, Foozhan; Ahmed, Walid; Hajimolahoseini, Habib; Asani, Saina; Javadi, Farnoosh; Hassanpour, Mohammad; Awad, Omar Mohamed; Wen, Austin; Liu, Kangling; Liu, Yang
Format: Article
Language: English