Research on the transformer model in computer vision

Bibliographic Details
Main authors:	Li, Kaipeng; Song, Yizhen
Format:	Conference proceeding
Language:	English
Subjects:	Attention; Computer vision; Task complexity; Visual fields
Online access:	Full text

Description
Summary:	The visual Transformer has become widely used in computer vision as researchers have refined the original Transformer model. This paper studies the Transformer model in computer vision, summarises the main visual Transformer models, and introduces their applications in the field. It first introduces the idea and structure of the Transformer model, including the self-attention mechanism and multi-head attention. It then reviews research progress on the Transformer model, including the Vision Transformer, the Swin Transformer, and other improved models, selecting representative models and their applications for introduction, which lays a foundation for subsequent research. As the field of computer vision continues to evolve, the Transformer model needs to be developed further to handle different types of data and tasks and to meet the needs of increasingly complex applications.
ISSN:	0094-243X, 1551-7616
DOI:	10.1063/5.0222908