Transformer for object detection: Review and benchmark
Published in: | Engineering applications of artificial intelligence 2023-11, Vol.126, p.107021, Article 107021 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Object detection is a crucial task in computer vision (CV). With the rapid advancement of Transformer-based models in natural language processing (NLP) and various visual tasks, Transformer structures are becoming increasingly prevalent in CV tasks. In recent years, numerous Transformer-based object detectors have been proposed, achieving performance comparable to mainstream convolutional neural network-based (CNN-based) approaches. To provide researchers with a comprehensive understanding of the development, advantages, disadvantages, and future potential of Transformer-based object detectors in Artificial Intelligence (AI), this paper systematically reviews the mainstream methods and analyzes the limitations and challenges encountered in their current applications, while also offering insights into future research directions. We have reviewed a large number of papers, selected the most prominent Transformer detection methods, and divided them into Transformer Neck and Transformer Backbone categories for introduction and comparative analysis. Furthermore, we have constructed a benchmark using the COCO2017 dataset to evaluate different object detection algorithms. Finally, we summarize the challenges and prospects in this field. |
---|---|
ISSN: | 0952-1976 |
DOI: | 10.1016/j.engappai.2023.107021 |