Self-Prompting Tracking: A Fast and Efficient Tracking Pipeline for UAV Videos

Bibliographic Details
Published in: Remote Sensing (Basel, Switzerland), 2024-03, Vol.16 (5), p.748
Main Authors: Wang, Zhixing; Zhou, Gaofan; Yao, Jinzhen; Zhang, Jianlin; Bao, Qiliang; Hu, Qintao
Format: Article
Language: English
Description
Summary: In the realm of visual tracking, remote sensing videos captured by Unmanned Aerial Vehicles (UAVs) have seen significant advancements and found wide application. However, conventional Transformer-based trackers still struggle to balance tracking accuracy and inference speed. This problem is further exacerbated when Transformers are implemented at larger model scales. To address this challenge, we present a fast and efficient UAV tracking framework, denoted as SiamPT, which aims to reduce the number of Transformer layers without losing the discriminative ability of the model. To this end, we transfer the conventional prompting theories of multi-modal tracking to UAV tracking and propose a novel self-prompting method that uses the target's inherent characteristics in the search branch to discriminate targets from the background. Specifically, a self-distribution strategy is introduced to capture feature-level relationships; it segments tokens into distinct smaller patches. Subsequently, salient tokens within the full attention map are identified as foreground targets, enabling the fusion of local region information. These fused tokens serve as prompters to enhance the identification of distractors, thereby avoiding the need for model expansion. SiamPT achieves success and precision rates of 0.694 and 0.890, respectively, on the UAV123 benchmark, while maintaining an inference speed of 91.0 FPS.
ISSN: 2072-4292
DOI:10.3390/rs16050748
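
Note on the reported metrics: the summary quotes success and precision rates without defining them. The sketch below shows how these two numbers are commonly computed in one-pass tracking evaluation. It is an illustration under stated assumptions, not the authors' evaluation code: boxes are assumed to be (x, y, w, h) tuples, precision is assumed to use the customary 20-pixel center-error threshold, success is assumed to be the mean of the success curve over 21 evenly spaced overlap thresholds, and all function names are hypothetical.

```python
from math import hypot


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    return hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                 (a[1] + a[3] / 2) - (b[1] + b[3] / 2))


def precision_rate(pred, gt, pixel_threshold=20.0):
    """Fraction of frames whose center error is within the threshold.

    The 20-pixel default is an assumption based on common practice.
    """
    errors = [center_error(p, g) for p, g in zip(pred, gt)]
    return sum(e <= pixel_threshold for e in errors) / len(errors)


def success_rate(pred, gt, steps=21):
    """Mean of the success curve over evenly spaced IoU thresholds in [0, 1].

    Each curve point is the fraction of frames whose overlap exceeds
    the threshold; 21 thresholds is an assumed, commonly used setting.
    """
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(curve) / len(curve)


if __name__ == "__main__":
    # Toy sequence: the first frame is tracked perfectly, the second is lost.
    ground_truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
    predictions = [(0, 0, 10, 10), (100, 100, 10, 10)]
    print("precision:", precision_rate(predictions, ground_truth))
    print("success:", round(success_rate(predictions, ground_truth), 3))
```

Under these assumptions, a precision rate of 0.890 would mean that the predicted box center lies within 20 pixels of the ground-truth center in 89% of evaluated frames.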