A Benchmark for UAV-View Natural Language-Guided Tracking

We propose a new benchmark, UAVNLT (Unmanned Aerial Vehicle Natural Language Tracking), for the UAV-view natural language-guided tracking task. UAVNLT consists of videos taken from UAV cameras from four cities for vehicles on city roads. For each video, vehicles’ bounding boxes, trajectories, and na...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Electronics (Basel) 2024-05, Vol.13 (9), p.1706
Hauptverfasser: Li, Hengyou, Liu, Xinyan, Li, Guorong
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We propose a new benchmark, UAVNLT (Unmanned Aerial Vehicle Natural Language Tracking), for the UAV-view natural language-guided tracking task. UAVNLT consists of videos taken from UAV cameras from four cities for vehicles on city roads. For each video, vehicles’ bounding boxes, trajectories, and natural language are carefully annotated. Compared to the existing data sets, which are only annotated with bounding boxes, the natural language sentences in our data set can be more suitable for many application fields where humans take part in the system for that language, being not only more friendly for human–computer interaction but also capable of overcoming the appearance features’ low uniqueness for tracking. We tested several existing methods on our new benchmarks and found that the performance of the existing methods was not satisfactory. To pave the way for future work, we propose a baseline method suitable for this task, achieving state-of-the-art performance. We believe our new data set and proposed baseline method will be helpful in many fields, such as smart city, smart transportation, vehicle management, etc.
ISSN:2079-9292
2079-9292
DOI:10.3390/electronics13091706