From Two-Stream to One-Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Summary: | Due to the complementary nature of visible light and thermal infrared
modalities, object tracking based on the fusion of visible light images and
thermal images (referred to as RGB-T tracking) has received increasing
attention from researchers in recent years. How to achieve more comprehensive
fusion of information from the two modalities at a lower cost has been an issue
that researchers have been exploring. Inspired by visual prompt learning, we
designed a novel two-stream RGB-T tracking architecture based on cross-modal
mutual prompt learning, and used this model as a teacher to guide a one-stream
student model for rapid learning through knowledge distillation techniques.
Extensive experiments have shown that, compared to similar RGB-T trackers, our
designed teacher model achieved the highest precision rate, while the student
model, with comparable precision rate to the teacher model, realized an
inference speed more than three times faster than the teacher model. (Code will
be available if accepted.) |
---|---|
DOI: | 10.48550/arxiv.2403.16834 |