Dual Siamese network for RGBT tracking via fusing predicted position maps
Published in: The Visual Computer, 2022-07, Vol. 38 (7), pp. 2555-2567
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Visual object tracking is a fundamental task in computer vision. Despite its rapid development, tracking with visible-light images alone is unreliable in some conditions. Because visible-light and thermal-infrared images have complementary imaging properties, using them as a joint input for tracking has drawn increasing attention; this is known as RGBT tracking. Existing RGBT trackers can be divided into image-level fusion, feature-level fusion, and response-level fusion trackers. Compared with the first two, response-level fusion can exploit deeper dual-modal image information, but most existing response-level methods rely on traditional tracking frameworks and introduce weights at inappropriate stages. We therefore propose a response-level fusion tracking algorithm based on deep learning, in which the weight distribution is moved to the feature extraction stage; for this purpose we design a joint modal channel attention module. We adopt the Siamese framework and extend it into a dual Siamese subnetwork, improve the region proposal subnetwork, and propose a strategy for fusing the predicted position maps of the two modalities. To verify the performance of our algorithm, we conducted experiments on two tracking benchmarks. Our algorithm achieves very good performance and runs at 116 frames per second, far exceeding the real-time requirement of 25 frames per second.
ISSN: 0178-2789, 1432-2315
DOI: 10.1007/s00371-021-02131-4
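To make the abstract's architecture concrete, below is a minimal sketch, assuming PyTorch, of a dual Siamese tracker with a joint modal channel attention module and response-level fusion of the two modalities' predicted position maps. The module names, the tiny stand-in backbone, and the fixed weighted-sum fusion are illustrative assumptions for exposition; they are not the authors' exact design, whose backbone and fusion strategy are not specified in this record.

```python
# Illustrative sketch of response-level fusion for RGBT tracking (assumed PyTorch design,
# not the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


def backbone():
    # Tiny stand-in feature extractor; the paper's actual backbone is not given here.
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    )


class JointChannelAttention(nn.Module):
    """Jointly reweights RGB and thermal feature channels (squeeze-and-excitation style)."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(2 * channels, channels // 2), nn.ReLU(),
            nn.Linear(channels // 2, 2 * channels), nn.Sigmoid(),
        )

    def forward(self, f_rgb, f_tir):
        b, c, _, _ = f_rgb.shape
        pooled = torch.cat([f_rgb.mean(dim=(2, 3)), f_tir.mean(dim=(2, 3))], dim=1)
        w = self.fc(pooled).view(b, 2 * c, 1, 1)
        return f_rgb * w[:, :c], f_tir * w[:, c:]


def xcorr(template_feat, search_feat):
    """Cross-correlate template features over search features, one sample at a time."""
    maps = [F.conv2d(s.unsqueeze(0), t.unsqueeze(0))  # template acts as the kernel
            for t, s in zip(template_feat, search_feat)]
    return torch.cat(maps, dim=0)


class DualSiameseTracker(nn.Module):
    def __init__(self, channels=64, fuse_weight=0.5):
        super().__init__()
        self.backbone_rgb = backbone()
        self.backbone_tir = backbone()
        self.attention = JointChannelAttention(channels)
        self.fuse_weight = fuse_weight  # relative trust in the RGB response map

    def forward(self, z_rgb, z_tir, x_rgb, x_tir):
        # Two Siamese branches, one per modality; template (z) and search (x) share weights.
        zf_rgb, xf_rgb = self.backbone_rgb(z_rgb), self.backbone_rgb(x_rgb)
        zf_tir, xf_tir = self.backbone_tir(z_tir), self.backbone_tir(x_tir)
        zf_rgb, zf_tir = self.attention(zf_rgb, zf_tir)
        xf_rgb, xf_tir = self.attention(xf_rgb, xf_tir)
        # Per-modality predicted position (response) maps.
        p_rgb = xcorr(zf_rgb, xf_rgb)
        p_tir = xcorr(zf_tir, xf_tir)
        # Response-level fusion: a fixed weighted sum, used here only for simplicity.
        return self.fuse_weight * p_rgb + (1 - self.fuse_weight) * p_tir
```

With a 127x127 template and a 255x255 search region per modality, the forward pass of this sketch returns a single fused 33x33 response map whose peak marks the predicted target position; the paper's actual fusion strategy for the two predicted position maps may differ from the fixed weighted sum assumed above.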