Asymmetric Global-Local Mutual Integration Network for RGBT Tracking

Bibliographic details
Published in: IEEE Transactions on Instrumentation and Measurement, 2022, Vol. 71, p. 1-17
Main authors: Mei, Jiatian; Liu, Yanyu; Wang, Changcheng; Zhou, Dongming; Nie, Rencan; Cao, Jinde
Format: Article
Language: English
Description
Abstract: RGB and thermal infrared (RGBT) tracking as a solution in complex environments has gradually become a research hotspot. The powerful complementarity between RGB and thermal infrared data enables trackers to work 24/7. Existing works usually adopt a symmetric network structure that deploys the identical strategy to mine modalities with different properties, ignoring the heterogeneity among modalities. In this article, we propose a novel asymmetric global-local mutual integration network that comprehensively considers asymmetric structure, heterogeneity-based global association, and interframe communication. It consists of an asymmetric mode-distinguishing parallel structure (AMPS), cross-modal global-local interaction, and an interframe monitoring strategy (IMS). Specifically, the AMPS performs discriminative mining on the information of the two modalities by combining the discount module and the branch cement module, and extracts multiscale cues through the multiscale auxiliary module to handle the challenges of scale variation and small-size objects. Then, the global mining module is deployed in the cross-modal global-local interaction section to jointly perform intramodal and intermodal global correlation while acting as the global complement to local feature extraction. Finally, the IMS employs a fast optical flow algorithm to detect interframe displacement, helping the network handle camera motion and fast object motion. Extensive experiments on the GTOT, RGBT234, and LasHeR datasets verify the effectiveness of the proposed network, and further ablation experiments confirm the efficacy of the asymmetric structure and its components.
ISSN: 0018-9456, 1557-9662
DOI: 10.1109/TIM.2022.3193971
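
Note on the abstract: the record does not say which fast optical flow algorithm the IMS uses or how the detected displacement is fed back to the tracker. As a rough illustration of what "detecting interframe displacement" can mean, the sketch below estimates a single global shift between two grayscale frames by exhaustive block matching. This is a deliberately simpler stand-in technique, not optical flow and not the authors' method; the names `make_frame` and `estimate_shift` and the `max_shift` parameter are hypothetical.

```python
def make_frame(height, width, y, x, value=255):
    """Build a toy grayscale frame: all zeros except one bright pixel at (y, x)."""
    frame = [[0] * width for _ in range(height)]
    frame[y][x] = value
    return frame


def estimate_shift(prev, curr, max_shift=2):
    """Estimate a global (dy, dx) so that curr[y + dy][x + dx] matches prev[y][x].

    Assumption: brute-force block matching over a small search window,
    scored by mean absolute difference on the overlapping region.
    """
    h, w = len(prev), len(prev[0])
    # Try small shifts first so that ties resolve to the smallest motion.
    candidates = sorted(
        ((dy, dx)
         for dy in range(-max_shift, max_shift + 1)
         for dx in range(-max_shift, max_shift + 1)),
        key=lambda s: abs(s[0]) + abs(s[1]),
    )
    best, best_cost = (0, 0), float("inf")
    for dy, dx in candidates:
        total, count = 0, 0
        # Only visit pixels whose shifted position stays inside the frame.
        for y in range(max(0, -dy), min(h, h - dy)):
            for x in range(max(0, -dx), min(w, w - dx)):
                total += abs(curr[y + dy][x + dx] - prev[y][x])
                count += 1
        if count == 0:
            continue
        cost = total / count
        if cost < best_cost:
            best, best_cost = (dy, dx), cost
    return best


prev = make_frame(6, 6, 2, 2)
curr = make_frame(6, 6, 3, 4)  # the bright pixel moved down 1 row, right 2 columns
shift = estimate_shift(prev, curr)
```

A tracker could, for example, use a large estimated shift as a cue to widen its search region in the next frame; whether the IMS works this way is not stated in the abstract.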