Multiscale Modality-Similar Learning Guided Weakly Supervised RGB-T Crowd Counting
Published in: | IEEE Sensors Journal, 2024-09, Vol. 24 (18), pp. 29121-29134 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | With the development of sensor technology and its many applications in intelligent surveillance systems, RGB-thermal (RGB-T) cross-modal crowd counting, which uses data from different sensors as its source, has received extensive attention from academia and industry. In terms of feature extraction, existing cross-modal methods mainly adopt multiple parallel large convolution kernels to handle the pronounced variation in crowd scale, which results in a large number of parameters. In terms of supervision, existing cross-modal crowd-counting methods adopt a fully supervised framework, which requires time-consuming and laborious pixel-level supervision. To address these issues, this article proposes a multiscale modality-similar guided weakly supervised cross-modal crowd-counting method, comprising a multiscale context-level feature fusion (MCFF) module and a modality-similar weakly supervised framework. Specifically, the proposed multiscale module equivalently decouples the square convolution into different directions to address feature redundancy and parameter growth. The proposed weakly supervised framework exploits the similarity of cross-modal crowd semantic features to bootstrap the model using only image-level supervision. Experimental results on two public RGB-T benchmarks, one RGB-D benchmark, and collected real-world data show that the proposed weakly supervised method achieves counting accuracy competitive with representative fully supervised methods. Extensive ablation studies validate the positive contribution of the core modules to the final counting performance. |
---|---|
ISSN: | 1530-437X, 1558-1748 |
DOI: | 10.1109/JSEN.2024.3436859 |
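
The abstract attributes the parameter savings to decoupling a square convolution into different directions, but this record does not give the kernel sizes or the exact decomposition. As a rough illustration only, the sketch below assumes the common factorization of a k x k kernel into a 1 x k kernel followed by a k x 1 kernel, and compares weight counts for a single branch. The function names and the channel width of 64 are hypothetical and are not taken from the article.

```python
def conv_params(in_ch, out_ch, kh, kw):
    """Number of weights in a 2D convolution with a kh x kw kernel (no bias)."""
    return in_ch * out_ch * kh * kw


def square_branch(channels, k):
    """One k x k convolution that keeps the channel count unchanged."""
    return conv_params(channels, channels, k, k)


def decoupled_branch(channels, k):
    """Assumed decomposition: a 1 x k convolution followed by a k x 1 convolution."""
    return conv_params(channels, channels, 1, k) + conv_params(channels, channels, k, 1)


if __name__ == "__main__":
    channels = 64  # hypothetical feature width, chosen only for illustration
    for k in (3, 5, 7):
        sq = square_branch(channels, k)
        dc = decoupled_branch(channels, k)
        print(f"k={k}: square={sq}, decoupled={dc}, ratio={sq / dc:.2f}")
```

Under this assumption the square kernel costs k/2 times as many weights as the decoupled pair, so the saving grows with kernel size, which is consistent with the abstract's concern about large parallel kernels.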