Contrastive Attention for Video Anomaly Detection


Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2022, Vol. 24, pp. 4067-4076
Main Authors: Chang, Shuning; Li, Yanchao; Shen, Shengmei; Feng, Jiashi; Zhou, Zhiying
Format: Article
Language: English
Description
Abstract: We consider weakly-supervised video anomaly detection in this work. This task aims to localize video frames containing anomalous events given only binary video-level annotation, i.e., anomaly vs. normal. Traditional approaches usually formulate it as a multiple instance learning problem, which ignores the intrinsic data imbalance issue that positive samples are very scarce compared to negative ones. In this paper, we focus on addressing this issue to further boost detection performance. We develop a new lightweight anomaly detection model that fully exploits the abundant normal videos to train a classifier with good discriminative ability for normal videos, and we employ it to improve the selectivity for anomalous segments and to filter out normal segments. Specifically, in addition to boosting anomalous predictions, a novel contrastive attention module produces a converted normal feature from an anomalous video to refine anomalous predictions by maximizing the probability that the classifier makes a mistake. Moreover, to remove the stubborn normal segments selected by the attention module, we also design an attention consistency loss that uses the classifier's high-confidence predictions on normal features to guide the attention module. Extensive experiments on three large-scale datasets, UCF-Crime, ShanghaiTech and XD-Violence, clearly demonstrate that our model largely improves frame-level AUC over the state-of-the-art. Code is released at https://github.com/changsn/Contrastive-Attention-for-Video-Anomaly-Detection .
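The core idea in the abstract — attention weights that split an anomalous video's segment features into an "anomalous" pooled feature and a complementary "converted normal" feature — can be illustrated with a minimal sketch. This is a hypothetical NumPy toy, not the authors' released implementation: the function name `contrastive_attention`, the linear attention scorer `w_att`, and the `(1 - att)` complementary pooling are all illustrative assumptions; see the linked repository for the actual model.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def contrastive_attention(features, w_att):
    """Toy sketch of the contrastive-attention idea (assumed form, not
    the paper's exact module). features: (T, D) segment features of one
    anomalous video; w_att: (D,) parameters of a linear attention scorer."""
    scores = features @ w_att                 # per-segment attention logits, (T,)
    att = softmax(scores)                     # attention over the T segments
    anomalous_feat = att @ features           # attention-weighted pooled feature, (D,)
    # complementary weights pool the segments the attention suppressed,
    # giving a "converted normal" feature from the same anomalous video
    comp = (1.0 - att) / (1.0 - att).sum()
    normal_feat = comp @ features
    return att, anomalous_feat, normal_feat

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))               # 8 segments, 4-D features
att, a_feat, n_feat = contrastive_attention(feats, rng.normal(size=4))
print(att.shape, a_feat.shape, n_feat.shape)  # (8,) (4,) (4,)
```

In training, the converted normal feature would be fed to the normal-video classifier and the attention updated to maximize the classifier's error on it, while the attention consistency loss pushes the complementary pooling toward segments the classifier confidently labels normal.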
ISSN:1520-9210
1941-0077
DOI:10.1109/TMM.2021.3112814