Large-Scale Weakly Supervised Object Localization via Latent Category Learning

Localizing objects in cluttered backgrounds is challenging under large-scale weakly supervised conditions. Due to the cluttered image condition, objects usually have large ambiguity with backgrounds. Besides, there is also a lack of effective algorithm for large-scale weakly supervised localization...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on image processing 2015-04, Vol.24 (4), p.1371-1385
Hauptverfasser:	Chong Wang, Kaiqi Huang, Weiqiang Ren, Junge Zhang, Maybank, Steve
Format:	Artikel
Sprache:	eng
Schlagworte:	analysis Feature extraction Histograms large-scale latent semantic object localization Proposals Search problems Semantics Training Visualization weakly supervised learning
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Localizing objects in cluttered backgrounds is challenging under large-scale weakly supervised conditions. Due to the cluttered image condition, objects usually have large ambiguity with backgrounds. Besides, there is also a lack of effective algorithm for large-scale weakly supervised localization in cluttered backgrounds. However, backgrounds contain useful latent information, e.g., the sky in the aeroplane class. If this latent information can be learned, object-background ambiguity can be largely reduced and background can be suppressed effectively. In this paper, we propose the latent category learning (LCL) in large-scale cluttered conditions. LCL is an unsupervised learning method which requires only image-level class labels. First, we use the latent semantic analysis with semantic object representation to learn the latent categories, which represent objects, object parts or backgrounds. Second, to determine which category contains the target object, we propose a category selection strategy by evaluating each category's discrimination. Finally, we propose the online LCL for use in large-scale conditions. Evaluation on the challenging PASCAL Visual Object Class (VOC) 2007 and the large-scale imagenet large-scale visual recognition challenge 2013 detection data sets shows that the method can improve the annotation precision by 10% over previous methods. More importantly, we achieve the detection precision which outperforms previous results by a large margin and can be competitive to the supervised deformable part model 5.0 baseline on both data sets.
ISSN:	1057-7149 1941-0042
DOI:	10.1109/TIP.2015.2396361