Learning descriptive visual representation for image classification and annotation
Published in: Pattern Recognition, February 2015, Vol. 48, No. 2, pp. 498-508
Format: Article
Language: English
Online access: Full text
Abstract: This paper presents a novel semantic regularized matrix factorization method for learning a descriptive visual bag-of-words (BOW) representation. Although very influential in image classification, the traditional visual BOW representation has one distinct drawback: for efficiency purposes, this representation is often generated by directly clustering the low-level visual feature vectors extracted from local keypoints or regions, without considering the high-level semantics of images. In other words, it still suffers from the semantic gap and may lead to significant performance degradation in more challenging tasks, e.g., image classification over social collections with large intra-class variations. To learn a descriptive visual BOW representation for such an image classification task, we develop a semantic regularized matrix factorization method by adding a Laplacian regularization term, defined with the easily accessible tags of social images, to matrix factorization. Moreover, given that image annotation only provides the tags of training images in advance (while the tags of all social images are available), we can readily apply the proposed method to image annotation by first running a round of image annotation to predict the (possibly incorrect) tags of test images, thus obtaining tags for all images. Experimental results show the promising performance of the proposed method.
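The core idea described in the abstract, factorizing low-level visual features while a tag-based graph Laplacian pulls semantically related images toward similar codes, can be sketched with generic graph-regularized NMF multiplicative updates. This is an illustrative sketch, not the paper's exact algorithm: the function name, the update rules (in the spirit of standard graph-regularized NMF), and all parameter choices are assumptions.

```python
import numpy as np

def graph_regularized_nmf(X, W, k, lam=0.1, n_iter=200, seed=0):
    """Sketch: minimize ||X - U V||_F^2 + lam * Tr(V L V^T), with L = D - W.

    X : (d, n) nonnegative matrix of low-level visual features (one image per column)
    W : (n, n) nonnegative affinity matrix built from image tags
         (e.g., tag-overlap similarity between images) -- an assumption here
    k : number of learned "visual words" (factorization rank)
    Returns U (d, k) basis and V (k, n) codes, both kept nonnegative
    via multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    U = rng.random((d, k)) + 1e-3
    V = rng.random((k, n)) + 1e-3
    D = np.diag(W.sum(axis=1))  # degree matrix of the tag graph
    eps = 1e-9                  # avoid division by zero
    for _ in range(n_iter):
        # Standard NMF update for the basis U
        U *= (X @ V.T) / (U @ V @ V.T + eps)
        # Code update with the Laplacian term: V W rewards smoothness
        # over tag-similar images, V D penalizes it (L = D - W)
        V *= (U.T @ X + lam * V @ W) / (U.T @ U @ V + lam * V @ D + eps)
    return U, V
```

With `lam = 0`, this reduces to plain NMF (clustering low-level features alone); increasing `lam` forces images with overlapping tags to share similar codes `V[:, i]`, which is the sense in which the learned representation becomes "semantic".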
Highlights:
- We propose a novel method for learning descriptive visual representation.
- Our method leads to promising results in both image classification and annotation.
- Our method can readily be extended to other challenging tasks.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2014.08.008