Weakly-supervised hashing in kernel space

The explosive growth of the vision data motivates the recent studies on efficient data indexing methods such as locality-sensitive hashing (LSH). Most existing approaches perform hashing in an unsupervised way. In this paper we move one step forward and propose a supervised hashing method, i.e., the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Yadong Mu, Jialie Shen, Shuicheng Yan
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The explosive growth of the vision data motivates the recent studies on efficient data indexing methods such as locality-sensitive hashing (LSH). Most existing approaches perform hashing in an unsupervised way. In this paper we move one step forward and propose a supervised hashing method, i.e., the LAbel-regularized Max-margin Partition (LAMP) algorithm. The proposed method generates hash functions in weakly-supervised setting, where a small portion of sample pairs are manually labeled to be "similar" or "dissimilar". We formulate the task as a Constrained Convex-Concave Procedure (CCCP), which can be relaxed into a series of convex sub-problems solvable with efficient Quadratic-Program (QP). The proposed hashing method possesses other characteristics including: 1) most existing LSH approaches rely on linear feature representation. Unfortunately, kernel tricks are often more natural to gauge the similarity between visual objects in vision research, which corresponds to probably infinite-dimensional Hilbert spaces. The proposed LAMP has a natural support for kernel-based feature representation. 2) traditional hashing methods assume uniform data distributions. Typically, the collision probability of two samples in hash buckets is only determined by pairwise similarity, unrelated to contextual data distribution. In contrast, we provide such a collision bound which is beyond pairwise data interaction based on Markov random fields theory. Extensive empirical evaluations are conducted on five widely-used benchmarks. It takes only several seconds to generate a new hashing function, and the adopted random supporting-vector scheme enables the LAMP algorithm scalable to large-scale problems. Experimental results well validate the superiorities of the LAMP algorithm over the state-of-the-art kernel-based hashing methods.
ISSN:1063-6919
DOI:10.1109/CVPR.2010.5540024