Generalized Semantic Preserving Hashing for Cross-Modal Retrieval

Bibliographic details
Published in:	IEEE Transactions on Image Processing, 2019-01, Vol. 28 (1), pp. 102-112
Main authors:	Mandal, Devraj; Chaudhury, Kunal N.; Biswas, Soma
Format:	Article
Language:	English
Subjects:	Cross-modal retrieval; hashing; Information services; Iterative solution; Kernel; kernel logistic regression; Logistics; multi-label data; Multimedia; Optimization; Retrieval; Semantics; Task analysis; Training; Training data; unpaired matching scenarios

Description
Summary:	Cross-modal retrieval is gaining importance due to the availability of large amounts of multimedia data. Hashing-based techniques provide an attractive solution to this problem when the data size is large. For cross-modal retrieval, data from the two modalities may be associated with a single label or with multiple labels and, in addition, may or may not have a one-to-one correspondence. This work proposes a simple hashing framework that can handle these different scenarios while effectively capturing the semantic relationship between the data items. The method proceeds in two stages. The first stage learns the optimum hash codes by factorizing an affinity matrix constructed from the label information. In the second stage, ridge regression and kernel logistic regression are used to learn the hash functions that map the input data to the bit domain. We also propose a novel iterative solution for cases where the training data is very large, or where the whole training data is not available at once. Extensive experiments on single-label datasets such as Wiki and multi-label datasets such as MirFlickr, NUS-WIDE, Pascal, and LabelMe, together with comparisons against the state of the art, show the usefulness of the proposed approach.
ISSN:	1057-7149 1941-0042
DOI:	10.1109/TIP.2018.2863040
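
Illustrative sketch: the two-stage procedure described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The record does not say how the affinity matrix is built or factorized, so cosine similarity between label vectors and a truncated SVD followed by a sign step are stand-ins chosen here. Only the ridge regression option of the second stage is shown, and all function names and the parameter `lam` are hypothetical.

```python
import numpy as np


def label_affinity(labels_a, labels_b):
    """Assumed affinity: cosine similarity between multi-hot label vectors.

    labels_a has shape (n_a, n_classes), labels_b has shape (n_b, n_classes).
    The two sets need not be paired, so n_a and n_b may differ.
    """
    norm_a = np.maximum(np.linalg.norm(labels_a, axis=1, keepdims=True), 1e-12)
    norm_b = np.maximum(np.linalg.norm(labels_b, axis=1, keepdims=True), 1e-12)
    return (labels_a / norm_a) @ (labels_b / norm_b).T


def to_bits(values):
    """Map real values to {-1, +1}, sending exact zeros to +1."""
    bits = np.sign(values)
    bits[bits == 0] = 1
    return bits


def codes_from_affinity(affinity, n_bits):
    """Stage 1 stand-in: truncated SVD of the affinity matrix, then binarize.

    n_bits must not exceed min(affinity.shape).
    """
    u, _, vt = np.linalg.svd(affinity, full_matrices=False)
    return to_bits(u[:, :n_bits]), to_bits(vt[:n_bits].T)


def fit_ridge_hash(features, codes, lam=1.0):
    """Stage 2 (ridge option): closed-form W = (X^T X + lam I)^{-1} X^T B."""
    dim = features.shape[1]
    gram = features.T @ features + lam * np.eye(dim)
    return np.linalg.solve(gram, features.T @ codes)


def hash_codes(features, weights):
    """Apply a learned linear hash function to new feature vectors."""
    return to_bits(features @ weights)
```

In this sketch each modality would get its own weight matrix, fitted on that modality's features and its stage-one codes, so that queries from one modality can be compared with database items from the other by Hamming distance. The rank of a label-based affinity matrix is bounded by the number of classes, so bits beyond that rank carry little information in this simplified version.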