Similarity Graph-correlation Reconstruction Network for unsupervised cross-modal hashing
Saved in:
Published in: | Expert systems with applications 2024-03, Vol. 237, p. 121516, Article 121516 |
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Existing cross-modal hash retrieval methods can simultaneously enhance retrieval speed and reduce storage space. However, these methods face a major challenge in determining the similarity metric between two modalities. Specifically, the accuracy of intra-modal and inter-modal similarity measurements is inadequate, and the large gap between modalities leads to semantic bias. In this paper, we propose a Similarity Graph-correlation Reconstruction Network (SGRN) for unsupervised cross-modal hashing. In particular, a local relation graph rebasing module filters out graph nodes with weak similarity and associates graph nodes with strong similarity, yielding fine-grained intra-modal similarity relation graphs. A global relation graph reconstruction module further strengthens cross-modal correlation and implements fine-grained similarity alignment between modalities. In addition, to bridge the modal gap, we combine the similarity representations of real-valued and hash features to design intra-modal and inter-modal training strategies. We conducted extensive experiments with SGRN on two cross-modal retrieval datasets; the results validate the superiority of the proposed method, which significantly improves retrieval performance. |
• We construct relation graphs for the image modality and the text modality separately.
• We rebase the intra-modal relation graphs through similarity correlation.
• We combine the rebased graphs of the two modalities to obtain joint relation graphs.
• We reconstruct the joint relation graphs to obtain fine-grained similarity alignment.
• We design a combined intra-modal and inter-modal training strategy. |
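The record does not include the paper's exact formulation, but the pipeline the highlights describe (per-modality relation graphs, rebasing by filtering weak and reinforcing strong similarities, then fusion into a joint graph) can be sketched as follows. The function names, thresholds (`weak_thresh`, `strong_thresh`), and the weighted-average fusion are illustrative assumptions, not the authors' actual method:

```python
import numpy as np

def relation_graph(features: np.ndarray) -> np.ndarray:
    """Cosine-similarity relation graph over a batch of features."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return normed @ normed.T

def rebase_graph(S: np.ndarray, weak_thresh: float = 0.1,
                 strong_thresh: float = 0.8) -> np.ndarray:
    """Illustrative rebasing: drop weakly similar pairs, saturate strong ones."""
    R = S.copy()
    R[R < weak_thresh] = 0.0    # filter out graph nodes with weak similarity
    R[R > strong_thresh] = 1.0  # associate graph nodes with strong similarity
    return R

def joint_graph(S_img: np.ndarray, S_txt: np.ndarray,
                alpha: float = 0.5) -> np.ndarray:
    """Assumed fusion: weighted average of the two intra-modal graphs."""
    return alpha * S_img + (1.0 - alpha) * S_txt

# Toy batch of 4 paired samples with 8-dimensional features per modality.
rng = np.random.default_rng(0)
img_feat = rng.standard_normal((4, 8))
txt_feat = rng.standard_normal((4, 8))

S_joint = joint_graph(rebase_graph(relation_graph(img_feat)),
                      rebase_graph(relation_graph(txt_feat)))
print(S_joint.shape)  # (4, 4)
```

In a real hashing network the fused graph would serve as a similarity target for both the real-valued and binary (hash) feature similarities during training, as the abstract's combined training strategy suggests.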
ISSN: | 0957-4174, 1873-6793 |
DOI: | 10.1016/j.eswa.2023.121516 |