Interactive Re-ranking via Object Entropy-Guided Question Answering for Cross-Modal Image Retrieval

Cross-modal image-retrieval methods retrieve desired images from a query text by learning relationships between texts and images. Such a retrieval approach is one of the most effective ways of achieving the easiness of query preparation. Recent cross-modal image-retrieval methods are convenient and...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:ACM transactions on multimedia computing communications and applications 2022-08, Vol.18 (3), p.1-17
Hauptverfasser: Yanagi, Rintaro, Togo, Ren, Ogawa, Takahiro, Haseyama, Miki
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Cross-modal image-retrieval methods retrieve desired images from a query text by learning relationships between texts and images. Such a retrieval approach is one of the most effective ways of achieving the easiness of query preparation. Recent cross-modal image-retrieval methods are convenient and accurate when users input a query text that can be used to uniquely identify the desired image. However, in reality, users frequently input ambiguous query texts, and these ambiguous queries make it difficult to obtain desired images. To overcome these difficulties, in this study, we propose a novel interactive cross-modal image-retrieval method based on question answering. The proposed method analyzes candidate images and asks users questions to obtain information that can narrow down retrieval candidates. By only answering questions generated by the proposed method, users can reach their desired images, even when using an ambiguous query text. Experimental results show the proposed method’s effectiveness.
ISSN:1551-6857
1551-6865
DOI:10.1145/3485042