Diverse image search with explanations

Bibliographic details
Published in: Multimedia tools and applications 2024-03, Vol.83 (8), p.23067-23082
Main authors: Zhu, Xinying; Liu, Linhu
Format: Article
Language: English
Description
Summary: In this paper, we propose a novel content-based image search framework with explanations, which can not only compare the similarity of images from different perspectives but also describe the commonalities of two images in natural language. Specifically, we develop a graph matching method to calculate the similarity of two images and locate their commonalities, where each graph includes perceptual, conceptual, and relational information. Furthermore, we use a language-model-based method to generate sentences that describe the similarities of two images. When comparing images from different perspectives, we follow the principle that rich structured representations are more important than simple ones. To evaluate this principle, we conduct experiments on the Visual Genome dataset, where each image contains many objects and multiple object relationships. The experimental results demonstrate the effectiveness of the principle. We also evaluate our method on explaining similar images, and the experimental results show that it obtains comparable performance.
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-023-16393-8
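
The summary describes comparing two images through graphs that carry conceptual and relational information, and then describing what they share. The sketch below is only a toy illustration of that idea, not the authors' method: it assumes each image is reduced to a set of object labels and a set of (subject, predicate, object) triples, scores overlap with a weighted Jaccard index in place of graph matching, and uses a fixed sentence template in place of a language model. Perceptual features are left out entirely. The names `SceneGraph`, `compare`, `explain`, and the `relation_weight` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneGraph:
    """Toy stand-in for an image graph (assumption, not the paper's structure)."""
    objects: frozenset    # conceptual information: object labels
    relations: frozenset  # relational information: (subject, predicate, object)


def jaccard(a, b):
    """Jaccard overlap of two sets; defined here as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare(g1, g2, relation_weight=0.7):
    """Return (score, shared objects, shared relations).

    relation_weight is a hypothetical knob that favours the richer relational
    overlap over plain label overlap; the value 0.7 is arbitrary.
    """
    shared_objects = g1.objects & g2.objects
    shared_relations = g1.relations & g2.relations
    score = ((1 - relation_weight) * jaccard(g1.objects, g2.objects)
             + relation_weight * jaccard(g1.relations, g2.relations))
    return score, shared_objects, shared_relations


def explain(shared_relations):
    """Template-based description of shared triples (swapped in for a language model)."""
    if not shared_relations:
        return "No shared relationships found."
    parts = [f"a {s} {p} a {o}" for s, p, o in sorted(shared_relations)]
    return "Both images show " + " and ".join(parts) + "."


image_a = SceneGraph(
    objects=frozenset({"man", "horse", "hat"}),
    relations=frozenset({("man", "riding", "horse"), ("man", "wearing", "hat")}),
)
image_b = SceneGraph(
    objects=frozenset({"man", "horse", "field"}),
    relations=frozenset({("man", "riding", "horse"), ("horse", "in", "field")}),
)

score, common_objects, common_relations = compare(image_a, image_b)
print(round(score, 3))
print(explain(common_relations))
```

Weighting the triple overlap more heavily than the label overlap is one simple way to reflect the stated principle that rich structured representations matter more than simple ones; a real system would match graph nodes and edges rather than compare exact label strings.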