Knowledge Graph Enhancement for Fine-Grained Zero-Shot Learning on ImageNet21K
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-10, Vol. 34 (10), pp. 9090-9101
Format: Article
Language: English
Online access: Order full text
Abstract: Fine-grained Zero-shot Learning (ZSL) on the large-scale dataset ImageNet21K is an important task with promising prospects in many real-world scenarios. One typical solution is to explicitly model the knowledge passing using a Knowledge Graph (KG) to transfer knowledge from seen to unseen instances. By analyzing the hierarchical structure and the word descriptions on ImageNet21K, we find that the noisy semantic information, the sparseness of seen classes, and the lack of supervision of unseen classes make the knowledge passing insufficient, which limits KG-based fine-grained ZSL. To resolve this problem, in this paper we enhance the knowledge passing from three aspects. First, we use more powerful models, such as Large Language Models and Vision-Language Models, to obtain more reliable semantic embeddings. Then we propose a strategy that globally enhances the knowledge graph based on the convex-combination relationships among the semantic embeddings. It effectively adds edges between non-kinship seen and unseen classes that are strongly correlated, while assigning an importance score to each edge. Based on the enhanced knowledge graph, we further present a novel regularizer that locally enhances the knowledge passing during training. We conducted extensive comparative evaluations to demonstrate the advantages of our method over state-of-the-art approaches.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3396215
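
The graph-enhancement step described in the abstract can be read as reconstructing each unseen class's semantic embedding as a (near-)convex combination of seen-class embeddings and using the resulting coefficients as importance scores on new edges. The sketch below illustrates that reading only; the function name `enhance_edges`, the non-negative least-squares solver, the `top_k` cutoff, and the normalization are illustrative assumptions, not the authors' published implementation.

```python
import numpy as np
from scipy.optimize import nnls


def enhance_edges(seen_emb, unseen_emb, top_k=5):
    """Illustrative sketch of convex-combination-based graph enhancement.

    Each unseen-class embedding is reconstructed as a non-negative
    (normalized to approximately convex) combination of seen-class
    embeddings; the largest coefficients are kept as weighted edges.

    seen_emb:   (n_seen, d) semantic embeddings of seen classes
    unseen_emb: (n_unseen, d) semantic embeddings of unseen classes
    Returns a list of (unseen_idx, seen_idx, weight) edges.
    """
    edges = []
    A = seen_emb.T                        # (d, n_seen) design matrix
    for u, y in enumerate(unseen_emb):
        coef, _ = nnls(A, y)              # non-negative least squares fit
        if coef.sum() > 0:
            coef = coef / coef.sum()      # normalize weights to sum to 1
        top = np.argsort(coef)[::-1][:top_k]
        for s in top:
            if coef[s] > 0:
                edges.append((u, int(s), float(coef[s])))
    return edges


# Usage with random placeholder embeddings (the real method would use
# LLM/VLM-derived class embeddings).
rng = np.random.default_rng(0)
edges = enhance_edges(rng.normal(size=(100, 64)), rng.normal(size=(10, 64)))
```

The per-edge weights returned here would play the role of the importance scores the abstract mentions; how they enter the training-time regularizer is specific to the paper and not reproduced in this sketch.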