Knowledge Graph Enhancement for Fine-Grained Zero-Shot Learning on ImageNet21K
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-10, Vol. 34 (10), pp. 9090-9101
Format: Article
Language: English
Online access: Order full text
Abstract: Fine-grained Zero-shot Learning (ZSL) on the large-scale dataset ImageNet21K is an important task with promising prospects in many real-world scenarios. One typical solution is to explicitly model the knowledge passing using a Knowledge Graph (KG) to transfer knowledge from seen to unseen instances. By analyzing the hierarchical structure and the word descriptions on ImageNet21K, we find that the noisy semantic information, the sparseness of seen classes, and the lack of supervision of unseen classes make the knowledge passing insufficient, which limits KG-based fine-grained ZSL. To resolve this problem, in this paper we enhance the knowledge passing from three aspects. First, we use more powerful models, such as Large Language Models and Vision-Language Models, to obtain more reliable semantic embeddings. Then we propose a strategy that globally enhances the knowledge graph based on the convex-combination relationships among the semantic embeddings. It effectively adds edges between non-kinship seen and unseen classes that are strongly correlated, while assigning an importance score to each edge. Based on the enhanced knowledge graph, we further present a novel regularizer that locally enhances the knowledge passing during training. We conducted extensive comparative evaluations to demonstrate the advantages of our method over state-of-the-art approaches.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3396215
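
The graph-enhancement step described in the abstract can be read as reconstructing each unseen class's semantic embedding as a (near-)convex combination of seen-class embeddings and using the resulting coefficients as importance scores on new edges. The sketch below illustrates that reading only; the function name `enhance_edges`, the non-negative least-squares solver, the `top_k` cutoff, and the normalization are illustrative assumptions, not the authors' published implementation.

```python
import numpy as np
from scipy.optimize import nnls


def enhance_edges(seen_emb, unseen_emb, top_k=5):
    """Illustrative sketch of convex-combination-based graph enhancement.

    Each unseen-class embedding is reconstructed as a non-negative
    (normalized to approximately convex) combination of seen-class
    embeddings; the largest coefficients are kept as weighted edges.

    seen_emb:   (n_seen, d) semantic embeddings of seen classes
    unseen_emb: (n_unseen, d) semantic embeddings of unseen classes
    Returns a list of (unseen_idx, seen_idx, weight) edges.
    """
    edges = []
    A = seen_emb.T                        # (d, n_seen) design matrix
    for u, y in enumerate(unseen_emb):
        coef, _ = nnls(A, y)              # non-negative least squares fit
        if coef.sum() > 0:
            coef = coef / coef.sum()      # normalize weights to sum to 1
        top = np.argsort(coef)[::-1][:top_k]
        for s in top:
            if coef[s] > 0:
                edges.append((u, int(s), float(coef[s])))
    return edges


# Usage with random placeholder embeddings (the real method would use
# LLM/VLM-derived class embeddings).
rng = np.random.default_rng(0)
edges = enhance_edges(rng.normal(size=(100, 64)), rng.normal(size=(10, 64)))
```

The per-edge weights returned here would play the role of the importance scores the abstract mentions; how they enter the training-time regularizer is specific to the paper and not reproduced in this sketch.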