Dual-aligned Feature Confusion Alleviation for Generalized Zero-shot Learning

Generalized zero-shot learning (GZSL) aims to recognize both seen and unseen samples by leveraging the connections between semantic and visual representations. Recently, a majority of GZSL methods focus on generating visual features for unseen categories conditioned on category-level semantic attrib...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on circuits and systems for video technology 2023-08, Vol.33 (8), p.1-1
Hauptverfasser: Su, Hongzu, Li, Jingjing, Lu, Ke, Zhu, Lei, Shen, Heng Tao
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Generalized zero-shot learning (GZSL) aims to recognize both seen and unseen samples by leveraging the connections between semantic and visual representations. Recently, a majority of GZSL methods focus on generating visual features for unseen categories conditioned on category-level semantic attributes. However, there is a considerable gap between generated features and real unseen features since the generator is trained with only seen samples. The final classifier may get confused by the unfaithful generated features and make misclassification. To alleviate this issue, we propose a dual-aligned feature confusion alleviation (DFCA) framework that simultaneously generates faithful and discriminative features for unseen categories. Specifically, our DFCA attains the faithfulness via a conditional invertible neural network (cINN) and aligns the generated visual features and reconstructed semantic conditions with their real counterparts, respectively. To further encourage distinguishable synthetic features, we learn discriminative category-level semantic conditions for cINN with an attributes mapping layer. To verify the proposed method, we conduct extensive experiments on five widely used benchmarks. Experimental results show that our method outperforms previous state-of-the-arts and successfully alleviates features confusion problem in GZSL. For instance, our method achieves the best performance in terms of seen accuracy, unseen accuracy and harmonic mean accuracy on FLO.
ISSN:1051-8215
1558-2205
DOI:10.1109/TCSVT.2023.3239390