Relation-Specific Feature Augmentation for unbiased scene graph generation

Bibliographic details
Published in: Pattern Recognition, 2025-01, Vol. 157, p. 110936, Article 110936
Main authors: Liu, Zhihong; Wang, Jianji; Chen, Hui; Ma, Yongqiang; Zheng, Nanning
Format: Article
Language: English
Description
Abstract: Scene Graph Generation (SGG) models suffer from the long-tailed distribution of relations, which results in biased predictions that favor head relations (e.g., on) over informative tail ones (e.g., sitting on, laying on, standing on). Existing solutions typically adopt class re-balancing strategies to balance the data distribution. However, they do not fundamentally resolve the lack of information caused by insufficient tail data. To this end, we propose a Relation-Specific Feature Augmentation (RSFA) framework that mitigates the long-tailed bias by augmenting relations in the feature space. To perform augmentation effectively, we design an augmentation scheme and a novel Dual Attention Network (DAN). The augmentation scheme augments each relation uniformly, based on the reciprocal of its number of samples, to avoid over-fitting. By extracting relation-specific information from new object features generated by a Conditional Variational AutoEncoder (CVAE), DAN generates reliable virtual relation representations that provide useful information to guide the optimization of the relation classifier. Extensive ablation studies and comprehensive analysis demonstrate the effectiveness of our method in debiasing. Results on the Visual Genome benchmark show that our method significantly outperforms existing state-of-the-art methods.
• A novel framework to tackle the long-tailed distribution in SGG.
• Relation-specific feature augmentation to introduce helpful information.
• Performance superiority over state-of-the-art methods.
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.110936
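
The abstract mentions an augmentation scheme based on the reciprocal number of samples but does not state a formula. As a rough illustration only, the sketch below shows one possible reading: each relation receives a share of a fixed budget of virtual samples proportional to 1 / (its sample count). The function names, the normalization, and the toy labels are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def reciprocal_weights(labels):
    """Weight each relation by 1 / sample count, normalized to sum to 1.

    Hypothetical helper: the normalization is an assumption, not the paper's formula.
    """
    counts = Counter(labels)
    inverse = {rel: 1.0 / n for rel, n in counts.items()}
    total = sum(inverse.values())
    return {rel: w / total for rel, w in inverse.items()}


def allocate_virtual_samples(labels, budget):
    """Split a budget of virtual samples across relations using the weights above."""
    weights = reciprocal_weights(labels)
    return {rel: round(budget * w) for rel, w in weights.items()}


# Toy long-tailed label set: one head relation and two tail relations.
labels = ["on"] * 8 + ["sitting on"] * 2 + ["standing on"] * 1

weights = reciprocal_weights(labels)
allocation = allocate_virtual_samples(labels, budget=13)
print(allocation)  # {'on': 1, 'sitting on': 4, 'standing on': 8}
```

Under this reading, the rarest relation receives the largest share of virtual samples, which matches the stated goal of supplying information that the tail classes lack. How the virtual features themselves are produced (the CVAE and DAN components) is outside the scope of this sketch.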