Complemental Attention Multi-Feature Fusion Network for Fine-Grained Classification


Bibliographic Details
Published in: IEEE Signal Processing Letters, 2021, Vol. 28, pp. 1983-1987
Authors: Miao, Zhuang; Zhao, Xun; Wang, Jiabao; Li, Yang; Li, Hang
Format: Article
Language: English
Abstract: Transformer-based networks have shown excellent performance in coarse-grained image classification. However, fine-grained image classification remains a challenge, since it requires more significant regional information. As an attention mechanism, the transformer attends to the most significant region while neglecting other sub-significant regions. To exploit more regional information, in this letter we propose a complemental attention multi-feature fusion network (CAMF), which extracts multiple attention features to obtain more effective features. CAMF contains two novel modules: (i) a complemental attention module (CAM), which extracts the most salient attention feature together with its complemental attention feature, and (ii) a multi-feature fusion module (MFM), which uses different branches to extract multiple regional discriminative features. Furthermore, a new feature similarity loss is proposed to measure the diversity of inter-class features. Experiments were conducted on four public fine-grained classification datasets: CAMF achieves 91.2%, 92.8%, 93.3%, and 95.3% accuracy on CUB-200-2011, Stanford Dogs, FGVC-Aircraft, and Stanford Cars, respectively. An ablation study verifies that CAM and MFM focus on more local discriminative regions and improve fine-grained classification performance.
ISSN: 1070-9908, 1558-2361
DOI: 10.1109/LSP.2021.3114622
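The core idea the abstract describes for CAM — keeping the most salient attended region as one feature while pooling the remaining, sub-significant regions into a complemental feature — can be sketched as below. This is only a minimal illustration under assumed shapes; the function name, the top-k split, and the weighted-pooling formulation are hypothetical and not taken from the paper's actual implementation.

```python
import numpy as np

def complemental_attention(features, attention, k=1):
    """Split attention-weighted pooling into a salient and a complemental branch.

    features:  (N, D) array of N regional feature vectors.
    attention: (N,) non-negative attention weights over the regions.
    k:         number of top-attended regions treated as "most salient".
    Returns (salient_feature, complemental_feature), each of shape (D,).
    """
    attention = attention / attention.sum()   # normalize the attention weights
    top = np.argsort(attention)[-k:]          # indices of the top-k regions

    # Salient branch: attention-weighted pooling over the top-k regions only.
    w_sal = np.zeros_like(attention)
    w_sal[top] = attention[top]
    salient = (w_sal / w_sal.sum()) @ features

    # Complemental branch: suppress the top-k regions and re-normalize,
    # so the sub-significant regions drive the second feature.
    w_comp = attention.copy()
    w_comp[top] = 0.0
    comp = (w_comp / w_comp.sum()) @ features
    return salient, comp
```

With identity features and weights concentrated on one region, the salient branch returns that region's vector while the complemental branch returns a re-normalized mix of the others, which is the diversity that the fusion module and the feature similarity loss are then meant to exploit.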