A novel visual classification framework on panoramic attention mechanism network

Bibliographic Details
Published in: IET Computer Vision 2022-09, Vol. 16 (6), p. 479-488
Main authors: Li, Wenshu; Li, Shenhao; Yin, Lingzhi; Guo, Xiaoying; Yang, Xu
Format: Article
Language: English
Online access: Full text
Description
Abstract: Fine-grained classification is a challenging task because of the difficulty of finding discriminative features and localizing feature regions. To address these challenges, a novel visual classification framework based on a panoramic attention mechanism is proposed, which combines multiple attention networks to locate and identify features of greater semantic interest. Firstly, based on a classical convolutional neural network, the global information of the image features is expressed by linear fusion. Secondly, a foreground attention branch is used to further extract the distinguishing details of salient features. Then, more features are mined from the complementary object area through a background attention branch to learn a more complete fine-grained feature representation. Finally, the three network branches are trained jointly to strengthen the network's ability to express representative features of fine-grained images. The model can be viewed as a multi-branch network whose branches benefit from one another and are optimized together. Experiments were conducted on the CUB-200-2011, Stanford Dogs and FGVC-Aircraft datasets, with accuracy as the quantitative measure. The experimental results show that the proposed method achieves the highest accuracy, with an average of 89.8%, and is effective and superior to current state-of-the-art methods.
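
The abstract outlines a three-branch design: a global branch that fuses backbone CNN features, a foreground attention branch focused on salient regions, and a background attention branch covering the complementary object area, all trained jointly. The following is a minimal PyTorch sketch of such a structure, assuming a ResNet-50 backbone, a single 1x1-convolution attention map whose complement feeds the background branch, and equal weighting of the three classification losses; these choices, and all names in the code, are illustrative assumptions rather than the authors' implementation.

# Minimal sketch of a three-branch "panoramic attention" classifier as described
# in the abstract. The ResNet-50 backbone, the attention formulation and the
# equal loss weighting are assumptions for illustration only.
import torch
import torch.nn as nn
import torchvision.models as models


class PanoramicAttentionNet(nn.Module):
    def __init__(self, num_classes: int = 200):
        super().__init__()
        backbone = models.resnet50(weights=None)
        # Shared convolutional feature extractor (everything before pooling/classifier).
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        feat_dim = 2048

        # 1x1 convolution producing a spatial attention map for the foreground branch;
        # the background branch uses its complement (1 - mask).
        self.attn = nn.Conv2d(feat_dim, 1, kernel_size=1)

        self.pool = nn.AdaptiveAvgPool2d(1)
        # One classifier per branch: global, foreground, background.
        self.fc_global = nn.Linear(feat_dim, num_classes)
        self.fc_fg = nn.Linear(feat_dim, num_classes)
        self.fc_bg = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        f = self.features(x)                          # (B, 2048, H, W)
        mask = torch.sigmoid(self.attn(f))            # foreground attention map

        g = self.pool(f).flatten(1)                   # global branch: pooled backbone features
        fg = self.pool(f * mask).flatten(1)           # foreground branch: salient regions
        bg = self.pool(f * (1.0 - mask)).flatten(1)   # background branch: complementary regions

        return self.fc_global(g), self.fc_fg(fg), self.fc_bg(bg)


if __name__ == "__main__":
    model = PanoramicAttentionNet(num_classes=200)    # e.g. CUB-200-2011
    images = torch.randn(2, 3, 448, 448)
    labels = torch.randint(0, 200, (2,))
    logits_g, logits_fg, logits_bg = model(images)
    # Joint training: sum the three cross-entropy losses so the branches are optimized together.
    loss = sum(nn.functional.cross_entropy(l, labels)
               for l in (logits_g, logits_fg, logits_bg))
    print(logits_g.shape, loss.item())

At inference time, the three branch predictions could be averaged; how the paper actually combines them is not stated in the abstract.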
ISSN:1751-9632
1751-9640
DOI:10.1049/cvi2.12105