Grouping Bilinear Pooling for Fine-Grained Image Classification

Bibliographic details
Published in: Applied sciences 2022-05, Vol. 12 (10), p. 5063
Main authors: Zeng, Rui; He, Jingsong
Format: Article
Language: English
Description
Summary: Fine-grained image classification is a challenging computer vision task because inter-class variations are small and intra-class variations are large. Extracting an expressive feature representation is an effective way to improve classification accuracy. Bilinear pooling is a simple and effective high-order feature interaction method. Compared with common pooling methods, bilinear pooling obtains a better feature representation by capturing complex associations between high-order features. However, the bilinear representation often has hundreds of thousands or even millions of dimensions. To obtain a compact bilinear representation, this paper proposes grouping bilinear pooling (GBP) for fine-grained image classification. GBP first divides the feature layers into groups and then performs intra-group or inter-group bilinear pooling. With the same backbone, the representation captured by GBP achieves the same accuracy as the full bilinear representation while using less than 0.4% of its parameters. This extremely compact representation largely overcomes the high redundancy, computational cost, and storage consumption of the full bilinear representation. Moreover, because GBP compresses the bilinear representation so strongly, it can be used with more powerful backbones as a plug-and-play module. Experiments on the widely used fine-grained recognition datasets CUB and Stanford Cars demonstrate the effectiveness of GBP.
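
The summary describes splitting the features into groups and pooling within each group. The NumPy sketch below illustrates only that general idea and how it shrinks the output dimension; it is not the authors' implementation. The function names, the channel count of 512, the 28 x 28 spatial grid, and the choice of 16 groups are all assumptions made for illustration, and the sketch covers the intra-group variant only.

```python
import numpy as np


def full_bilinear_pool(features):
    """Full bilinear pooling (illustrative).

    features: array of shape (C, N), C channels over N spatial locations.
    Returns a vector of length C * C.
    """
    _, n = features.shape
    # Average of the per-location outer products, flattened.
    return (features @ features.T / n).reshape(-1)


def intra_group_bilinear_pool(features, num_groups):
    """Intra-group bilinear pooling (hypothetical sketch, not the paper's code).

    Splits the C channels into num_groups equal groups and applies
    bilinear pooling inside each group only.
    Returns a vector of length C * C / num_groups.
    """
    c, n = features.shape
    if c % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    groups = features.reshape(num_groups, c // num_groups, n)
    # One (C/G x C/G) pooled outer product per group.
    pooled = np.einsum("gin,gjn->gij", groups, groups) / n
    return pooled.reshape(-1)


# Assumed sizes: 512 channels on a 28 x 28 feature map, 16 groups.
rng = np.random.default_rng(0)
feats = rng.standard_normal((512, 28 * 28))

full = full_bilinear_pool(feats)
grouped = intra_group_bilinear_pool(feats, num_groups=16)
print(full.shape, grouped.shape)
```

Under these assumed sizes the full representation has 512 x 512 = 262,144 values, while the grouped one has 16 x 32 x 32 = 16,384, which corresponds to keeping only the diagonal blocks of the full bilinear matrix. The sketch makes no attempt to reproduce the parameter ratio reported in the summary.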
ISSN:2076-3417
DOI:10.3390/app12105063