The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification
Published in: IEEE Transactions on Image Processing, 2020-01, Vol. 29, pp. 4683-4695
Format: Article
Language: English
Abstract: The key to solving fine-grained image categorization is finding discriminative local regions that correspond to subtle visual traits. Great strides have been made, with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show that it is possible to cultivate subtle details without the need for overly complicated network designs or training mechanisms - a single loss is all it takes. The main trick lies with how we delve into individual feature channels early on, as opposed to the convention of starting from a consolidated feature map. The proposed loss function, termed the mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component. The discriminality component forces all feature channels belonging to the same class to be discriminative, through a novel channel-wise attention mechanism. The diversity component additionally constrains channels so that they become mutually exclusive across the spatial dimension. The end result is therefore a set of feature channels, each of which reflects a different locally discriminative region for a specific class. The MC-Loss can be trained end-to-end, without the need for any bounding-box/part annotations, and yields highly discriminative regions during inference. Experimental results show that our MC-Loss, when implemented on top of common base networks, can achieve state-of-the-art performance on all four fine-grained categorization datasets (CUB-Birds, FGVC-Aircraft, Flowers-102, and Stanford Cars). Ablation studies further demonstrate the superiority of the MC-Loss when compared with other recently proposed general-purpose losses for visual classification, on two different base networks. Code is available at: https://github.com/dongliangchang/Mutual-Channel-Loss.
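The two components described in the abstract can be sketched compactly. Below is a minimal PyTorch illustration of the idea, not the authors' reference implementation: the per-class channel count `xi`, the one-channel-per-group masking, and the weight `mu` are illustrative assumptions; consult the official repository (https://github.com/dongliangchang/Mutual-Channel-Loss) for the exact formulation used in the paper.

```python
import torch
import torch.nn.functional as F


def mc_loss(feat, labels, num_classes, xi=3, mu=0.005):
    """feat: (B, num_classes * xi, H, W) feature maps; labels: (B,) class indices.

    Illustrative sketch of a mutual-channel-style loss; xi and mu are assumed values.
    """
    b, c, h, w = feat.shape
    assert c == num_classes * xi
    grouped = feat.reshape(b, num_classes, xi, h, w)

    # Discriminality component.
    # Channel-wise attention: randomly mask one channel inside each class-specific
    # group, so every remaining channel must carry class evidence on its own.
    mask = torch.ones(b, num_classes, xi, device=feat.device)
    drop = torch.randint(0, xi, (b, num_classes), device=feat.device)
    mask.scatter_(2, drop.unsqueeze(-1), 0.0)
    masked = grouped * mask.view(b, num_classes, xi, 1, 1)

    # Cross-channel max pooling within each group, then global average pooling,
    # yields one logit per class, trained with ordinary cross-entropy.
    logits = masked.max(dim=2).values.mean(dim=(2, 3))   # (B, num_classes)
    l_dis = F.cross_entropy(logits, labels)

    # Diversity component.
    # A spatial softmax turns each channel into a distribution over locations;
    # max-pooling across a group and summing is larger when the channels peak at
    # different locations, i.e. when they are mutually exclusive spatially.
    spatial = F.softmax(grouped.reshape(b, num_classes, xi, h * w), dim=-1)
    l_div = spatial.max(dim=2).values.sum(dim=-1).mean()

    # Total: minimize the discriminality term while rewarding spatial diversity.
    return l_dis - mu * l_div
```

In training, a term like this would typically be added to the standard cross-entropy loss on the network's classification head, with the feature maps taken from the last convolutional layer of the base network.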
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2020.2973812