Task-Related Saliency for Few-Shot Image Classification
Published in: | IEEE Transactions on Neural Networks and Learning Systems 2024-08, Vol.35 (8), p.10751-10763 |
---|---|
Format: | Article |
Language: | eng |
Abstract: | A weakness of existing metric-based few-shot classification methods is that task-unrelated objects or backgrounds may mislead the model, since the small number of samples in the support set is insufficient to reveal the task-related targets. An essential cue from human cognition in few-shot classification is that people can recognize the task-related targets at a glimpse of the support images without being distracted by task-unrelated things. Thus, we propose to explicitly learn task-related saliency features and use them in the metric-based few-shot learning schema. We divide the task into three phases: modeling, analyzing, and matching. In the modeling phase, we introduce a saliency sensitive module (SSM), which is an inexact supervision task jointly trained with a standard multiclass classification task. SSM not only enhances the fine-grained representation of the feature embedding but also locates the task-related saliency features. Meanwhile, we propose a self-training-based task-related saliency network (TRSN), a lightweight network that distills the task-related saliency produced by SSM. In the analyzing phase, we freeze TRSN and use it to handle novel tasks. TRSN extracts task-relevant features while suppressing the disturbing task-unrelated features. We can therefore discriminate samples accurately in the matching phase by strengthening the task-related features. We conduct extensive experiments on five-way 1-shot and 5-shot settings to evaluate the proposed method. Results show that our method achieves consistent performance gains on benchmarks and state-of-the-art results. |
---|---|
ISSN: | 2162-237X 2162-2388 |
DOI: | 10.1109/TNNLS.2023.3243903 |
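
The abstract describes strengthening task-related features before metric-based matching. The sketch below illustrates that general idea only and is not the authors' SSM or TRSN: it assumes each image is already represented as a list of per-location feature vectors, and that some saliency model has supplied one non-negative weight per location. The function names (`weighted_pool`, `build_prototypes`, `classify`), the toy three-dimensional features, and the use of cosine similarity to class prototypes are all assumptions made for illustration.

```python
import math


def weighted_pool(local_feats, saliency):
    """Pool per-location feature vectors into one embedding.

    local_feats: list of equal-length feature vectors, one per location.
    saliency: list of non-negative weights, one per location (hypothetical
    output of a saliency model). Falls back to plain averaging when all
    weights are zero.
    """
    total = float(sum(saliency))
    if total <= 0:
        saliency = [1.0] * len(local_feats)
        total = float(len(local_feats))
    dim = len(local_feats[0])
    return [
        sum(w * f[d] for w, f in zip(saliency, local_feats)) / total
        for d in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def build_prototypes(support):
    """Average the pooled embeddings of each class.

    support: dict mapping a class label to a list of pooled embeddings.
    """
    protos = {}
    for label, embs in support.items():
        dim = len(embs[0])
        protos[label] = [sum(e[d] for e in embs) / len(embs) for d in range(dim)]
    return protos


def classify(query_emb, protos):
    """Return the label whose prototype is most similar to the query."""
    return max(protos, key=lambda label: cosine(query_emb, protos[label]))


# Toy 2-way 1-shot task. Feature axes (assumed): cat-like, dog-like, background.
cat_img = [[1, 0, 0], [0, 0, 1]]             # object location, background location
dog_img = [[0, 1, 0], [0, 0, 1]]
query_img = [[1, 0, 0], [0, 0, 1], [0, 0, 1]]  # one cat location, two background

support = {
    "cat": [weighted_pool(cat_img, [0.9, 0.1])],
    "dog": [weighted_pool(dog_img, [0.9, 0.1])],
}
protos = build_prototypes(support)
predicted = classify(weighted_pool(query_img, [0.8, 0.1, 0.1]), protos)
```

Down-weighting the background locations keeps the shared background axis from dominating the pooled embeddings, so the query is matched on the object feature rather than on what the images have in common.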