Improved local-feature-based few-shot learning with Sinkhorn metrics
Published in: | International journal of machine learning and cybernetics 2022-04, Vol.13 (4), p.1099-1114 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Local-feature-based Few-Shot Learning (FSL) has attracted a lot of attention and achieved great progress recently. Given an image, the model extracts a group of local features through a Fully Convolutional Network (FCN), each of which contains information from the corresponding receptive field of the image. The challenge is how to exploit the local-feature-level similarities to generate the image-level similarity. To this end, many existing works have proposed different heuristic rules or settings. In this paper, we first follow existing works and systematically propose several modified methods for local feature matching, induced by a novel and improved heterogeneous matching mechanism. However, these heuristic methods are not optimal for highlighting the most informative local feature pairs to represent the image-level similarity, and they also cannot generalize well to different tasks. Therefore, we propose a new idea called Sinkhorn Metrics (SM). We formulate local-feature-based FSL as a Regularized Optimal Transport (ROT) problem. The cost matrix is formed by the similarities of local feature pairs. The marginals, which indicate the importance of each local feature, are obtained by a new attentive cross-comparison module. The optimal transportation plan is used as weights to aggregate all the local-feature-level similarities into the image-level similarity. We exploit the Sinkhorn algorithm to solve the ROT problem, which is efficient for end-to-end training. We conduct a hybrid experiment on SM with some heuristic baselines to demonstrate its compatibility. Extensive ablation studies are performed to fully evaluate important hyper-parameters and settings. Our method achieves state-of-the-art results on multiple datasets in both the single-domain and cross-domain FSL scenarios (the code for evaluation, trained model, and datasets in this study are available at https://github.com/Wangduo428/few-shot-learning-SM). |
ISSN: | 1868-8071 1868-808X |
DOI: | 10.1007/s13042-021-01437-y |
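
The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sinkhorn_plan` and `image_similarity`, the cost definition (1 minus cosine similarity), the uniform marginals (standing in for the attentive cross-comparison module, which the record does not specify), and the values of `eps` and `n_iters` are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_plan(cost, r, c, eps=0.1, n_iters=200):
    """Approximate an entropy-regularized transport plan by Sinkhorn scaling.

    cost: (n, m) cost matrix; r: (n,) row marginal; c: (m,) column marginal.
    eps and n_iters are illustrative defaults, not values from the paper.
    """
    K = np.exp(-cost / eps)           # Gibbs kernel of the cost matrix
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(n_iters):
        u = r / (K @ v)               # rescale rows toward marginal r
        v = c / (K.T @ u)             # rescale columns toward marginal c
    return u[:, None] * K * v[None, :]


def image_similarity(feats_a, feats_b, r=None, c=None, eps=0.1):
    """Aggregate local-feature similarities into one image-level score.

    feats_a: (n, d) and feats_b: (m, d) local features of two images.
    Assumption: cost = 1 - cosine similarity; uniform marginals are used
    when none are given (a placeholder for learned importance weights).
    """
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T                     # (n, m) pairwise cosine similarities
    cost = 1.0 - sim
    n, m = sim.shape
    r = np.full(n, 1.0 / n) if r is None else r
    c = np.full(m, 1.0 / m) if c is None else c
    plan = sinkhorn_plan(cost, r, c, eps=eps)
    # The transport plan acts as weights over local feature pairs.
    return float((plan * sim).sum()), plan
```

Calling `image_similarity(a, b)` on two arrays of local features returns the weighted score together with the transport plan, whose row and column sums approximate the chosen marginals. A trainable version would need a differentiable framework and learned marginals, which this sketch leaves out.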