Cornerstone network with feature extractor: a metric-based few-shot model for chinese natural sign language
StandardChinese natural sign language (CNSL) contains over 8,000 words. We consider dividing the task of CNSL recognition into multiple subtasks. Few-shot learning on subtasks can achieve minimal acquisition cost and short-term training. However, the existing few-shot learning methods do not take in...
Gespeichert in:
Veröffentlicht in: | Applied intelligence (Dordrecht, Netherlands) Netherlands), 2021-10, Vol.51 (10), p.7139-7150 |
---|---|
Hauptverfasser: | , , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | StandardChinese natural sign language (CNSL) contains over 8,000 words. We consider dividing the task of CNSL recognition into multiple subtasks. Few-shot learning on subtasks can achieve minimal acquisition cost and short-term training. However, the existing few-shot learning methods do not take into account the impact of ill-conditioned support samples, so we propose a new metric-based model, Cornerstone Network (CN), to complete the subtasks. CN is mainly composed of feature extractor (optional), embedding network and cornerstone generator. The cornerstone generator is designed as a semi-supervised clusterer. Compared with other metric-based few-shot models, CN without feature extractor improves 5-shot accuracy on Omniglot and miniImageNet. In order to verify the feasibility of our model on the task of CNSL recognition, we expanded the Chinese Natural Sign Language database, from CNSL-80 to CNSL-139, which integrates surface electromyography and inertial signals. The 5-shot accuracy on CNSL-139 increases from 65.25% to 68.83% comparing with the state-of-art model. After connecting with the 1-D convolution feature extractor using Siamese Network’s idea for secondary training, the accuracy increases by 10.38%. During the online test, the feature vector norms are used for selective matching. Although the accuracy drops, it is still at least 5% higher than that without feature extractor. Experimental results confirm the effectiveness of our model on 2-D images and 1-D time-series signals and the improvement of real-time recognition by SM. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-020-02170-9 |