Active zero-shot learning: a novel approach to extreme multi-labeled classification
Published in: | International journal of data science and analytics 2017-05, Vol.3 (3), p.151-160 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Big data arrive in huge volumes, at great speed, and in many formats, with an extremely large number of labels and concepts to be modeled and predicted, as in text and image tagging, online advertisement placement, recommendation systems, and NLP. This emerging problem is termed “extreme multi-labeled classification” (XMLC) and is challenging because of the time, space, and sample complexity of training and testing predictive models. We first define general XMLC and then categorize and review recent methods based on two specific forms of XMLC. We propose a novel method, called active zero-shot learning, to reduce these complexities. Since the performance of unseen-class prediction depends largely on the seen classes that have labeled data, we challenge the critical yet often overlooked assumption that labeled data can only be acquired passively. We propose a new learning paradigm that aims to accurately predict a large number of unseen labels using labeled data from only a small, intelligently selected set of seed classes, with the help of external knowledge. We further demonstrate that the proposed strategy has desirable probabilistic properties that facilitate unseen-class prediction. Experiments on four datasets demonstrate that the proposed algorithm is superior to a wide spectrum of baselines. Based on our findings, we point out several critical and promising future directions in XMLC. |
---|---|
ISSN: | 2364-415X 2364-4168 |
DOI: | 10.1007/s41060-017-0042-5 |