Instance-based Domain Adaptation via Multiclustering Logistic Approximation

Bibliographic Details
Published in:	IEEE Intelligent Systems 2018-01, Vol. 33 (1), pp. 78-88
Main authors:	Xu, Feng; Yu, Jianfei; Xia, Rui
Format:	Article
Language:	English
Subjects:	Adaptation; Adaptation models; Affective computing; Approximation; Artificial intelligence; Biological system modeling; Clustering; Data mining; Feature extraction; Instance adaptation; Internet/Web technologies; Logistics; Machine learning; Mathematical analysis; Multiclustering logistic approximation; Multidistributional training data; Portable computers; Sentiment analysis; Statistical models; Training; Training data

Description
Summary:	With the explosive growth of online text on the Internet, we can now easily collect a large amount of labeled training data from different source domains. However, a basic assumption in building statistical machine learning models for sentiment analysis is that the training and test data must be drawn from the same distribution. Directly training a statistical model usually results in poor performance when the training and test data have different distributions. Faced with massive labeled data from different domains, it is therefore important to identify the source-domain training instances that are closely relevant to the target domain and to make better use of them. In this work, we propose a new approach, called multiclustering logistic approximation (MLA), to address this problem. In MLA, we adapt the source-domain training data to the target domain via a framework of multiclustering logistic approximation. Experimental results demonstrate that MLA has significant advantages over the state-of-the-art instance adaptation methods, especially in the scenario of multidistributional training data.
ISSN:	1541-1672 1941-1294
DOI:	10.1109/MIS.2018.012001555