Sentiment analysis leveraging emotions and word embeddings
Published in: | Expert systems with applications 2017-03, Vol.69, p.214-224 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | • A flexible, generic methodology for the sentiment prediction of written documents. • The methodology can be easily customized for any language. • Hybrid approach combining Word2Vec and Bag-of-Words representations. • Applied to online user reviews in both Greek and English. • Improved accuracy and efficiency compared with other existing approaches.
Sentiment analysis and opinion mining are valuable for extracting useful subjective information from text documents. These tasks have become very important, especially for business and marketing professionals, since product and service reviews posted online influence markets and consumer shifts. This work is motivated by the fact that automating the retrieval and detection of sentiments expressed about certain products and services involves complex processes and poses research challenges, owing to textual phenomena and language-specific variations in expression. This paper proposes a fast, flexible, generic methodology for detecting sentiment in textual snippets that express people’s opinions in different languages. The proposed methodology adopts a machine learning approach in which textual documents are represented by vectors that are used to train a polarity classification model. Several approaches to document vector representation have been studied, including lexicon-based, word embedding-based and hybrid vectorizations. The competence of these feature representations for the sentiment classification task is assessed through experiments on four datasets containing online user reviews in both Greek and English, chosen to represent highly and weakly inflected language groups. The proposed methodology requires minimal computational resources and may therefore have an impact in real-world scenarios where resources are limited. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2016.10.043 |
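
The hybrid representation mentioned in the summary (a bag-of-words part concatenated with a word-embedding part, then fed to a polarity classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the two-dimensional embeddings, the training sentences, and all function names below are hypothetical toy values chosen for illustration, and a simple nearest-centroid rule stands in for whatever classifier the paper actually uses.

```python
from math import sqrt

# Hypothetical toy data: neither the lexicon nor the embeddings come from the paper.
LEXICON = ["good", "great", "bad", "awful"]
EMBEDDINGS = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [-0.9, 0.1],
    "awful": [-0.8, 0.3],
    "movie": [0.0, 0.9],
    "food": [0.1, 0.8],
}
DIM = 2  # dimensionality of the toy embeddings


def bow_vector(tokens):
    """Count how often each lexicon word occurs in the document."""
    return [float(tokens.count(word)) for word in LEXICON]


def mean_embedding(tokens):
    """Average the embeddings of known tokens; all zeros if none are known."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * DIM
    return [sum(v[i] for v in known) / len(known) for i in range(DIM)]


def hybrid_vector(text):
    """Concatenate the bag-of-words part and the embedding part."""
    tokens = text.lower().split()
    return bow_vector(tokens) + mean_embedding(tokens)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(labeled_docs):
    """Nearest-centroid stand-in for a polarity classifier (an assumption)."""
    by_label = {}
    for text, label in labeled_docs:
        by_label.setdefault(label, []).append(hybrid_vector(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, text):
    """Return the label whose centroid is closest to the document vector."""
    vec = hybrid_vector(text)
    return min(model, key=lambda label: distance(model[label], vec))


TRAIN = [
    ("great movie", "positive"),
    ("good food", "positive"),
    ("awful movie", "negative"),
    ("bad food", "negative"),
]
model = train(TRAIN)
```

With this toy model, `predict(model, "good movie")` returns `"positive"` and `predict(model, "bad movie")` returns `"negative"`. In practice the embedding part would come from a trained Word2Vec model and the classifier from a standard machine learning library; both are outside the scope of this sketch.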