Development of a Chinese opinion-mining system for application to Internet online forums
Published in: | The Journal of Supercomputing 2017-07, Vol.73 (7), p.2987-3001 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Articles posted on a forum often contain new Internet words related to opinion elements (feature words and opinion words). Consequently, existing Chinese opinion-mining systems may exhibit low recall and precision because they cannot recognize these new Internet words. Therefore, we propose a simple algorithm to elaborate on the opinion elements of such articles by extracting the opinion elements. Moreover, when an opinion word is combined with a specific word or concatenated with another opinion word, it may cause a change in the polarity or meaning of the opinion. This fact is prone to cause difficulties by changing the polarity or meaning of certain opinion elements, leading to errors in the analysis results of the Chinese system. We designed three algorithms with context dependency to address this problem. In this paper, we develop a semi-automatic Chinese opinion-mining system with these algorithms to extract these new opinion elements. Then, we determine whether the new word identified through manual judgment is a useful opinion element for a specific domain and add it to the thesaurus. In comparison with semi-automatic annotation methods, our approach can save considerable labor. After a 20-month follow-up analysis, the experimental data indicated that the precision, recall, and F1 of the system reached 84.0 %, 89.4 %, and 0.865, respectively. |
---|---|
ISSN: | 0920-8542 1573-0484 |
DOI: | 10.1007/s11227-016-1816-6 |
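
The summary reports precision, recall, and F1 together. As a minimal sketch, assuming the paper uses the standard definition of F1 as the harmonic mean of precision and recall (the record itself does not state the formula), the relationship between the three reported figures can be checked as follows. The function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed here)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the summary, converted from percent to fractions.
reported_precision = 0.840
reported_recall = 0.894

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

Under this assumption the rounded inputs give about 0.866, close to the reported 0.865; the small gap is presumably due to rounding of the published precision and recall.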