Towards building a high-quality microblog-specific Chinese sentiment lexicon
Due to the huge popularity of microblogging services, microblogs have become important sources of customer opinions. Sentiment analysis systems can provide useful knowledge to decision support systems and decision makers by aggregating and summarizing the opinions in massive microblogs automatically...
Published in: | Decision Support Systems 2016-07, Vol.87, p.39-49 |
---|---|
Main authors: | Wu, Fangzhao; Huang, Yongfeng; Song, Yangqiu; Liu, Shixia |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 49 |
---|---|
container_issue | |
container_start_page | 39 |
container_title | Decision Support Systems |
container_volume | 87 |
creator | Wu, Fangzhao; Huang, Yongfeng; Song, Yangqiu; Liu, Shixia |
description | Due to the huge popularity of microblogging services, microblogs have become important sources of customer opinions. Sentiment analysis systems can provide useful knowledge to decision support systems and decision makers by automatically aggregating and summarizing the opinions in massive numbers of microblogs. The most important component of a sentiment analysis system is the sentiment lexicon. However, the performance of traditional sentiment lexicons on microblog sentiment analysis is far from satisfactory, especially for Chinese. In this paper, we propose a data-driven approach to build a high-quality microblog-specific sentiment lexicon for Chinese microblog sentiment analysis systems. The core of our method is a unified framework that incorporates three kinds of sentiment knowledge for sentiment lexicon construction: the word-sentiment knowledge extracted from microblogs with emoticons, the sentiment similarity knowledge extracted from words' associations among all the messages, and the prior sentiment knowledge extracted from existing sentiment lexicons. In addition, to improve the coverage of our sentiment lexicon, we propose an effective method to detect popular new words in microblogs, which considers not only words' distributions over texts but also their distributions over users. The detected new words with strong sentiment are incorporated into our sentiment lexicon. We built a microblog-specific Chinese sentiment lexicon on a large microblog dataset with more than 17 million messages. Experimental results on two microblog sentiment datasets show that our microblog-specific sentiment lexicon can significantly improve the performance of microblog sentiment analysis.
• An effective and efficient method to detect popular user-invented new words in Chinese microblogs. • Three kinds of heterogeneous sentiment knowledge are extracted for building the sentiment lexicon. • A unified framework incorporating various kinds of sentiment knowledge for microblog-specific sentiment lexicon construction. • Our microblog-specific sentiment lexicon outperforms existing sentiment lexicons. |
doi_str_mv | 10.1016/j.dss.2016.04.007 |
format | Article |
fullrecord | Raw XML record (source ID: proquest_cross; record ID: TN_cdi_proquest_miscellaneous_1825543122; source format: XML; record type: article). Fields not repeated elsewhere in this table: EISSN 1873-5797; CODEN DSSYDK; publisher Amsterdam: Elsevier B.V; rights 2016, Copyright Elsevier Sequoia S.A. Jul 2016; peer reviewed; total pages 11; Elsevier ID S0167923616300604; ProQuest ID 1799875087. |
fulltext | fulltext |
identifier | ISSN: 0167-9236 |
ispartof | Decision Support Systems, 2016-07, Vol.87, p.39-49 |
issn | 0167-9236; 1873-5797 |
language | eng |
recordid | cdi_proquest_miscellaneous_1825543122 |
source | Access via ScienceDirect (Elsevier) |
subjects | Construction; Customer services; Data mining; Decision making; Decision support systems; Messages; Microblog; Sentiment analysis; Sentiment lexicon; Similarity; Social networks; Studies; Texts |
title | Towards building a high-quality microblog-specific Chinese sentiment lexicon |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-18T08%3A32%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20building%20a%20high-quality%20microblog-specific%20Chinese%20sentiment%20lexicon&rft.jtitle=Decision%20Support%20Systems&rft.au=Wu,%20Fangzhao&rft.date=2016-07&rft.volume=87&rft.spage=39&rft.epage=49&rft.pages=39-49&rft.issn=0167-9236&rft.eissn=1873-5797&rft.coden=DSSYDK&rft_id=info:doi/10.1016/j.dss.2016.04.007&rft_dat=%3Cproquest_cross%3E4102321821%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1799875087&rft_id=info:pmid/&rft_els_id=S0167923616300604&rfr_iscdi=true |
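The resolver link in the url row carries the citation as percent-encoded key/value pairs (keys such as rft.jtitle, rft.spage, rft_id). As a minimal sketch, assuming only Python's standard library, the fields can be read back out as shown here. The function name openurl_fields is hypothetical, and the URL is a shortened copy of the one in the record, keeping only a few of its parameters.

```python
from urllib.parse import urlsplit, parse_qs


def openurl_fields(url):
    """Return the decoded query parameters of a resolver link as key -> list of values.

    Values are lists because a key (for example rft_id) may occur more than once.
    """
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


# Shortened copy of the link from the url row; most parameters are omitted here.
url = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=Decision%20Support%20Systems"
    "&rft.au=Wu,%20Fangzhao"
    "&rft.date=2016-07&rft.volume=87&rft.spage=39&rft.epage=49"
    "&rft_id=info:doi/10.1016/j.dss.2016.04.007"
)

fields = openurl_fields(url)
print(fields["rft.jtitle"][0])                      # journal title
print(fields["rft.spage"][0], fields["rft.epage"][0])  # first and last page
print(fields["rft_id"][0])                          # DOI identifier
```

Keeping every value as a list avoids silently dropping repeated keys; the full link in the record repeats rft_id several times, some of them with empty values.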