Sentiment Analysis of Short Informal Texts

We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text...

Bibliographic details
Published in:	The Journal of artificial intelligence research 2014-08, Vol.50, p.723-762
Main authors:	Kiritchenko, S.; Zhu, X.; Mohammad, S. M.
Format:	Article
Language:	English
Subjects:	Ablation; Artificial intelligence; Data mining; Sentiment analysis; Short message service
Online access:	Full text

container_end_page	762
container_issue
container_start_page	723
container_title	The Journal of artificial intelligence research
container_volume	50
creator	Kiritchenko, S.; Zhu, X.; Mohammad, S. M.
description	We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task 'Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.
doi_str_mv	10.1613/jair.4272
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2554099146</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2554099146</sourcerecordid><originalsourceid>FETCH-LOGICAL-c238t-4694e2b8b4b9eccb2ae6b38376d64cfc1b3c25c6e9e16fbd37e71755aba600583</originalsourceid><addsrcrecordid>eNpNkE1LAzEYhIMoWKsH_8GCJ4Wt-c7mWIofhYKH1nNI0je4y-6mJluw_95d6sHLzByGYXgQuid4QSRhz42t04JTRS_QjGAlS62EuvyXr9FNzg3GRHNazdDTFvqh7kYplr1tT7nORQzF9iumoVj3IabOtsUOfoZ8i66CbTPc_fkcfb6-7Fbv5ebjbb1abkpPWTWUXGoO1FWOOw3eO2pBOlYxJfeS--CJY54KL0EDkcHtmQJFlBDWWYmxqNgcPZx3Dyl-HyEPponHNJ7LhgrBsdaEy7H1eG75FHNOEMwh1Z1NJ0OwmVCYCYWZULBfz0BQrw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2554099146</pqid></control><display><type>article</type><title>Sentiment Analysis of Short Informal Texts</title><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Free E- Journals</source><creator>Kiritchenko, S. ; Zhu, X. ; Mohammad, S. M.</creator><creatorcontrib>Kiritchenko, S. ; Zhu, X. ; Mohammad, S. M.</creatorcontrib><description>We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. 
The system ranked first in the SemEval-2013 shared task `Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.</description><identifier>ISSN: 1076-9757</identifier><identifier>EISSN: 1076-9757</identifier><identifier>EISSN: 1943-5037</identifier><identifier>DOI: 10.1613/jair.4272</identifier><language>eng</language><publisher>San Francisco: AI Access Foundation</publisher><subject>Ablation ; Artificial intelligence ; Data mining ; Sentiment analysis ; Short message service</subject><ispartof>The Journal of artificial intelligence research, 2014-08, Vol.50, p.723-762</ispartof><rights>2014. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the associated terms available at https://www.jair.org/index.php/jair/about</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c238t-4694e2b8b4b9eccb2ae6b38376d64cfc1b3c25c6e9e16fbd37e71755aba600583</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,860,27901,27902</link.rule.ids></links><search><creatorcontrib>Kiritchenko, S.</creatorcontrib><creatorcontrib>Zhu, X.</creatorcontrib><creatorcontrib>Mohammad, S. 
M.</creatorcontrib><title>Sentiment Analysis of Short Informal Texts</title><title>The Journal of artificial intelligence research</title><description>We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task `Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. 
The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.</description><subject>Ablation</subject><subject>Artificial intelligence</subject><subject>Data mining</subject><subject>Sentiment analysis</subject><subject>Short message service</subject><issn>1076-9757</issn><issn>1076-9757</issn><issn>1943-5037</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNpNkE1LAzEYhIMoWKsH_8GCJ4Wt-c7mWIofhYKH1nNI0je4y-6mJluw_95d6sHLzByGYXgQuid4QSRhz42t04JTRS_QjGAlS62EuvyXr9FNzg3GRHNazdDTFvqh7kYplr1tT7nORQzF9iumoVj3IabOtsUOfoZ8i66CbTPc_fkcfb6-7Fbv5ebjbb1abkpPWTWUXGoO1FWOOw3eO2pBOlYxJfeS--CJY54KL0EDkcHtmQJFlBDWWYmxqNgcPZx3Dyl-HyEPponHNJ7LhgrBsdaEy7H1eG75FHNOEMwh1Z1NJ0OwmVCYCYWZULBfz0BQrw</recordid><startdate>20140820</startdate><enddate>20140820</enddate><creator>Kiritchenko, S.</creator><creator>Zhu, X.</creator><creator>Mohammad, S. M.</creator><general>AI Access Foundation</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope></search><sort><creationdate>20140820</creationdate><title>Sentiment Analysis of Short Informal Texts</title><author>Kiritchenko, S. ; Zhu, X. ; Mohammad, S. 
M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c238t-4694e2b8b4b9eccb2ae6b38376d64cfc1b3c25c6e9e16fbd37e71755aba600583</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Ablation</topic><topic>Artificial intelligence</topic><topic>Data mining</topic><topic>Sentiment analysis</topic><topic>Short message service</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kiritchenko, S.</creatorcontrib><creatorcontrib>Zhu, X.</creatorcontrib><creatorcontrib>Mohammad, S. M.</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>The Journal of artificial intelligence research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kiritchenko, S.</au><au>Zhu, 
X.</au><au>Mohammad, S. M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Sentiment Analysis of Short Informal Texts</atitle><jtitle>The Journal of artificial intelligence research</jtitle><date>2014-08-20</date><risdate>2014</risdate><volume>50</volume><spage>723</spage><epage>762</epage><pages>723-762</pages><issn>1076-9757</issn><eissn>1076-9757</eissn><eissn>1943-5037</eissn><abstract>We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task `Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.</abstract><cop>San Francisco</cop><pub>AI Access Foundation</pub><doi>10.1613/jair.4272</doi><tpages>40</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1076-9757
ispartof	The Journal of artificial intelligence research, 2014-08, Vol.50, p.723-762
issn	1076-9757 1076-9757 1943-5037
language	eng
recordid	cdi_proquest_journals_2554099146
source	DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Free E- Journals
subjects	Ablation; Artificial intelligence; Data mining; Sentiment analysis; Short message service
title	Sentiment Analysis of Short Informal Texts
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T01%3A34%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sentiment%20Analysis%20of%20Short%20Informal%20Texts&rft.jtitle=The%20Journal%20of%20artificial%20intelligence%20research&rft.au=Kiritchenko,%20S.&rft.date=2014-08-20&rft.volume=50&rft.spage=723&rft.epage=762&rft.pages=723-762&rft.issn=1076-9757&rft.eissn=1076-9757&rft_id=info:doi/10.1613/jair.4272&rft_dat=%3Cproquest_cross%3E2554099146%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2554099146&rft_id=info:pmid/&rfr_iscdi=true