Resources for Indonesian Sentiment Analysis

In this work, we present subjectivity lexicons of positive and negative expressions for Indonesian language created by automatically translating English lexicons. Other variations are created by intersecting or unioning them. We compare the lexicons in the task of predicting sentence polarity on a s...
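
The lexicon operations the abstract mentions can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicons, the whitespace tokenizer, the majority-count decision rule, and the names `intersect`, `union`, `predict`, and `precision_recall` are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: a lexicon maps a polarity label to a set of words.
Lexicon = Dict[str, Set[str]]


def intersect(a: Lexicon, b: Lexicon) -> Lexicon:
    """Keep only the entries that both lexicons agree on."""
    return {label: a[label] & b[label] for label in ("pos", "neg")}


def union(a: Lexicon, b: Lexicon) -> Lexicon:
    """Keep every entry that appears in either lexicon."""
    return {label: a[label] | b[label] for label in ("pos", "neg")}


def predict(sentence: str, lexicon: Lexicon) -> str:
    """Label a sentence by counting lexicon hits (hypothetical decision rule)."""
    tokens = sentence.lower().split()  # naive tokenizer; punctuation is not stripped
    pos = sum(t in lexicon["pos"] for t in tokens)
    neg = sum(t in lexicon["neg"] for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def precision_recall(gold: List[str], predicted: List[str]) -> Tuple[float, float]:
    """Precision and recall for the evaluative (non-neutral) vs. neutral decision."""
    tp = sum(g != "neutral" and p != "neutral" for g, p in zip(gold, predicted))
    predicted_evaluative = sum(p != "neutral" for p in predicted)
    gold_evaluative = sum(g != "neutral" for g in gold)
    precision = tp / predicted_evaluative if predicted_evaluative else 0.0
    recall = tp / gold_evaluative if gold_evaluative else 0.0
    return precision, recall


# Toy lexicons invented for this example; they are not taken from the article.
lex_a: Lexicon = {"pos": {"bagus", "senang"}, "neg": {"buruk"}}
lex_b: Lexicon = {"pos": {"bagus"}, "neg": {"buruk", "jelek"}}

lex_union = union(lex_a, lex_b)
lex_intersection = intersect(lex_a, lex_b)
```

In this toy setup the union lexicon labels "Saya senang" as positive, while the intersection lexicon, which no longer contains "senang", returns neutral. The union covers more words and the intersection keeps only entries both sources agree on, which is the usual reason for building both variants.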

Detailed description

Bibliographic details
Published in: Prague bulletin of mathematical linguistics 2015-04, Vol. 103 (1), p. 21-41
Main authors: Franky; Bojar, Ondřej; Veselovská, Kateřina
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 41
container_issue 1
container_start_page 21
container_title Prague bulletin of mathematical linguistics
container_volume 103
creator Franky
Bojar, Ondřej
Veselovská, Kateřina
description In this work, we present subjectivity lexicons of positive and negative expressions for Indonesian language created by automatically translating English lexicons. Other variations are created by intersecting or unioning them. We compare the lexicons in the task of predicting sentence polarity on a set of 446 manually annotated sentences and we also contrast the generic lexicons with a small lexicon extracted directly from the annotated sentences (in a cross-validation setting). We seek for further improvements by assigning weights to lexicon entries and by wrapping the prediction into a machine learning task with a small number of additional features. We observe that lexicons are able to reach high recall but suffer from low precision when predicting whether a sentence is evaluative (positive or negative) or not (neutral). Weighting the lexicons can improve either the recall or the precision but with a comparable decrease in the other measure.
doi_str_mv 10.1515/pralin-2015-0002
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_1697869867</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3750467701</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2131-336ef71f47a4f32151e26f5ec8ef698ddd9d0ed9bc991115d21f6b7685dbb30d3</originalsourceid><addsrcrecordid>eNp1kM1LxDAQxYMouK7ePRY8SnUmadP2IiyLHwsLgh_nkDYT6dJta9Ii-9-bUg978TIzh_ceb36MXSPcYYrpfe90U7cxB0xjAOAnbIE5JDEkkp8e3efswvsdgMyFxAW7fSPfja4iH9nORZvWdC35WrfRO7VDvQ8jWrW6OfjaX7IzqxtPV397yT6fHj_WL_H29XmzXm3jiqPAWAhJNkObZDqxgod2xKVNqcrJyiI3xhQGyBRlVRSImBqOVpaZzFNTlgKMWLKbObd33fdIflC7UDGU8AplkeUhRGZBBbOqcp33jqzqXb3X7qAQ1IREzUjUhERNSILlYbb86GYgZ-jLjYdwHOX_Y0UQGJ77BcFIaUo</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1697869867</pqid></control><display><type>article</type><title>Resources for Indonesian Sentiment Analysis</title><source>EZB Electronic Journals Library</source><creator>Franky ; Bojar, Ondřej ; Veselovská, Kateřina</creator><creatorcontrib>Franky ; Bojar, Ondřej ; Veselovská, Kateřina ; Franky</creatorcontrib><description>In this work, we present subjectivity lexicons of positive and negative expressions for Indonesian language created by automatically translating English lexicons. Other variations are created by intersecting or unioning them. We compare the lexicons in the task of predicting sentence polarity on a set of 446 manually annotated sentences and we also contrast the generic lexicons with a small lexicon extracted directly from the annotated sentences (in a cross-validation setting). We seek for further improvements by assigning weights to lexicon entries and by wrapping the prediction into a machine learning task with a small number of additional features. 
We observe that lexicons are able to reach high recall but suffer from low precision when predicting whether a sentence is evaluative (positive or negative) or not (neutral). Weighting the lexicons can improve either the recall or the precision but with a comparable decrease in the other measure.</description><identifier>ISSN: 1804-0462</identifier><identifier>ISSN: 0032-6585</identifier><identifier>EISSN: 1804-0462</identifier><identifier>DOI: 10.1515/pralin-2015-0002</identifier><language>eng</language><publisher>Prague: De Gruyter Open</publisher><subject>Data mining ; Educational activities ; English language ; Indonesian language ; Lexicon ; Linguistics ; Machine learning ; Machine translation ; Polarity ; Predictions ; Recall ; Seeds ; Sentences ; Sentiment analysis ; Subjectivity ; Translating</subject><ispartof>Prague bulletin of mathematical linguistics, 2015-04, Vol.103 (1), p.21-41</ispartof><rights>Copyright De Gruyter Open Sp. z o.o. 2015</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2131-336ef71f47a4f32151e26f5ec8ef698ddd9d0ed9bc991115d21f6b7685dbb30d3</citedby><cites>FETCH-LOGICAL-c2131-336ef71f47a4f32151e26f5ec8ef698ddd9d0ed9bc991115d21f6b7685dbb30d3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,27929,27930</link.rule.ids></links><search><creatorcontrib>Franky</creatorcontrib><creatorcontrib>Bojar, Ondřej</creatorcontrib><creatorcontrib>Veselovská, Kateřina</creatorcontrib><creatorcontrib>Franky</creatorcontrib><title>Resources for Indonesian Sentiment Analysis</title><title>Prague bulletin of mathematical linguistics</title><description>In this work, we present subjectivity lexicons of positive and negative expressions for Indonesian language created by automatically translating English lexicons. 
Other variations are created by intersecting or unioning them. We compare the lexicons in the task of predicting sentence polarity on a set of 446 manually annotated sentences and we also contrast the generic lexicons with a small lexicon extracted directly from the annotated sentences (in a cross-validation setting). We seek for further improvements by assigning weights to lexicon entries and by wrapping the prediction into a machine learning task with a small number of additional features. We observe that lexicons are able to reach high recall but suffer from low precision when predicting whether a sentence is evaluative (positive or negative) or not (neutral). Weighting the lexicons can improve either the recall or the precision but with a comparable decrease in the other measure.</description><subject>Data mining</subject><subject>Educational activities</subject><subject>English language</subject><subject>Indonesian language</subject><subject>Lexicon</subject><subject>Linguistics</subject><subject>Machine learning</subject><subject>Machine translation</subject><subject>Polarity</subject><subject>Predictions</subject><subject>Recall</subject><subject>Seeds</subject><subject>Sentences</subject><subject>Sentiment 
analysis</subject><subject>Subjectivity</subject><subject>Translating</subject><issn>1804-0462</issn><issn>0032-6585</issn><issn>1804-0462</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AIMQZ</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNp1kM1LxDAQxYMouK7ePRY8SnUmadP2IiyLHwsLgh_nkDYT6dJta9Ii-9-bUg978TIzh_ceb36MXSPcYYrpfe90U7cxB0xjAOAnbIE5JDEkkp8e3efswvsdgMyFxAW7fSPfja4iH9nORZvWdC35WrfRO7VDvQ8jWrW6OfjaX7IzqxtPV397yT6fHj_WL_H29XmzXm3jiqPAWAhJNkObZDqxgod2xKVNqcrJyiI3xhQGyBRlVRSImBqOVpaZzFNTlgKMWLKbObd33fdIflC7UDGU8AplkeUhRGZBBbOqcp33jqzqXb3X7qAQ1IREzUjUhERNSILlYbb86GYgZ-jLjYdwHOX_Y0UQGJ77BcFIaUo</recordid><startdate>20150401</startdate><enddate>20150401</enddate><creator>Franky</creator><creator>Bojar, Ondřej</creator><creator>Veselovská, Kateřina</creator><general>De Gruyter Open</general><general>Institute of Formal and Applied Linguistics</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7T9</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AIMQZ</scope><scope>ALSLI</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BYOGL</scope><scope>CCPQU</scope><scope>CPGLG</scope><scope>CRLPW</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>LIQON</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20150401</creationdate><title>Resources for Indonesian Sentiment Analysis</title><author>Franky ; Bojar, Ondřej ; Veselovská, 
Kateřina</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2131-336ef71f47a4f32151e26f5ec8ef698ddd9d0ed9bc991115d21f6b7685dbb30d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Data mining</topic><topic>Educational activities</topic><topic>English language</topic><topic>Indonesian language</topic><topic>Lexicon</topic><topic>Linguistics</topic><topic>Machine learning</topic><topic>Machine translation</topic><topic>Polarity</topic><topic>Predictions</topic><topic>Recall</topic><topic>Seeds</topic><topic>Sentences</topic><topic>Sentiment analysis</topic><topic>Subjectivity</topic><topic>Translating</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Franky</creatorcontrib><creatorcontrib>Bojar, Ondřej</creatorcontrib><creatorcontrib>Veselovská, Kateřina</creatorcontrib><creatorcontrib>Franky</creatorcontrib><collection>CrossRef</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest One Literature</collection><collection>Social Science Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>East Europe, Central Europe Database</collection><collection>ProQuest One Community College</collection><collection>Linguistics Collection</collection><collection>Linguistics Database</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Engineering 
Collection</collection><collection>One Literature (ProQuest)</collection><collection>Engineering Database</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection><jtitle>Prague bulletin of mathematical linguistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Franky</au><au>Bojar, Ondřej</au><au>Veselovská, Kateřina</au><aucorp>Franky</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Resources for Indonesian Sentiment Analysis</atitle><jtitle>Prague bulletin of mathematical linguistics</jtitle><date>2015-04-01</date><risdate>2015</risdate><volume>103</volume><issue>1</issue><spage>21</spage><epage>41</epage><pages>21-41</pages><issn>1804-0462</issn><issn>0032-6585</issn><eissn>1804-0462</eissn><abstract>In this work, we present subjectivity lexicons of positive and negative expressions for Indonesian language created by automatically translating English lexicons. Other variations are created by intersecting or unioning them. We compare the lexicons in the task of predicting sentence polarity on a set of 446 manually annotated sentences and we also contrast the generic lexicons with a small lexicon extracted directly from the annotated sentences (in a cross-validation setting). We seek for further improvements by assigning weights to lexicon entries and by wrapping the prediction into a machine learning task with a small number of additional features. We observe that lexicons are able to reach high recall but suffer from low precision when predicting whether a sentence is evaluative (positive or negative) or not (neutral). 
Weighting the lexicons can improve either the recall or the precision but with a comparable decrease in the other measure.</abstract><cop>Prague</cop><pub>De Gruyter Open</pub><doi>10.1515/pralin-2015-0002</doi><tpages>21</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1804-0462
ispartof Prague bulletin of mathematical linguistics, 2015-04, Vol.103 (1), p.21-41
issn 1804-0462
0032-6585
1804-0462
language eng
recordid cdi_proquest_journals_1697869867
source EZB Electronic Journals Library
subjects Data mining
Educational activities
English language
Indonesian language
Lexicon
Linguistics
Machine learning
Machine translation
Polarity
Predictions
Recall
Seeds
Sentences
Sentiment analysis
Subjectivity
Translating
title Resources for Indonesian Sentiment Analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T18%3A35%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Resources%20for%20Indonesian%20Sentiment%20Analysis&rft.jtitle=Prague%20bulletin%20of%20mathematical%20linguistics&rft.au=Franky&rft.aucorp=Franky&rft.date=2015-04-01&rft.volume=103&rft.issue=1&rft.spage=21&rft.epage=41&rft.pages=21-41&rft.issn=1804-0462&rft.eissn=1804-0462&rft_id=info:doi/10.1515/pralin-2015-0002&rft_dat=%3Cproquest_cross%3E3750467701%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1697869867&rft_id=info:pmid/&rfr_iscdi=true