Resources for Indonesian Sentiment Analysis
In this work, we present subjectivity lexicons of positive and negative expressions for the Indonesian language, created by automatically translating English lexicons. Other variations are created by taking their intersection or union. We compare the lexicons in the task of predicting sentence polarity on a s...
Published in: | Prague bulletin of mathematical linguistics 2015-04, Vol.103 (1), p.21-41 |
---|---|
Main authors: | Franky; Bojar, Ondřej; Veselovská, Kateřina |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 41 |
---|---|
container_issue | 1 |
container_start_page | 21 |
container_title | Prague bulletin of mathematical linguistics |
container_volume | 103 |
creator | Franky ; Bojar, Ondřej ; Veselovská, Kateřina |
description | In this work, we present subjectivity lexicons of positive and negative expressions for the Indonesian language, created by automatically translating English lexicons. Other variations are created by taking their intersection or union. We compare the lexicons in the task of predicting sentence polarity on a set of 446 manually annotated sentences, and we also contrast the generic lexicons with a small lexicon extracted directly from the annotated sentences (in a cross-validation setting). We seek further improvements by assigning weights to lexicon entries and by wrapping the prediction into a machine learning task with a small number of additional features. We observe that lexicons are able to reach high recall but suffer from low precision when predicting whether a sentence is evaluative (positive or negative) or not (neutral). Weighting the lexicons can improve either recall or precision, but with a comparable decrease in the other measure. |
doi_str_mv | 10.1515/pralin-2015-0002 |
format | Article |
publisher | Prague: De Gruyter Open |
general | De Gruyter Open ; Institute of Formal and Applied Linguistics |
rights | Copyright De Gruyter Open Sp. z o.o. 2015 |
date | 2015-04-01 |
tpages | 21 |
toplevel | peer_reviewed ; online_resources |
oa | free_for_read |
fulltext | fulltext |
identifier | ISSN: 1804-0462 |
ispartof | Prague bulletin of mathematical linguistics, 2015-04, Vol.103 (1), p.21-41 |
issn | ISSN: 1804-0462 ; ISSN: 0032-6585 ; EISSN: 1804-0462 |
language | eng |
recordid | cdi_proquest_journals_1697869867 |
source | EZB Electronic Journals Library |
subjects | Data mining ; Educational activities ; English language ; Indonesian language ; Lexicon ; Linguistics ; Machine learning ; Machine translation ; Polarity ; Predictions ; Recall ; Seeds ; Sentences ; Sentiment analysis ; Subjectivity ; Translating |
title | Resources for Indonesian Sentiment Analysis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T18%3A35%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Resources%20for%20Indonesian%20Sentiment%20Analysis&rft.jtitle=Prague%20bulletin%20of%20mathematical%20linguistics&rft.au=Franky&rft.aucorp=Franky&rft.date=2015-04-01&rft.volume=103&rft.issue=1&rft.spage=21&rft.epage=41&rft.pages=21-41&rft.issn=1804-0462&rft.eissn=1804-0462&rft_id=info:doi/10.1515/pralin-2015-0002&rft_dat=%3Cproquest_cross%3E3750467701%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1697869867&rft_id=info:pmid/&rfr_iscdi=true |