Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data
Tweets are difficult to classify due to their brevity and frequent use of non-standard orthography or slang words. Although several studies have reported highly accurate sentiment classifiers, most have not been tested on Twitter data. Previous research on sentiment interpretation focuse...
Saved in:
Published in: | Ingénierie des systèmes d'Information 2022-02, Vol.27 (1), p.93-100 |
---|---|
Main authors: | Kannan, Eswariah ; Kothamasu, Lakshmi Anusha |
Format: | Article |
Language: | English ; French |
Subjects: | |
Online access: | Full text |
container_end_page | 100 |
---|---|
container_issue | 1 |
container_start_page | 93 |
container_title | Ingénierie des systèmes d'Information |
container_volume | 27 |
creator | Kannan, Eswariah ; Kothamasu, Lakshmi Anusha |
description | Tweets are difficult to classify due to their brevity and frequent use of non-standard orthography or slang words. Although several studies have reported highly accurate sentiment classifiers, most have not been tested on Twitter data. Previous research on sentiment interpretation focused on binary or ternary sentiments in monolingual texts. However, emotions also emerge in bilingual and multilingual texts, and the emotions expressed in today's social media, including microblogs, are different. We combine everyday dialogue and emotionally stimulating text into a balanced dataset with five labels: joy, sadness, anger, fear, and neutral. This entails preparing the datasets and training conventional machine learning models. We categorize tweets using the Bidirectional Encoder Representations from Transformers (BERT) language model, which is pre-trained on plain text rather than tweets, via BERT transfer learning (TensorFlow Keras). In this paper we use Hugging Face's Transformers library to fine-tune a pretrained BERT model for the classification task; we term the resulting model modified BERT (M-BERT). Our M-BERT model achieves an average F1-score of 97.63% across all classes in our taxonomy. We show that combining M-BERT with machine learning methods increases classification accuracy by 24.92% relative to the baseline BERT model. |
doi_str_mv | 10.18280/isi.270111 |
format | Article |
publisher | International Information and Engineering Technology Association (IIETA), Edmonton |
fulltext | fulltext |
identifier | ISSN: 1633-1311 |
ispartof | Ingénierie des systèmes d'Information, 2022-02, Vol.27 (1), p.93-100 |
issn | 1633-1311 2116-7125 |
language | eng ; fre |
recordid | cdi_proquest_journals_2807019897 |
source | EZB-FREE-00999 freely available EZB journals |
subjects | Accuracy ; Algorithms ; Artificial intelligence ; Classification ; Coders ; Data mining ; Datasets ; Emotions ; Feature selection ; Heuristic ; Machine learning ; Sentiment analysis ; Social networks ; Taxonomy ; Texts ; Transformers |
title | Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data |