Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data

Bibliographic details
Published in: Ingénierie des systèmes d'Information 2022-02, Vol.27 (1), p.93-100
Main authors: Kannan, Eswariah; Kothamasu, Lakshmi Anusha
Format: Article
Language: eng ; fre
Subjects:
Online access: Full text
container_end_page 100
container_issue 1
container_start_page 93
container_title Ingénierie des systèmes d'Information
container_volume 27
creator Kannan, Eswariah
Kothamasu, Lakshmi Anusha
description Tweets are difficult to classify due to their brevity and frequent use of non-standard orthography and slang. Although several studies have achieved highly accurate sentiment classification, most have not been tested on Twitter data. Previous research on sentiment interpretation focused on binary or ternary sentiments in monolingual texts. However, emotions also emerge in bilingual and multilingual texts, and the emotions expressed in today's social media, including microblogs, differ from those in earlier corpora. We combine everyday dialogue and emotion-stimulus data to create a balanced dataset with five labels: joy, sad, anger, fear, and neutral. This entails preparing the datasets and training conventional machine learning models. We categorize tweets using the Bidirectional Encoder Representations from Transformers (BERT) language model, which is pre-trained on plain text rather than tweets, via BERT transfer learning (TensorFlow Keras). In this paper we use the HuggingFace transformers library to fine-tune a pretrained BERT model for the classification task; the resulting model is termed modified BERT (M-BERT). Our M-BERT model achieves an average F1-score of 97.63% across all classes in our taxonomy. We show that combining M-BERT with machine learning methods increases classification accuracy by 24.92% relative to the baseline BERT model.
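The abstract describes fine-tuning a pretrained BERT model with the HuggingFace transformers library for five-way emotion classification. Below is a minimal sketch of that setup; the tiny randomly initialized configuration and all hyperparameters are illustrative assumptions chosen so the snippet runs without downloading the actual pretrained checkpoint, and none of them come from the paper itself.

```python
# Sketch of a BERT sequence-classification setup with the HuggingFace
# transformers library. The small random config is a stand-in for the
# pretrained checkpoint the paper fine-tunes.
import torch
from transformers import BertConfig, BertForSequenceClassification

# The five emotion labels from the balanced dataset in the abstract.
labels = ["joy", "sad", "anger", "fear", "neutral"]

# Tiny configuration so the example runs quickly without any download;
# a real experiment would load pretrained weights instead (see below).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=len(labels),
)
model = BertForSequenceClassification(config)

# A dummy batch of token ids; real inputs would come from a BERT tokenizer.
input_ids = torch.randint(0, 100, (4, 16))
outputs = model(input_ids=input_ids, labels=torch.tensor([0, 1, 2, 3]))

print(outputs.logits.shape)  # torch.Size([4, 5]): one logit per emotion label
print(outputs.loss.item())   # cross-entropy loss used during fine-tuning
```

In practice one would replace the random configuration with something like `BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)` and fine-tune on the labeled tweet dataset; the checkpoint name here is an assumption, as the paper does not specify it in the abstract.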
doi_str_mv 10.18280/isi.270111
format Article
publisher Edmonton: International Information and Engineering Technology Association (IIETA)
fulltext fulltext
identifier ISSN: 1633-1311
ispartof Ingénierie des systèmes d'Information, 2022-02, Vol.27 (1), p.93-100
issn 1633-1311
2116-7125
language eng ; fre
recordid cdi_proquest_journals_2807019897
source EZB-FREE-00999 freely available EZB journals
subjects Accuracy
Algorithms
Artificial intelligence
Classification
Coders
Data mining
Datasets
Emotions
Feature selection
Heuristic
Machine learning
Sentiment analysis
Social networks
Taxonomy
Texts
Transformers
title Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T14%3A55%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fine-Tuning%20BERT%20Based%20Approach%20for%20Multi-Class%20Sentiment%20Analysis%20on%20Twitter%20Emotion%20Data&rft.jtitle=Ing%C3%A9nierie%20des%20syst%C3%A8mes%20d'Information&rft.au=Kannan,%20Eswariah&rft.date=2022-02-28&rft.volume=27&rft.issue=1&rft.spage=93&rft.epage=100&rft.pages=93-100&rft.issn=1633-1311&rft.eissn=2116-7125&rft_id=info:doi/10.18280/isi.270111&rft_dat=%3Cproquest_cross%3E2807019897%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2807019897&rft_id=info:pmid/&rfr_iscdi=true