Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data

Bibliographic details
Published in: Ingénierie des systèmes d'Information 2022-02, Vol.27 (1), p.93-100
Main authors: Kannan, Eswariah; Kothamasu, Lakshmi Anusha
Format: Article
Language: eng ; fre
Subjects:
Online access: Full text
container_end_page 100
container_issue 1
container_start_page 93
container_title Ingénierie des systèmes d'Information
container_volume 27
creator Kannan, Eswariah
Kothamasu, Lakshmi Anusha
description Tweets are difficult to classify due to their brevity and frequent use of non-standard orthography and slang. Although several studies have achieved highly accurate sentiment classification, most have not been tested on Twitter data. Previous research on sentiment interpretation focused on binary or ternary sentiments in monolingual texts. However, emotions also emerge in bilingual and multilingual texts, and the emotions expressed in today's social media, including microblogs, differ from those in earlier corpora. We combine everyday dialogue and emotion-stimulus data to create a balanced dataset with five labels: joy, sad, anger, fear, and neutral. This entails preparing the datasets and training conventional machine learning models. We categorize tweets using the Bidirectional Encoder Representations from Transformers (BERT) language model, which is pre-trained on plain text rather than tweets, via BERT transfer learning (TensorFlow Keras). In this paper we use the HuggingFace transformers library to fine-tune a pretrained BERT model for the classification task; the resulting model is termed modified BERT (M-BERT). Our M-BERT model achieves an average F1-score of 97.63% across all classes in our taxonomy. We show that combining M-BERT with machine learning methods increases classification accuracy by 24.92% relative to the baseline BERT model.
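The abstract describes fine-tuning a pretrained BERT model with the HuggingFace transformers library for five-way emotion classification. Below is a minimal sketch of that setup; the tiny randomly initialized configuration and all hyperparameters are illustrative assumptions chosen so the snippet runs without downloading the actual pretrained checkpoint, and none of them come from the paper itself.

```python
# Sketch of a BERT sequence-classification setup with the HuggingFace
# transformers library. The small random config is a stand-in for the
# pretrained checkpoint the paper fine-tunes.
import torch
from transformers import BertConfig, BertForSequenceClassification

# The five emotion labels from the balanced dataset in the abstract.
labels = ["joy", "sad", "anger", "fear", "neutral"]

# Tiny configuration so the example runs quickly without any download;
# a real experiment would load pretrained weights instead (see below).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=len(labels),
)
model = BertForSequenceClassification(config)

# A dummy batch of token ids; real inputs would come from a BERT tokenizer.
input_ids = torch.randint(0, 100, (4, 16))
outputs = model(input_ids=input_ids, labels=torch.tensor([0, 1, 2, 3]))

print(outputs.logits.shape)  # torch.Size([4, 5]): one logit per emotion label
print(outputs.loss.item())   # cross-entropy loss used during fine-tuning
```

In practice one would replace the random configuration with something like `BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)` and fine-tune on the labeled tweet dataset; the checkpoint name here is an assumption, as the paper does not specify it in the abstract.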
doi_str_mv 10.18280/isi.270111
format Article
publisher Edmonton: International Information and Engineering Technology Association (IIETA)
fulltext fulltext
identifier ISSN: 1633-1311
ispartof Ingénierie des systèmes d'Information, 2022-02, Vol.27 (1), p.93-100
issn 1633-1311
2116-7125
language eng ; fre
recordid cdi_proquest_journals_2807019897
source EZB-FREE-00999 freely available EZB journals
subjects Accuracy
Algorithms
Artificial intelligence
Classification
Coders
Data mining
Datasets
Emotions
Feature selection
Heuristic
Machine learning
Sentiment analysis
Social networks
Taxonomy
Texts
Transformers
title Fine-Tuning BERT Based Approach for Multi-Class Sentiment Analysis on Twitter Emotion Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T14%3A55%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fine-Tuning%20BERT%20Based%20Approach%20for%20Multi-Class%20Sentiment%20Analysis%20on%20Twitter%20Emotion%20Data&rft.jtitle=Ing%C3%A9nierie%20des%20syst%C3%A8mes%20d'Information&rft.au=Kannan,%20Eswariah&rft.date=2022-02-28&rft.volume=27&rft.issue=1&rft.spage=93&rft.epage=100&rft.pages=93-100&rft.issn=1633-1311&rft.eissn=2116-7125&rft_id=info:doi/10.18280/isi.270111&rft_dat=%3Cproquest_cross%3E2807019897%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2807019897&rft_id=info:pmid/&rfr_iscdi=true