F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews

Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine le...

Bibliographic details
Published in: KSII transactions on Internet and information systems 2024-02, Vol.18 (2), p.263-283
Main authors: Fengqian Pang, Xi Chen, Letong Li, Xin Xu, Zhiqiang Xing
Format: Article
Language: kor
Subjects: BERT; E-commerce reviews; Focal loss; MixMatch; Sentiment analysis
Online access: Full text
container_end_page 283
container_issue 2
container_start_page 263
container_title KSII transactions on Internet and information systems
container_volume 18
creator Fengqian Pang
Xi Chen
Letong Li
Xin Xu
Zhiqiang Xing
description Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To cope with the large volume of e-commerce reviews, automatic analysis based on machine learning and deep learning is drawing increasing attention. A core task therein is sentiment analysis. However, e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large amount of unlabeled data, i.e., comments without a star rating; and (3) data imbalance caused by the sparsity of negative comments. This paper employs Bidirectional Encoder Representations from Transformers (BERT), one of the best natural language processing models, as the base model. To address these data characteristics, we propose the F_MixBERT framework, which makes more effective use of inconsistent low-quality data and unlabeled data and resolves the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train on the unlabeled and low-quality data using generated pseudo-labels. Meanwhile, data imbalance is resolved by Focal loss, which reduces the contribution of large-scale data and easily identifiable data to the total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.
format Article
fullrecord <record><control><sourceid>kiss_kisti</sourceid><recordid>TN_cdi_kisti_ndsl_JAKO202409557600169</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><kiss_id>4079869</kiss_id><sourcerecordid>4079869</sourcerecordid><originalsourceid>FETCH-LOGICAL-k509-e89efa23cfe518d56a10f5a48b7ce8ab289f754b6737499ad741d2c38292fde73</originalsourceid><addsrcrecordid>eNpNjF1LwzAYhYsoOOZ-gTe58bKQpvn0ro5Wpx2DWbwtafJGwtJWmvmxf29BEW_OOfA8nLNkkSnBU0GEOP-3L5NVjL7DGZGEUykXyUvVbv3XXblvbtEzDEffz4GKQYdT9BFtRwsBvUc_vKJqNDqgeowRuXFCm77TQQ8GLCpTM_Y9TAbQHj48fMar5MLpEGH128ukqcpm_ZDWu_vNuqjTA8MqBanAaZIbByyTlnGdYcc0lZ0wIHVHpHKC0Y6LXFCltBU0s8TkkijiLIh8mdz83B58PPp2sDG0j8XTjmBCsWJMcIwzrmbv-s-L7dvkez2dWoqFkjP9BkQ3Vd8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing</creator><creatorcontrib>Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing</creatorcontrib><description>Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. 
This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.</description><identifier>ISSN: 1976-7277</identifier><identifier>EISSN: 1976-7277</identifier><language>kor</language><publisher>한국인터넷정보학회</publisher><subject>BERT ; E-commerce reviews ; Focal loss ; MixMatch ; Sentiment analysis</subject><ispartof>KSII transactions on Internet and information systems, 2024-02, Vol.18 (2), p.263-283</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885</link.rule.ids></links><search><creatorcontrib>Fengqian Pang</creatorcontrib><creatorcontrib>Xi Chen</creatorcontrib><creatorcontrib>Letong Li</creatorcontrib><creatorcontrib>Xin Xu</creatorcontrib><creatorcontrib>Zhiqiang Xing</creatorcontrib><title>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</title><title>KSII transactions on Internet and information systems</title><addtitle>KSII Transactions on Internet and Information Systems (TIIS)</addtitle><description>Users' comments 
after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. 
Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.</description><subject>BERT</subject><subject>E-commerce reviews</subject><subject>Focal loss</subject><subject>MixMatch</subject><subject>Sentiment analysis</subject><issn>1976-7277</issn><issn>1976-7277</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>JDI</sourceid><recordid>eNpNjF1LwzAYhYsoOOZ-gTe58bKQpvn0ro5Wpx2DWbwtafJGwtJWmvmxf29BEW_OOfA8nLNkkSnBU0GEOP-3L5NVjL7DGZGEUykXyUvVbv3XXblvbtEzDEffz4GKQYdT9BFtRwsBvUc_vKJqNDqgeowRuXFCm77TQQ8GLCpTM_Y9TAbQHj48fMar5MLpEGH128ukqcpm_ZDWu_vNuqjTA8MqBanAaZIbByyTlnGdYcc0lZ0wIHVHpHKC0Y6LXFCltBU0s8TkkijiLIh8mdz83B58PPp2sDG0j8XTjmBCsWJMcIwzrmbv-s-L7dvkez2dWoqFkjP9BkQ3Vd8</recordid><startdate>20240228</startdate><enddate>20240228</enddate><creator>Fengqian Pang</creator><creator>Xi Chen</creator><creator>Letong Li</creator><creator>Xin Xu</creator><creator>Zhiqiang Xing</creator><general>한국인터넷정보학회</general><scope>HZB</scope><scope>Q5X</scope><scope>JDI</scope></search><sort><creationdate>20240228</creationdate><title>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</title><author>Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-k509-e89efa23cfe518d56a10f5a48b7ce8ab289f754b6737499ad741d2c38292fde73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>kor</language><creationdate>2024</creationdate><topic>BERT</topic><topic>E-commerce reviews</topic><topic>Focal loss</topic><topic>MixMatch</topic><topic>Sentiment analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fengqian Pang</creatorcontrib><creatorcontrib>Xi Chen</creatorcontrib><creatorcontrib>Letong Li</creatorcontrib><creatorcontrib>Xin 
Xu</creatorcontrib><creatorcontrib>Zhiqiang Xing</creatorcontrib><collection>Korea Information Science Society (KISS)</collection><collection>Korean Studies Information Service System (KISS) B-Type</collection><collection>KoreaScience</collection><jtitle>KSII transactions on Internet and information systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fengqian Pang</au><au>Xi Chen</au><au>Letong Li</au><au>Xin Xu</au><au>Zhiqiang Xing</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</atitle><jtitle>KSII transactions on Internet and information systems</jtitle><addtitle>KSII Transactions on Internet and Information Systems (TIIS)</addtitle><date>2024-02-28</date><risdate>2024</risdate><volume>18</volume><issue>2</issue><spage>263</spage><epage>283</epage><pages>263-283</pages><issn>1976-7277</issn><eissn>1976-7277</eissn><abstract>Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. 
According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.</abstract><pub>한국인터넷정보학회</pub><tpages>21</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1976-7277
ispartof KSII transactions on Internet and information systems, 2024-02, Vol.18 (2), p.263-283
issn 1976-7277
1976-7277
language kor
recordid cdi_kisti_ndsl_JAKO202409557600169
source EZB-FREE-00999 freely available EZB journals
subjects BERT
E-commerce reviews
Focal loss
MixMatch
Sentiment analysis
title F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T00%3A10%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-kiss_kisti&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=F_MixBERT:%20Sentiment%20Analysis%20Model%20using%20Focal%20Loss%20for%20Imbalanced%20E-commerce%20Reviews&rft.jtitle=KSII%20transactions%20on%20Internet%20and%20information%20systems&rft.au=Fengqian%20Pang&rft.date=2024-02-28&rft.volume=18&rft.issue=2&rft.spage=263&rft.epage=283&rft.pages=263-283&rft.issn=1976-7277&rft.eissn=1976-7277&rft_id=info:doi/&rft_dat=%3Ckiss_kisti%3E4079869%3C/kiss_kisti%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_kiss_id=4079869&rfr_iscdi=true
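
The description field above says that Focal loss reduces the contribution of easily identifiable examples to the total loss. The following is a minimal sketch of that idea using the standard focal loss form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not taken from the article: the function name `focal_loss`, the default `gamma=2.0`, the optional `alpha` weights, and the example probabilities are all assumptions chosen for illustration, and the authors' exact variant may differ.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for a single example (illustrative sketch, not the paper's code).

    probs  -- predicted class probabilities, assumed to sum to 1
    target -- index of the true class
    gamma  -- focusing parameter; gamma = 0 reduces to plain cross-entropy
    alpha  -- optional per-class weights (hypothetical values, e.g. to
              up-weight a rare negative class)
    """
    # Probability assigned to the true class, clamped to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t) ** gamma shrinks the loss of confidently correct examples.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct ("easy") prediction versus a wrong ("hard") one.
easy = focal_loss([0.05, 0.95], target=1)
hard = focal_loss([0.70, 0.30], target=1)

# With gamma = 0 the same easy example gets the ordinary cross-entropy loss.
ce_easy = focal_loss([0.05, 0.95], target=1, gamma=0.0)

print(f"easy focal loss:    {easy:.6f}")
print(f"hard focal loss:    {hard:.6f}")
print(f"easy cross-entropy: {ce_easy:.6f}")
```

In this sketch the easy example's loss is scaled by (1 - 0.95)^2 relative to cross-entropy, so the hard example dominates the total. Passing a larger `alpha` entry for a rare class is one common way to combine this with class re-weighting.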