F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews
Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine le...
Published in: | KSII transactions on Internet and information systems 2024-02, Vol.18 (2), p.263-283 |
---|---|
Main authors: | Fengqian Pang, Xi Chen, Letong Li, Xin Xu, Zhiqiang Xing |
Format: | Article |
Language: | kor |
Subjects: | BERT, E-commerce reviews, Focal loss, MixMatch, Sentiment analysis |
Online access: | Full text |
container_end_page | 283 |
---|---|
container_issue | 2 |
container_start_page | 263 |
container_title | KSII transactions on Internet and information systems |
container_volume | 18 |
creator | Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing |
description | Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments. |
format | Article |
fullrecord | <record><control><sourceid>kiss_kisti</sourceid><recordid>TN_cdi_kisti_ndsl_JAKO202409557600169</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><kiss_id>4079869</kiss_id><sourcerecordid>4079869</sourcerecordid><originalsourceid>FETCH-LOGICAL-k509-e89efa23cfe518d56a10f5a48b7ce8ab289f754b6737499ad741d2c38292fde73</originalsourceid><addsrcrecordid>eNpNjF1LwzAYhYsoOOZ-gTe58bKQpvn0ro5Wpx2DWbwtafJGwtJWmvmxf29BEW_OOfA8nLNkkSnBU0GEOP-3L5NVjL7DGZGEUykXyUvVbv3XXblvbtEzDEffz4GKQYdT9BFtRwsBvUc_vKJqNDqgeowRuXFCm77TQQ8GLCpTM_Y9TAbQHj48fMar5MLpEGH128ukqcpm_ZDWu_vNuqjTA8MqBanAaZIbByyTlnGdYcc0lZ0wIHVHpHKC0Y6LXFCltBU0s8TkkijiLIh8mdz83B58PPp2sDG0j8XTjmBCsWJMcIwzrmbv-s-L7dvkez2dWoqFkjP9BkQ3Vd8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing</creator><creatorcontrib>Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing</creatorcontrib><description>Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. 
This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.</description><identifier>ISSN: 1976-7277</identifier><identifier>EISSN: 1976-7277</identifier><language>kor</language><publisher>한국인터넷정보학회</publisher><subject>BERT ; E-commerce reviews ; Focal loss ; MixMatch ; Sentiment analysis</subject><ispartof>KSII transactions on Internet and information systems, 2024-02, Vol.18 (2), p.263-283</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885</link.rule.ids></links><search><creatorcontrib>Fengqian Pang</creatorcontrib><creatorcontrib>Xi Chen</creatorcontrib><creatorcontrib>Letong Li</creatorcontrib><creatorcontrib>Xin Xu</creatorcontrib><creatorcontrib>Zhiqiang Xing</creatorcontrib><title>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</title><title>KSII transactions on Internet and information systems</title><addtitle>KSII Transactions on Internet and Information Systems (TIIS)</addtitle><description>Users' comments 
after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. 
Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.</description><subject>BERT</subject><subject>E-commerce reviews</subject><subject>Focal loss</subject><subject>MixMatch</subject><subject>Sentiment analysis</subject><issn>1976-7277</issn><issn>1976-7277</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>JDI</sourceid><recordid>eNpNjF1LwzAYhYsoOOZ-gTe58bKQpvn0ro5Wpx2DWbwtafJGwtJWmvmxf29BEW_OOfA8nLNkkSnBU0GEOP-3L5NVjL7DGZGEUykXyUvVbv3XXblvbtEzDEffz4GKQYdT9BFtRwsBvUc_vKJqNDqgeowRuXFCm77TQQ8GLCpTM_Y9TAbQHj48fMar5MLpEGH128ukqcpm_ZDWu_vNuqjTA8MqBanAaZIbByyTlnGdYcc0lZ0wIHVHpHKC0Y6LXFCltBU0s8TkkijiLIh8mdz83B58PPp2sDG0j8XTjmBCsWJMcIwzrmbv-s-L7dvkez2dWoqFkjP9BkQ3Vd8</recordid><startdate>20240228</startdate><enddate>20240228</enddate><creator>Fengqian Pang</creator><creator>Xi Chen</creator><creator>Letong Li</creator><creator>Xin Xu</creator><creator>Zhiqiang Xing</creator><general>한국인터넷정보학회</general><scope>HZB</scope><scope>Q5X</scope><scope>JDI</scope></search><sort><creationdate>20240228</creationdate><title>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</title><author>Fengqian Pang ; Xi Chen ; Letong Li ; Xin Xu ; Zhiqiang Xing</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-k509-e89efa23cfe518d56a10f5a48b7ce8ab289f754b6737499ad741d2c38292fde73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>kor</language><creationdate>2024</creationdate><topic>BERT</topic><topic>E-commerce reviews</topic><topic>Focal loss</topic><topic>MixMatch</topic><topic>Sentiment analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fengqian Pang</creatorcontrib><creatorcontrib>Xi Chen</creatorcontrib><creatorcontrib>Letong Li</creatorcontrib><creatorcontrib>Xin 
Xu</creatorcontrib><creatorcontrib>Zhiqiang Xing</creatorcontrib><collection>Korea Information Science Society (KISS)</collection><collection>Korean Studies Information Service System (KISS) B-Type</collection><collection>KoreaScience</collection><jtitle>KSII transactions on Internet and information systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fengqian Pang</au><au>Xi Chen</au><au>Letong Li</au><au>Xin Xu</au><au>Zhiqiang Xing</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews</atitle><jtitle>KSII transactions on Internet and information systems</jtitle><addtitle>KSII Transactions on Internet and Information Systems (TIIS)</addtitle><date>2024-02-28</date><risdate>2024</risdate><volume>18</volume><issue>2</issue><spage>263</spage><epage>283</epage><pages>263-283</pages><issn>1976-7277</issn><eissn>1976-7277</eissn><abstract>Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To confront large amounts of e-commerce reviews, automatic analysis based on machine learning and deep learning draws more and more attention. A core task therein is sentiment analysis. However, the e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large number of unlabeled data, i.e., comments without a star rating, and (3) the data imbalance caused by the sparse negative comments. This paper employs Bidirectional Encoder Representation from Transformers (BERT), one of the best natural language processing models, as the base model. 
According to the above data characteristics, we propose the F_MixBERT framework, to more effectively use inconsistently low-quality and unlabeled data and resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT’s high-dimensional vectors to train the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which penalizes the contribution of large-scale data and easily-identifiable data to total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.</abstract><pub>한국인터넷정보학회</pub><tpages>21</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1976-7277 |
ispartof | KSII transactions on Internet and information systems, 2024-02, Vol.18 (2), p.263-283 |
issn | 1976-7277 1976-7277 |
language | kor |
recordid | cdi_kisti_ndsl_JAKO202409557600169 |
source | EZB-FREE-00999 freely available EZB journals |
subjects | BERT ; E-commerce reviews ; Focal loss ; MixMatch ; Sentiment analysis |
title | F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T00%3A10%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-kiss_kisti&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=F_MixBERT:%20Sentiment%20Analysis%20Model%20using%20Focal%20Loss%20for%20Imbalanced%20E-commerce%20Reviews&rft.jtitle=KSII%20transactions%20on%20Internet%20and%20information%20systems&rft.au=Fengqian%20Pang&rft.date=2024-02-28&rft.volume=18&rft.issue=2&rft.spage=263&rft.epage=283&rft.pages=263-283&rft.issn=1976-7277&rft.eissn=1976-7277&rft_id=info:doi/&rft_dat=%3Ckiss_kisti%3E4079869%3C/kiss_kisti%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_kiss_id=4079869&rfr_iscdi=true |
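
The abstract states that focal loss reduces the contribution of abundant and easily identified examples to the total loss. The record does not give the paper's exact formulation or hyperparameters, so the following is only a minimal sketch of the commonly used binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names, the values `gamma=2.0` and `alpha=0.25`, and the choice of which sentiment class is encoded as label 1 are assumptions made for illustration and are not taken from the article.

```python
import math


def cross_entropy(p_pos, label):
    """Plain binary cross-entropy for one example.

    p_pos: predicted probability of the class encoded as 1 (0 < p_pos < 1).
    label: 1 or 0.
    """
    p_t = p_pos if label == 1 else 1.0 - p_pos  # probability of the true class
    return -math.log(p_t)


def focal_loss(p_pos, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example (illustrative defaults, not the paper's).

    gamma: focusing parameter; larger values shrink the loss of
           well-classified examples more strongly.
    alpha: weight for the class encoded as 1; the other class gets 1 - alpha.
    """
    p_t = p_pos if label == 1 else 1.0 - p_pos
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy example (true class predicted with 0.95) versus a hard one (0.30).
    for p in (0.95, 0.30):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With `gamma=0.0` the modulating factor disappears and the function reduces to class-weighted cross-entropy, which makes it easy to compare against a baseline. For `gamma > 0`, the gap between the loss of a hard example and that of an easy example grows relative to plain cross-entropy, which is the behavior the abstract attributes to focal loss.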