A Million Tweets Are Worth a Few Points: Tuning Transformers for Customer Service Tasks
In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus containing customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, applying them on 5 different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in non-English settings.
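The abstract describes domain-adaptive pretraining of a generic multilingual transformer on an in-domain tweet corpus before finetuning on specific end tasks. The sketch below illustrates one common way to set up such continued masked-language-model pretraining with the Hugging Face `transformers` and `datasets` libraries; the checkpoint name (`xlm-roberta-base`), the corpus file name, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: continued (domain-adaptive) MLM pretraining of a generic
# multilingual transformer on an in-domain tweet corpus, followed by
# task-specific finetuning. Model name, file path, and hyperparameters
# are illustrative assumptions, not values reported in the paper.
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

model_name = "xlm-roberta-base"  # assumed generic multilingual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Assumed format: one tweet per line in a plain-text file.
corpus = load_dataset("text", data_files={"train": "customer_service_tweets.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic token masking for the masked-language-model objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mlm-domain-adapted",
    per_device_train_batch_size=32,
    num_train_epochs=3,
    learning_rate=5e-5,
)

Trainer(model=model, args=args,
        train_dataset=tokenized["train"],
        data_collator=collator).train()

# The resulting checkpoint in "mlm-domain-adapted" can then be loaded with
# AutoModelForSequenceClassification (or a similar task head) and finetuned
# on each end task.
```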
Saved in:
Published in: | arXiv.org 2021-04 |
---|---|
Main authors: | Hadifar, Amir; Labat, Sofie; Hoste, Véronique; Develder, Chris; Demeester, Thomas |
Format: | Article |
Language: | eng |
Keywords: | Customer services; Datasets; Digital media; Model testing; Multilingualism; Social networks; Transformers |
Online access: | Full text |
container_title | arXiv.org |
---|---|
creator | Hadifar, Amir; Labat, Sofie; Hoste, Véronique; Develder, Chris; Demeester, Thomas |
description | In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus containing customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, applying them on 5 different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in non-English settings. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-04 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2514889381 |
source | Free E-Journals |
subjects | Customer services; Datasets; Digital media; Model testing; Multilingualism; Social networks; Transformers |
title | A Million Tweets Are Worth a Few Points: Tuning Transformers for Customer Service Tasks |