Performance Analysis of Federated Learning Algorithms for Multilingual Protest News Detection Using Pre-Trained DistilBERT and BERT

Data scientists in the Natural Language Processing (NLP) field confront the challenge of reconciling the need for data-centric analyses with the imperative to safeguard sensitive information, all while managing the substantial costs of collecting training data. In a Federated Learning (FL) system, these challenges can be alleviated by training a global model without centralizing the sensitive data of clients. However, distributed NLP data is usually Non-Independent and Identically Distributed (Non-IID), which degrades the generalizability of a global model trained with Federated Averaging (FedAvg). Recently proposed extensions to FedAvg promise to improve global model performance on Non-IID data, yet such advanced FL algorithms trained on multilingual Non-IID texts have not been studied in detail in industry or academia. This paper compares, for the first time, the FL algorithms FedAvg, FedAvgM, FedYogi, FedAdam and FedAdagrad on a binary text classification task using 12,078 tailored real-world news reports in English, Portuguese, Spanish and Hindi. For this purpose, pre-trained DistilBERT and BERT models fine-tuned on these texts are used. The results show that FedYogi is the most stable and robust FL algorithm when DistilBERT is used, achieving an average macro F1 score of 0.7789 for IID and 0.7755 for Non-IID protest news. The study also shows that BERT models trained with weighted FedAvg and FedAvgM can match the predictive performance of centralized language models, demonstrating the potential of FL in the NLP domain without the need to collect data centrally.
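To make the aggregation rule behind FedAvg concrete, here is a minimal, hypothetical sketch of weighted federated averaging in PyTorch: each client fine-tunes a local copy of the model, and the server replaces the global weights with a sample-count-weighted average of the client weights. This illustrates the baseline rule the paper extends, not the authors' implementation; the function name and the shard sizes in the usage comment are assumptions.

```python
# Hypothetical sketch of weighted FedAvg aggregation (not the paper's code).
from collections import OrderedDict
from typing import List, Tuple

import torch


def fedavg(client_updates: List[Tuple[OrderedDict, int]]) -> OrderedDict:
    """Average client state dicts, weighted by each client's sample count."""
    total = sum(n for _, n in client_updates)
    aggregated = OrderedDict()
    for key in client_updates[0][0]:
        # Weighted sum of this parameter tensor across all clients;
        # .float() guards against integer buffers in the state dict.
        aggregated[key] = sum(
            state[key].float() * (n / total) for state, n in client_updates
        )
    return aggregated


# Usage (assumed shard sizes): the server installs the weighted average of
# three clients' fine-tuned weights as the next global model.
# global_model.load_state_dict(fedavg([(m1.state_dict(), 5200),
#                                      (m2.state_dict(), 3100),
#                                      (m3.state_dict(), 3778)]))
```

FedAvgM, FedYogi, FedAdam and FedAdagrad keep this client-side protocol but apply momentum or adaptive-optimizer updates on the server instead of plain averaging, which is what the paper evaluates on Non-IID splits.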

Detailed description

Saved in:
Bibliographic details
Published in: IEEE Access, 2023, Vol. 11, p. 134009-134022
Main authors: Riedel, Pascal, Reichert, Manfred, Von Schwerin, Reinhold, Hafner, Alexander, Schaudt, Daniel, Singh, Gaurav
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 134022
container_issue
container_start_page 134009
container_title IEEE access
container_volume 11
creator Riedel, Pascal
Reichert, Manfred
Von Schwerin, Reinhold
Hafner, Alexander
Schaudt, Daniel
Singh, Gaurav
description Data scientists in the Natural Language Processing (NLP) field confront the challenge of reconciling the need for data-centric analyses with the imperative to safeguard sensitive information, all while managing the substantial costs of collecting training data. In a Federated Learning (FL) system, these challenges can be alleviated by training a global model without centralizing the sensitive data of clients. However, distributed NLP data is usually Non-Independent and Identically Distributed (Non-IID), which degrades the generalizability of a global model trained with Federated Averaging (FedAvg). Recently proposed extensions to FedAvg promise to improve global model performance on Non-IID data, yet such advanced FL algorithms trained on multilingual Non-IID texts have not been studied in detail in industry or academia. This paper compares, for the first time, the FL algorithms FedAvg, FedAvgM, FedYogi, FedAdam and FedAdagrad on a binary text classification task using 12,078 tailored real-world news reports in English, Portuguese, Spanish and Hindi. For this purpose, pre-trained DistilBERT and BERT models fine-tuned on these texts are used. The results show that FedYogi is the most stable and robust FL algorithm when DistilBERT is used, achieving an average macro F1 score of 0.7789 for IID and 0.7755 for Non-IID protest news. The study also shows that BERT models trained with weighted FedAvg and FedAvgM can match the predictive performance of centralized language models, demonstrating the potential of FL in the NLP domain without the need to collect data centrally.
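For readers who want to see the shape of the per-client task, the following hedged sketch pairs a pre-trained multilingual DistilBERT with the macro F1 metric reported above. The checkpoint name, example texts and labels are illustrative assumptions (the record does not specify the exact checkpoint), and the classification head here is untrained, so the output only demonstrates the pipeline, not the published scores.

```python
# Hedged sketch: multilingual DistilBERT for binary protest classification,
# scored with macro F1. Checkpoint and example data are assumptions.
import torch
from sklearn.metrics import f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

CHECKPOINT = "distilbert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)
model.eval()


def predict(texts):
    """Tokenize news reports and return 0/1 (non-protest/protest) labels."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return logits.argmax(dim=-1).tolist()


# Macro F1 averages per-class F1 scores, so the minority class weighs as much
# as the majority class -- the metric behind the reported 0.7789 / 0.7755.
y_true = [1, 0, 1, 1, 0]
y_pred = predict([
    "Thousands march through Lisbon demanding higher wages",
    "Stock markets close higher after central bank decision",
    "Farmers block highways in nationwide protest",
    "Rail strike brings commuter traffic to a halt",
    "A new art museum opens downtown this weekend",
])
print(f1_score(y_true, y_pred, average="macro"))
```

In the federated setting, each client would run this fine-tuning and evaluation locally and ship only the resulting weights to the server for aggregation.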
doi_str_mv 10.1109/ACCESS.2023.3334910
format Article
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2023, Vol.11, p.134009-134022
issn 2169-3536
2169-3536
language eng
recordid cdi_crossref_primary_10_1109_ACCESS_2023_3334910
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
subjects Algorithms
Analytical models
Computational modeling
Data collection
data distributions
Data models
Distributed databases
distributed learning
federated algorithms
Federated learning
Information management
Machine learning
Multilingualism
Natural language processing
News
optimization
Privacy
Texts
Training
Transformers
title Performance Analysis of Federated Learning Algorithms for Multilingual Protest News Detection Using Pre-Trained DistilBERT and BERT