Performance Analysis of Federated Learning Algorithms for Multilingual Protest News Detection Using Pre-Trained DistilBERT and BERT

Data scientists in the Natural Language Processing (NLP) field confront the challenge of reconciling the need for data-centric analyses with the imperative to safeguard sensitive information, all while managing the substantial costs of collecting training data. In a Federated Learning (FL) system, these challenges can be alleviated by training a global model without centralizing the sensitive data of clients. However, distributed NLP data is usually Non-Independent and Identically Distributed (Non-IID), which degrades the generalizability of a global model trained with Federated Averaging (FedAvg). Recently proposed extensions to FedAvg promise to improve global model performance on Non-IID data, yet such advanced FL algorithms trained on multilingual Non-IID texts have not been studied in detail in industry or academia. This paper compares, for the first time, the FL algorithms FedAvg, FedAvgM, FedYogi, FedAdam and FedAdagrad on a binary text classification task using 12,078 tailored real-world news reports in English, Portuguese, Spanish and Hindi. For this purpose, pre-trained DistilBERT and BERT models fine-tuned on these texts are used. The results show that FedYogi is the most stable and robust FL algorithm when DistilBERT is used, achieving an average macro F1 score of 0.7789 for IID and 0.7755 for Non-IID protest news. The study also shows that BERT models trained with weighted FedAvg and FedAvgM can match the predictive performance of centralized language models, demonstrating the potential of FL in the NLP domain without the need to collect data centrally.
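To make the aggregation rule behind FedAvg concrete, here is a minimal, hypothetical sketch of weighted federated averaging in PyTorch: each client fine-tunes a local copy of the model, and the server replaces the global weights with a sample-count-weighted average of the client weights. This illustrates the baseline rule the paper extends, not the authors' implementation; the function name and the shard sizes in the usage comment are assumptions.

```python
# Hypothetical sketch of weighted FedAvg aggregation (not the paper's code).
from collections import OrderedDict
from typing import List, Tuple

import torch


def fedavg(client_updates: List[Tuple[OrderedDict, int]]) -> OrderedDict:
    """Average client state dicts, weighted by each client's sample count."""
    total = sum(n for _, n in client_updates)
    aggregated = OrderedDict()
    for key in client_updates[0][0]:
        # Weighted sum of this parameter tensor across all clients;
        # .float() guards against integer buffers in the state dict.
        aggregated[key] = sum(
            state[key].float() * (n / total) for state, n in client_updates
        )
    return aggregated


# Usage (assumed shard sizes): the server installs the weighted average of
# three clients' fine-tuned weights as the next global model.
# global_model.load_state_dict(fedavg([(m1.state_dict(), 5200),
#                                      (m2.state_dict(), 3100),
#                                      (m3.state_dict(), 3778)]))
```

FedAvgM, FedYogi, FedAdam and FedAdagrad keep this client-side protocol but apply momentum or adaptive-optimizer updates on the server instead of plain averaging, which is what the paper evaluates on Non-IID splits.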

Detailed description

Saved in:
Bibliographic details
Published in: IEEE Access, 2023, Vol. 11, p. 134009-134022
Main authors: Riedel, Pascal, Reichert, Manfred, Von Schwerin, Reinhold, Hafner, Alexander, Schaudt, Daniel, Singh, Gaurav
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 134022
container_issue
container_start_page 134009
container_title IEEE access
container_volume 11
creator Riedel, Pascal
Reichert, Manfred
Von Schwerin, Reinhold
Hafner, Alexander
Schaudt, Daniel
Singh, Gaurav
description Data scientists in the Natural Language Processing (NLP) field confront the challenge of reconciling the need for data-centric analyses with the imperative to safeguard sensitive information, all while managing the substantial costs of collecting training data. In a Federated Learning (FL) system, these challenges can be alleviated by training a global model without centralizing the sensitive data of clients. However, distributed NLP data is usually Non-Independent and Identically Distributed (Non-IID), which degrades the generalizability of a global model trained with Federated Averaging (FedAvg). Recently proposed extensions to FedAvg promise to improve global model performance on Non-IID data, yet such advanced FL algorithms trained on multilingual Non-IID texts have not been studied in detail in industry or academia. This paper compares, for the first time, the FL algorithms FedAvg, FedAvgM, FedYogi, FedAdam and FedAdagrad on a binary text classification task using 12,078 tailored real-world news reports in English, Portuguese, Spanish and Hindi. For this purpose, pre-trained DistilBERT and BERT models fine-tuned on these texts are used. The results show that FedYogi is the most stable and robust FL algorithm when DistilBERT is used, achieving an average macro F1 score of 0.7789 for IID and 0.7755 for Non-IID protest news. The study also shows that BERT models trained with weighted FedAvg and FedAvgM can match the predictive performance of centralized language models, demonstrating the potential of FL in the NLP domain without the need to collect data centrally.
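For readers who want to see the shape of the per-client task, the following hedged sketch pairs a pre-trained multilingual DistilBERT with the macro F1 metric reported above. The checkpoint name, example texts and labels are illustrative assumptions (the record does not specify the exact checkpoint), and the classification head here is untrained, so the output only demonstrates the pipeline, not the published scores.

```python
# Hedged sketch: multilingual DistilBERT for binary protest classification,
# scored with macro F1. Checkpoint and example data are assumptions.
import torch
from sklearn.metrics import f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

CHECKPOINT = "distilbert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)
model.eval()


def predict(texts):
    """Tokenize news reports and return 0/1 (non-protest/protest) labels."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return logits.argmax(dim=-1).tolist()


# Macro F1 averages per-class F1 scores, so the minority class weighs as much
# as the majority class -- the metric behind the reported 0.7789 / 0.7755.
y_true = [1, 0, 1, 1, 0]
y_pred = predict([
    "Thousands march through Lisbon demanding higher wages",
    "Stock markets close higher after central bank decision",
    "Farmers block highways in nationwide protest",
    "Rail strike brings commuter traffic to a halt",
    "A new art museum opens downtown this weekend",
])
print(f1_score(y_true, y_pred, average="macro"))
```

In the federated setting, each client would run this fine-tuning and evaluation locally and ship only the resulting weights to the server for aggregation.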
doi_str_mv 10.1109/ACCESS.2023.3334910
format Article
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2023, Vol.11, p.134009-134022
issn 2169-3536
2169-3536
language eng
recordid cdi_crossref_primary_10_1109_ACCESS_2023_3334910
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
subjects Algorithms
Analytical models
Computational modeling
Data collection
data distributions
Data models
Distributed databases
distributed learning
federated algorithms
Federated learning
Information management
Machine learning
Multilingualism
Natural language processing
News
optimization
Privacy
Texts
Training
Transformers
title Performance Analysis of Federated Learning Algorithms for Multilingual Protest News Detection Using Pre-Trained DistilBERT and BERT