Performance Analysis of Federated Learning Algorithms for Multilingual Protest News Detection Using Pre-Trained DistilBERT and BERT
Data scientists in the Natural Language Processing (NLP) field confront the challenge of reconciling the necessity for data-centric analyses with the imperative to safeguard sensitive information, all while managing the substantial costs linked to the collection process of training data. In a Federa...
Saved in:
Published in: | IEEE Access 2023, Vol. 11, p. 134009-134022 |
---|---|
Main authors: | Riedel, Pascal; Reichert, Manfred; Von Schwerin, Reinhold; Hafner, Alexander; Schaudt, Daniel; Singh, Gaurav |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 134022 |
---|---|
container_issue | |
container_start_page | 134009 |
container_title | IEEE access |
container_volume | 11 |
creator | Riedel, Pascal; Reichert, Manfred; Von Schwerin, Reinhold; Hafner, Alexander; Schaudt, Daniel; Singh, Gaurav |
description | Data scientists in the Natural Language Processing (NLP) field confront the challenge of reconciling the necessity for data-centric analyses with the imperative to safeguard sensitive information, all while managing the substantial costs linked to collecting training data. In a Federated Learning (FL) system, these challenges can be alleviated by training a global model, eliminating the need to centralize clients' sensitive data. However, distributed NLP data is usually Non-Independent and Identically Distributed (Non-IID), which leads to poorer generalizability of the global model when trained with Federated Averaging (FedAvg). Recently proposed extensions to FedAvg promise to improve global model performance on Non-IID data. Yet, such advanced FL algorithms trained on multilingual Non-IID texts have not been studied in detail in industry or academia. This paper compares, for the first time, the FL algorithms FedAvg, FedAvgM, FedYogi, FedAdam and FedAdagrad on a binary text classification task using 12,078 tailored real-world news reports in English, Portuguese, Spanish and Hindi. For this purpose, pre-trained DistilBERT and BERT models fine-tuned on these texts are used. The results show that FedYogi is the most stable and robust FL algorithm when DistilBERT is used, achieving an average macro F1 score of 0.7789 for IID and 0.7755 for Non-IID protest news. The study also shows that BERT models trained with weighted FedAvg and FedAvgM can achieve predictive power similar to that of centralized language models, demonstrating the potential of leveraging FL in the NLP domain without the need to collect data centrally. |
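To make the aggregation step the abstract refers to concrete, below is a minimal sketch of server-side weighted FedAvg over PyTorch state dicts, together with the macro F1 metric used for evaluation. This is not the authors' implementation: the three-client setup, the shard sizes, and the choice of the `distilbert-base-multilingual-cased` checkpoint are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's code): weighted FedAvg
# aggregation of client fine-tuning results on the server side.
import copy

from sklearn.metrics import f1_score
from transformers import AutoModelForSequenceClassification


def weighted_fedavg(client_states, client_sizes):
    """Average client state dicts, weighting each client by its share of
    the total number of local training examples (FedAvg's weighting)."""
    total = float(sum(client_sizes))
    avg = {}
    for key, ref in client_states[0].items():
        weighted_sum = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
        avg[key] = weighted_sum.to(ref.dtype)  # restore original dtype
    return avg


# Binary protest / non-protest classifier initialised from a pre-trained
# multilingual checkpoint (illustrative choice).
global_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased", num_labels=2
)

# Hypothetical federated round: three clients holding Non-IID shards,
# e.g. news reports split by language (sizes are made up).
client_sizes = [4000, 5000, 3078]
client_states = []
for n in client_sizes:
    local_model = copy.deepcopy(global_model)
    # ... each client would fine-tune local_model on its own texts here ...
    client_states.append(local_model.state_dict())

global_model.load_state_dict(weighted_fedavg(client_states, client_sizes))

# Macro F1 averages per-class F1 scores with equal weight per class,
# which is why it suits imbalanced protest/non-protest labels.
y_true, y_pred = [0, 1, 1, 0, 1], [0, 1, 0, 0, 1]
print(f1_score(y_true, y_pred, average="macro"))
```

The FedAvgM, FedAdam, FedYogi and FedAdagrad variants compared in the paper replace this plain weighted average with a server-side momentum or adaptive-optimizer update applied to the aggregated client delta, which is what is expected to help on Non-IID data.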
doi_str_mv | 10.1109/ACCESS.2023.3334910 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2023, Vol.11, p.134009-134022 |
issn | 2169-3536 (ISSN); 2169-3536 (EISSN) |
language | eng |
recordid | cdi_crossref_primary_10_1109_ACCESS_2023_3334910 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals |
subjects | Algorithms; Analytical models; Computational modeling; Data collection; data distributions; Data models; Distributed databases; distributed learning; federated algorithms; Federated learning; Information management; Machine learning; Multilingualism; Natural language processing; News; optimization; Privacy; Texts; Training; Transformers |
title | Performance Analysis of Federated Learning Algorithms for Multilingual Protest News Detection Using Pre-Trained DistilBERT and BERT |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T14%3A08%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Performance%20Analysis%20of%20Federated%20Learning%20Algorithms%20for%20Multilingual%20Protest%20News%20Detection%20Using%20Pre-Trained%20DistilBERT%20and%20BERT&rft.jtitle=IEEE%20access&rft.au=Riedel,%20Pascal&rft.date=2023&rft.volume=11&rft.spage=134009&rft.epage=134022&rft.pages=134009-134022&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2023.3334910&rft_dat=%3Cproquest_cross%3E2895872396%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2895872396&rft_id=info:pmid/&rft_ieee_id=10330588&rft_doaj_id=oai_doaj_org_article_90f7449be5b94a70840d0ebf60ddd829&rfr_iscdi=true |