Securing a Local Training Dataset Size in Federated Learning
Federated learning (FL) is an emerging paradigm that trains a global machine learning (ML) model on decentralized data held by clients, without sharing that data. Although FL is a more secure way of training models than conventional centralized ML, industries whose training data are primarily personal information, such as MRI images or Electronic Health Records (EHR), must be especially cautious about privacy and security when using FL. For example, unbalanced dataset sizes may reveal meaningful information that leads to security vulnerabilities, even if the clients' training data themselves are never exposed. In this paper, we present a Privacy-Preserving Federated Averaging (PP-FedAvg) protocol, specialized for healthcare settings, that limits leakage of user data privacy in FL. In particular, we protect the sizes of the local datasets, as well as the aggregated local update parameters, through secure computation among clients based on homomorphic encryption. This approach ensures that the server accesses neither the dataset sizes nor the local update parameters while updating the global model. Our protocol has the additional advantage of protecting dataset sizes when the data are not uniformly distributed among clients and when some clients drop out in each iteration.
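The abstract describes the core mechanism: clients homomorphically encrypt both their dataset size n_k and their size-weighted update n_k·w_k, so the server can aggregate ciphertexts and obtain the FedAvg weighted average without ever seeing an individual size or update. The sketch below illustrates that general idea with a single-key Paillier scheme from the python-paillier (`phe`) package; it is a toy illustration of additively homomorphic aggregation, not the paper's actual PP-FedAvg protocol (in which no single party, least of all the server, holds the decryption key). The client data, key length, and helper function are invented for the example.

```python
# Toy sketch: weighted FedAvg over Paillier ciphertexts.
# Assumes the python-paillier package: pip install phe
from phe import paillier

# In the real protocol the decryption capability is distributed among
# clients; a single key pair is used here purely for illustration.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each client holds a local update (a small weight vector here) and a
# private dataset size n_k that must stay hidden from the server.
clients = [
    {"weights": [0.10, 0.20], "n": 50},
    {"weights": [0.30, 0.40], "n": 150},
    {"weights": [0.50, 0.60], "n": 300},
]

def client_message(client):
    """Encrypt n_k and the size-weighted update n_k * w_k."""
    enc_n = public_key.encrypt(client["n"])
    enc_weighted = [public_key.encrypt(client["n"] * w) for w in client["weights"]]
    return enc_n, enc_weighted

# Server side: add ciphertexts component-wise. Paillier addition of
# ciphertexts corresponds to addition of the underlying plaintexts, so
# the server learns neither individual sizes nor individual updates.
messages = [client_message(c) for c in clients]
total_n = messages[0][0]
total_weighted = list(messages[0][1])
for enc_n, enc_weighted in messages[1:]:
    total_n = total_n + enc_n
    total_weighted = [a + b for a, b in zip(total_weighted, enc_weighted)]

# Joint decryption (stand-in): recover sum(n_k) and sum(n_k * w_k), then
# form the FedAvg global update sum(n_k * w_k) / sum(n_k).
n_sum = private_key.decrypt(total_n)
global_update = [private_key.decrypt(c) / n_sum for c in total_weighted]
print(global_update)  # ~ [0.4, 0.5]
```

Dividing the decrypted sum of n_k·w_k by the decrypted sum of n_k reproduces the standard FedAvg weighted average; only these two aggregate values are ever revealed, which is what allows the sizes to stay hidden even when clients drop out of an iteration.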
Saved in:

| Published in: | IEEE Access, 2022, Vol. 10, pp. 104135-104143 |
|---|---|
| Main authors: | Shin, Young Ah; Noh, Geontae; Jeong, Ik Rae; Chun, Ji Young |
| Format: | Article |
| Language: | English |
| Publisher: | IEEE, Piscataway |
| DOI: | 10.1109/ACCESS.2022.3210702 |
| ISSN/EISSN: | 2169-3536 |
| Subjects: | Clients; Computational modeling; Cryptography; Data models; Data privacy; Datasets; Electronic health records; Federated learning; Homomorphic encryption; Iterative methods; Machine learning; Mathematical models; Parameters; Privacy; Privacy-preserving; Security; Servers; Training data; Training dataset |
| Source: | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek (freely accessible e-journals) |
| Online access: | Full text (open access) |