Private Federated Learning for GBDT

Federated Learning (FL) has been an emerging trend in machine learning and artificial intelligence. It allows multiple participants to collaboratively train a better global model and offers a privacy-aware paradigm for model training, since it does not require participants to release their original training data.

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE Transactions on Dependable and Secure Computing, 2024-05, Vol. 21 (3), p. 1274-1285
Main authors: Tian, Zhihua; Zhang, Rui; Hou, Xiaoyang; Lyu, Lingjuan; Zhang, Tianyi; Liu, Jian; Ren, Kui
Format: Article
Language: English
Subjects:
Online access: Order full text
Description: Federated Learning (FL) has been an emerging trend in machine learning and artificial intelligence. It allows multiple participants to collaboratively train a better global model and offers a privacy-aware paradigm for model training, since it does not require participants to release their original training data. However, existing FL solutions for vertically partitioned data or decision trees require heavy cryptographic operations. In this article, we propose a framework named FederBoost for private federated learning of gradient boosting decision trees (GBDT). It supports running GBDT over both vertically and horizontally partitioned data. Vertical FederBoost does not require any cryptographic operation, and horizontal FederBoost only requires lightweight secure aggregation. The key observation is that the whole training process of GBDT relies on the ordering of the data instead of the values. We fully implement FederBoost and evaluate its utility and efficiency through extensive experiments performed on three public datasets. Our experimental results show that both vertical and horizontal FederBoost achieve the same level of accuracy as centralized training, where all data are collected in a central server, and they are 4-5 orders of magnitude faster than the state-of-the-art solutions for federated decision tree training, hence offering practical solutions for industrial applications.
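The key observation above — that GBDT split finding depends only on the ordering of feature values, never their magnitudes — can be illustrated with a short sketch. This is an illustrative toy, not the paper's implementation; the function name `best_split_gain` and the regularization constant `lam` are assumptions, mirroring the standard regularized split gain used in XGBoost-style GBDT.

```python
def best_split_gain(feature, gradients, hessians, lam=1.0):
    """Best gain over all split positions of a single feature, scanning
    candidates in sorted order (standard GBDT split scoring)."""
    order = sorted(range(len(feature)), key=lambda i: feature[i])
    g = [gradients[i] for i in order]
    h = [hessians[i] for i in order]
    G, H = sum(g), sum(h)
    score = lambda gs, hs: gs * gs / (hs + lam)
    best, gl, hl = 0.0, 0.0, 0.0
    for k in range(len(g) - 1):  # candidate split between position k and k+1
        gl += g[k]; hl += h[k]
        best = max(best, score(gl, hl) + score(G - gl, H - hl) - score(G, H))
    return best

feature = [3.2, -1.0, 7.5, 0.4]
grads   = [0.9, -0.3, 0.5, -0.8]
hess    = [1.0,  1.0, 1.0,  1.0]

# Replacing each value by its rank is a monotone transform: it preserves the
# ordering, hence every candidate split partitions the data identically and
# the best gain is unchanged. Parties therefore only need to agree on the
# order (or bucket assignment) of the data, never the raw values.
ranks = [sorted(feature).index(v) for v in feature]
assert best_split_gain(feature, grads, hess) == best_split_gain(ranks, grads, hess)
```

This invariance is why, in a vertical setting, each party can contribute only bucket/order information about its own features without any cryptographic operation.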
DOI: 10.1109/TDSC.2023.3276365
ISSN: 1545-5971
EISSN: 1941-0018
Source: IEEE Electronic Library (IEL)
Subjects: Artificial intelligence
Boosting
Cryptography
Data models
Decision trees
Federated learning
GBDT
Industrial applications
Machine learning
Prediction algorithms
Privacy
Training