Federated Causal Discovery from Heterogeneous Data

Conventional causal discovery methods rely on centralized data, which is inconsistent with the decentralized nature of data in many real-world situations. This discrepancy has motivated the development of federated causal discovery (FCD) approaches. However, existing FCD methods may be limited by th...

Bibliographic Details
Main Authors: Li, Loka, Ng, Ignavier, Luo, Gongxu, Huang, Biwei, Chen, Guangyi, Liu, Tongliang, Gu, Bin, Zhang, Kun
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Li, Loka
Ng, Ignavier
Luo, Gongxu
Huang, Biwei
Chen, Guangyi
Liu, Tongliang
Gu, Bin
Zhang, Kun
description Conventional causal discovery methods rely on centralized data, which is inconsistent with the decentralized nature of data in many real-world situations. This discrepancy has motivated the development of federated causal discovery (FCD) approaches. However, existing FCD methods may be limited by their potentially restrictive assumptions of identifiable functional causal models or homogeneous data distributions, narrowing their applicability in diverse scenarios. In this paper, we propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data. We first utilize a surrogate variable corresponding to the client index to account for the data heterogeneity across different clients. We then develop a federated conditional independence test (FCIT) for causal skeleton discovery and establish a federated independent change principle (FICP) to determine causal directions. These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy. Owing to the nonparametric properties, FCIT and FICP make no assumption about particular functional forms, thereby facilitating the handling of arbitrary causal models. We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method. The code is available at https://github.com/lokali/FedCDH.git.
doi_str_mv 10.48550/arxiv.2402.13241
format Article
fullrecord <record>
  <control>
    <sourceid>arxiv_GOX</sourceid>
    <recordid>TN_cdi_arxiv_primary_2402_13241</recordid>
    <sourceformat>XML</sourceformat>
    <sourcesystem>PC</sourcesystem>
    <sourcerecordid>2402_13241</sourcerecordid>
    <originalsourceid>FETCH-LOGICAL-a671-3ac77488524f1663c26e56c551832b55d4ec885014852d77fb87887fd0caffb33</originalsourceid>
    <addsrcrecordid>eNotzrsKwjAYhuEsDqJegJO5gdack1XqEQSX7uVv8kcKaiWtonfvcfqGFz4eQqac5cppzeaQHs09F4qJnEuh-JCINQZM0GOgBdw6ONFl0_n2julJY2rPdIs9pvaIF2xvHV1CD2MyiHDqcPLfESnXq7LYZvvDZlcs9hkYyzMJ3lrlnBYqcmOkFwa18VpzJ0WtdVDo35Xxt0wEa2PtrHM2BuYhxlrKEZn9br_o6pqaM6Rn9cFXX7x8AWlDPb4</addsrcrecordid>
    <sourcetype>Open Access Repository</sourcetype>
    <iscdi>true</iscdi>
    <recordtype>article</recordtype>
  </control>
  <display>
    <type>article</type>
    <title>Federated Causal Discovery from Heterogeneous Data</title>
    <source>arXiv.org</source>
    <creator>Li, Loka ; Ng, Ignavier ; Luo, Gongxu ; Huang, Biwei ; Chen, Guangyi ; Liu, Tongliang ; Gu, Bin ; Zhang, Kun</creator>
    <creatorcontrib>Li, Loka ; Ng, Ignavier ; Luo, Gongxu ; Huang, Biwei ; Chen, Guangyi ; Liu, Tongliang ; Gu, Bin ; Zhang, Kun</creatorcontrib>
    <description>Conventional causal discovery methods rely on centralized data, which is inconsistent with the decentralized nature of data in many real-world situations. This discrepancy has motivated the development of federated causal discovery (FCD) approaches. However, existing FCD methods may be limited by their potentially restrictive assumptions of identifiable functional causal models or homogeneous data distributions, narrowing their applicability in diverse scenarios. In this paper, we propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data. We first utilize a surrogate variable corresponding to the client index to account for the data heterogeneity across different clients. We then develop a federated conditional independence test (FCIT) for causal skeleton discovery and establish a federated independent change principle (FICP) to determine causal directions. These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy. Owing to the nonparametric properties, FCIT and FICP make no assumption about particular functional forms, thereby facilitating the handling of arbitrary causal models. We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method. The code is available at https://github.com/lokali/FedCDH.git.</description>
    <identifier>DOI: 10.48550/arxiv.2402.13241</identifier>
    <language>eng</language>
    <subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject>
    <creationdate>2024-02</creationdate>
    <rights>http://creativecommons.org/licenses/by-nc-nd/4.0</rights>
    <oa>free_for_read</oa>
    <woscitedreferencessubscribed>false</woscitedreferencessubscribed>
  </display>
  <links>
    <openurl>$$Topenurl_article</openurl>
    <openurlfulltext>$$Topenurlfull_article</openurlfulltext>
    <thumbnail>$$Tsyndetics_thumb_exl</thumbnail>
    <link.rule.ids>228,230,780,885</link.rule.ids>
    <linktorsrc>$$Uhttps://arxiv.org/abs/2402.13241$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc>
    <backlink>$$Uhttps://doi.org/10.48550/arXiv.2402.13241$$DView paper in arXiv$$Hfree_for_read</backlink>
  </links>
  <search>
    <creatorcontrib>Li, Loka</creatorcontrib>
    <creatorcontrib>Ng, Ignavier</creatorcontrib>
    <creatorcontrib>Luo, Gongxu</creatorcontrib>
    <creatorcontrib>Huang, Biwei</creatorcontrib>
    <creatorcontrib>Chen, Guangyi</creatorcontrib>
    <creatorcontrib>Liu, Tongliang</creatorcontrib>
    <creatorcontrib>Gu, Bin</creatorcontrib>
    <creatorcontrib>Zhang, Kun</creatorcontrib>
    <title>Federated Causal Discovery from Heterogeneous Data</title>
    <description>Conventional causal discovery methods rely on centralized data, which is inconsistent with the decentralized nature of data in many real-world situations. This discrepancy has motivated the development of federated causal discovery (FCD) approaches. However, existing FCD methods may be limited by their potentially restrictive assumptions of identifiable functional causal models or homogeneous data distributions, narrowing their applicability in diverse scenarios. In this paper, we propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data. We first utilize a surrogate variable corresponding to the client index to account for the data heterogeneity across different clients. We then develop a federated conditional independence test (FCIT) for causal skeleton discovery and establish a federated independent change principle (FICP) to determine causal directions. These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy. Owing to the nonparametric properties, FCIT and FICP make no assumption about particular functional forms, thereby facilitating the handling of arbitrary causal models. We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method. The code is available at https://github.com/lokali/FedCDH.git.</description>
    <subject>Computer Science - Artificial Intelligence</subject>
    <subject>Computer Science - Learning</subject>
    <fulltext>true</fulltext>
    <rsrctype>article</rsrctype>
    <creationdate>2024</creationdate>
    <recordtype>article</recordtype>
    <sourceid>GOX</sourceid>
    <recordid>eNotzrsKwjAYhuEsDqJegJO5gdack1XqEQSX7uVv8kcKaiWtonfvcfqGFz4eQqac5cppzeaQHs09F4qJnEuh-JCINQZM0GOgBdw6ONFl0_n2julJY2rPdIs9pvaIF2xvHV1CD2MyiHDqcPLfESnXq7LYZvvDZlcs9hkYyzMJ3lrlnBYqcmOkFwa18VpzJ0WtdVDo35Xxt0wEa2PtrHM2BuYhxlrKEZn9br_o6pqaM6Rn9cFXX7x8AWlDPb4</recordid>
    <startdate>20240220</startdate>
    <enddate>20240220</enddate>
    <creator>Li, Loka</creator>
    <creator>Ng, Ignavier</creator>
    <creator>Luo, Gongxu</creator>
    <creator>Huang, Biwei</creator>
    <creator>Chen, Guangyi</creator>
    <creator>Liu, Tongliang</creator>
    <creator>Gu, Bin</creator>
    <creator>Zhang, Kun</creator>
    <scope>AKY</scope>
    <scope>GOX</scope>
  </search>
  <sort>
    <creationdate>20240220</creationdate>
    <title>Federated Causal Discovery from Heterogeneous Data</title>
    <author>Li, Loka ; Ng, Ignavier ; Luo, Gongxu ; Huang, Biwei ; Chen, Guangyi ; Liu, Tongliang ; Gu, Bin ; Zhang, Kun</author>
  </sort>
  <facets>
    <frbrtype>5</frbrtype>
    <frbrgroupid>cdi_FETCH-LOGICAL-a671-3ac77488524f1663c26e56c551832b55d4ec885014852d77fb87887fd0caffb33</frbrgroupid>
    <rsrctype>articles</rsrctype>
    <prefilter>articles</prefilter>
    <language>eng</language>
    <creationdate>2024</creationdate>
    <topic>Computer Science - Artificial Intelligence</topic>
    <topic>Computer Science - Learning</topic>
    <toplevel>online_resources</toplevel>
    <creatorcontrib>Li, Loka</creatorcontrib>
    <creatorcontrib>Ng, Ignavier</creatorcontrib>
    <creatorcontrib>Luo, Gongxu</creatorcontrib>
    <creatorcontrib>Huang, Biwei</creatorcontrib>
    <creatorcontrib>Chen, Guangyi</creatorcontrib>
    <creatorcontrib>Liu, Tongliang</creatorcontrib>
    <creatorcontrib>Gu, Bin</creatorcontrib>
    <creatorcontrib>Zhang, Kun</creatorcontrib>
    <collection>arXiv Computer Science</collection>
    <collection>arXiv.org</collection>
  </facets>
  <delivery>
    <delcategory>Remote Search Resource</delcategory>
    <fulltext>fulltext_linktorsrc</fulltext>
  </delivery>
  <addata>
    <au>Li, Loka</au>
    <au>Ng, Ignavier</au>
    <au>Luo, Gongxu</au>
    <au>Huang, Biwei</au>
    <au>Chen, Guangyi</au>
    <au>Liu, Tongliang</au>
    <au>Gu, Bin</au>
    <au>Zhang, Kun</au>
    <format>journal</format>
    <genre>article</genre>
    <ristype>JOUR</ristype>
    <atitle>Federated Causal Discovery from Heterogeneous Data</atitle>
    <date>2024-02-20</date>
    <risdate>2024</risdate>
    <abstract>Conventional causal discovery methods rely on centralized data, which is inconsistent with the decentralized nature of data in many real-world situations. This discrepancy has motivated the development of federated causal discovery (FCD) approaches. However, existing FCD methods may be limited by their potentially restrictive assumptions of identifiable functional causal models or homogeneous data distributions, narrowing their applicability in diverse scenarios. In this paper, we propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data. We first utilize a surrogate variable corresponding to the client index to account for the data heterogeneity across different clients. We then develop a federated conditional independence test (FCIT) for causal skeleton discovery and establish a federated independent change principle (FICP) to determine causal directions. These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy. Owing to the nonparametric properties, FCIT and FICP make no assumption about particular functional forms, thereby facilitating the handling of arbitrary causal models. We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method. The code is available at https://github.com/lokali/FedCDH.git.</abstract>
    <doi>10.48550/arxiv.2402.13241</doi>
    <oa>free_for_read</oa>
  </addata>
</record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2402.13241
ispartof
issn
language eng
recordid cdi_arxiv_primary_2402_13241
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Learning
title Federated Causal Discovery from Heterogeneous Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T20%3A04%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Federated%20Causal%20Discovery%20from%20Heterogeneous%20Data&rft.au=Li,%20Loka&rft.date=2024-02-20&rft_id=info:doi/10.48550/arxiv.2402.13241&rft_dat=%3Carxiv_GOX%3E2402_13241%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true