Integer Programming for Learning Directed Acyclic Graphs from Continuous Data

Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model which can naturally incorporate a super-structure in order to reduce the set of possible candidate DAGs. We use the penalized negative log-likelihood score function with both $\ell_0$ and $\ell_1$ regularizations and propose a new mixed-integer quadratic optimization (MIQO) model, referred to as a layered network (LN) formulation. The LN formulation is a compact model, which enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only $\ell_1$ regularization. In particular, the LN formulation clearly outperforms existing methods in terms of computational time needed to find an optimal DAG in the presence of a sparse super-structure.
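The abstract's opening claim — that the number of possible DAGs grows superexponentially with the number of nodes — can be verified with Robinson's classic recurrence for counting labeled DAGs. The sketch below is ours, not from the paper; the function name `count_dags` is a hypothetical helper for illustration.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via Robinson's recurrence:
    a(n) = sum_{k=1}^{n} (-1)^(k-1) * C(n,k) * 2^(k(n-k)) * a(n-k), a(0) = 1.
    The sum is over the k nodes with in-degree zero."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k - 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, count_dags(n))  # 1, 3, 25, 543, 29281, 3781503
```

Already at 10 nodes the count exceeds 4 * 10^18, which is why exact search needs the structural restrictions (a super-structure) and tight formulations the paper develops.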

Detailed description

Saved in:
Bibliographic details
Main authors: Manzour, Hasan; Küçükyavuz, Simge; Shojaie, Ali
Format: Article
Language: eng
Subjects:
Online access: Order full text
creator Manzour, Hasan; Küçükyavuz, Simge; Shojaie, Ali
description Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model which can naturally incorporate a super-structure in order to reduce the set of possible candidate DAGs. We use the penalized negative log-likelihood score function with both $\ell_0$ and $\ell_1$ regularizations and propose a new mixed-integer quadratic optimization (MIQO) model, referred to as a layered network (LN) formulation. The LN formulation is a compact model, which enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only $\ell_1$ regularization. In particular, the LN formulation clearly outperforms existing methods in terms of computational time needed to find an optimal DAG in the presence of a sparse super-structure.
doi_str_mv 10.48550/arxiv.1904.10574
format Article
creationdate 2019-04-23
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1904.10574
language eng
recordid cdi_arxiv_primary_1904_10574
source arXiv.org
subjects Computer Science - Discrete Mathematics
Computer Science - Learning
Statistics - Machine Learning
title Integer Programming for Learning Directed Acyclic Graphs from Continuous Data
url https://arxiv.org/abs/1904.10574