Causal Decision Trees
Uncovering causal relationships in data is a major objective of data analytics. Currently, there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. Howev...
Saved in:
Published in: | IEEE transactions on knowledge and data engineering 2017-02, Vol.29 (2), p.257-271 |
---|---|
Main authors: | Li, Jiuyong ; Ma, Saisai ; Le, Thuc ; Liu, Lin ; Liu, Jixue |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 271 |
---|---|
container_issue | 2 |
container_start_page | 257 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 29 |
creator | Li, Jiuyong ; Ma, Saisai ; Le, Thuc ; Liu, Lin ; Liu, Jixue |
description | Uncovering causal relationships in data is a major objective of data analytics. Currently, there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree (CDT) where nodes have causal interpretations. Our method follows a well-established causal inference framework and makes use of a classic statistical test to establish the causal relationship between a predictor variable and the outcome variable. At the same time, by taking the advantages of normal decision trees, a CDT provides a compact graphical representation of the causal relationships, and the construction of a CDT is fast as a result of the divide and conquer strategy employed, making CDTs practical for representing and finding causal signals in large data sets. Experiment results demonstrate that CDTs can identify meaningful causal relationships and the CDT algorithm is scalable. |
doi_str_mv | 10.1109/TKDE.2016.2619350 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_7600471</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7600471</ieee_id><sourcerecordid>1858794767</sourcerecordid><originalsourceid>FETCH-LOGICAL-c336t-7b871ae6511a54a9e2151a62b13cc71aa2b97bd84f73a332917015cdf68f3af43</originalsourceid><addsrcrecordid>eNo9jz1PwzAQhi0EEqUwMiCWSswJPn-dPaK0fIhKLGG2HNeWUpWm2M3AvydRqk73Sve8d3oIeQBaAlDzXH8uVyWjoEqmwHBJL8gMpNQFAwOXQ6YCCsEFXpObnLeUUo0aZuS-cn12u8Uy-Da33X5RpxDyLbmKbpfD3WnOyffrqq7ei_XX20f1si485-pYYKMRXFASwEnhTGAgwSnWAPd-2DjWGGw2WkTkjnNmAClIv4lKR-6i4HPyNN09pO63D_lot12f9sNLC1pqNAIVDhRMlE9dzilEe0jtj0t_Fqgd7e1ob0d7e7IfOo9Tpw0hnHlUlAoE_g8oaFLN</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1858794767</pqid></control><display><type>article</type><title>Causal Decision Trees</title><source>IEEE Electronic Library (IEL)</source><creator>Li, Jiuyong ; Ma, Saisai ; Le, Thuc ; Liu, Lin ; Liu, Jixue</creator><creatorcontrib>Li, Jiuyong ; Ma, Saisai ; Le, Thuc ; Liu, Lin ; Liu, Jixue</creatorcontrib><description>Uncovering causal relationships in data is a major objective of data analytics. Currently, there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree (CDT) where nodes have causal interpretations. Our method follows a well-established causal inference framework and makes use of a classic statistical test to establish the causal relationship between a predictor variable and the outcome variable. 
At the same time, by taking the advantages of normal decision trees, a CDT provides a compact graphical representation of the causal relationships, and the construction of a CDT is fast as a result of the divide and conquer strategy employed, making CDTs practical for representing and finding causal signals in large data sets. Experiment results demonstrate that CDTs can identify meaningful causal relationships and the CDT algorithm is scalable.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2016.2619350</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Bayes methods ; causal relationship ; Classification ; Context ; Data analysis ; Decision tree ; Decision trees ; Graphical representations ; Knowledge engineering ; Mathematical model ; partial association ; potential outcome model ; Remuneration ; Statistical inference ; Statistical tests</subject><ispartof>IEEE transactions on knowledge and data engineering, 2017-02, Vol.29 (2), p.257-271</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2017</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c336t-7b871ae6511a54a9e2151a62b13cc71aa2b97bd84f73a332917015cdf68f3af43</citedby><cites>FETCH-LOGICAL-c336t-7b871ae6511a54a9e2151a62b13cc71aa2b97bd84f73a332917015cdf68f3af43</cites><orcidid>0000-0002-9023-1878</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7600471$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7600471$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Li, Jiuyong</creatorcontrib><creatorcontrib>Ma, Saisai</creatorcontrib><creatorcontrib>Le, Thuc</creatorcontrib><creatorcontrib>Liu, Lin</creatorcontrib><creatorcontrib>Liu, Jixue</creatorcontrib><title>Causal Decision Trees</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>Uncovering causal relationships in data is a major objective of data analytics. Currently, there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree (CDT) where nodes have causal interpretations. Our method follows a well-established causal inference framework and makes use of a classic statistical test to establish the causal relationship between a predictor variable and the outcome variable. 
At the same time, by taking the advantages of normal decision trees, a CDT provides a compact graphical representation of the causal relationships, and the construction of a CDT is fast as a result of the divide and conquer strategy employed, making CDTs practical for representing and finding causal signals in large data sets. Experiment results demonstrate that CDTs can identify meaningful causal relationships and the CDT algorithm is scalable.</description><subject>Algorithms</subject><subject>Bayes methods</subject><subject>causal relationship</subject><subject>Classification</subject><subject>Context</subject><subject>Data analysis</subject><subject>Decision tree</subject><subject>Decision trees</subject><subject>Graphical representations</subject><subject>Knowledge engineering</subject><subject>Mathematical model</subject><subject>partial association</subject><subject>potential outcome model</subject><subject>Remuneration</subject><subject>Statistical inference</subject><subject>Statistical tests</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9jz1PwzAQhi0EEqUwMiCWSswJPn-dPaK0fIhKLGG2HNeWUpWm2M3AvydRqk73Sve8d3oIeQBaAlDzXH8uVyWjoEqmwHBJL8gMpNQFAwOXQ6YCCsEFXpObnLeUUo0aZuS-cn12u8Uy-Da33X5RpxDyLbmKbpfD3WnOyffrqq7ei_XX20f1si485-pYYKMRXFASwEnhTGAgwSnWAPd-2DjWGGw2WkTkjnNmAClIv4lKR-6i4HPyNN09pO63D_lot12f9sNLC1pqNAIVDhRMlE9dzilEe0jtj0t_Fqgd7e1ob0d7e7IfOo9Tpw0hnHlUlAoE_g8oaFLN</recordid><startdate>20170201</startdate><enddate>20170201</enddate><creator>Li, Jiuyong</creator><creator>Ma, Saisai</creator><creator>Le, Thuc</creator><creator>Liu, Lin</creator><creator>Liu, Jixue</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-9023-1878</orcidid></search><sort><creationdate>20170201</creationdate><title>Causal Decision Trees</title><author>Li, Jiuyong ; Ma, Saisai ; Le, Thuc ; Liu, Lin ; Liu, Jixue</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c336t-7b871ae6511a54a9e2151a62b13cc71aa2b97bd84f73a332917015cdf68f3af43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Algorithms</topic><topic>Bayes methods</topic><topic>causal relationship</topic><topic>Classification</topic><topic>Context</topic><topic>Data analysis</topic><topic>Decision tree</topic><topic>Decision trees</topic><topic>Graphical representations</topic><topic>Knowledge engineering</topic><topic>Mathematical model</topic><topic>partial association</topic><topic>potential outcome model</topic><topic>Remuneration</topic><topic>Statistical inference</topic><topic>Statistical tests</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Jiuyong</creatorcontrib><creatorcontrib>Ma, Saisai</creatorcontrib><creatorcontrib>Le, Thuc</creatorcontrib><creatorcontrib>Liu, Lin</creatorcontrib><creatorcontrib>Liu, Jixue</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science 
Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Jiuyong</au><au>Ma, Saisai</au><au>Le, Thuc</au><au>Liu, Lin</au><au>Liu, Jixue</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Causal Decision Trees</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2017-02-01</date><risdate>2017</risdate><volume>29</volume><issue>2</issue><spage>257</spage><epage>271</epage><pages>257-271</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>Uncovering causal relationships in data is a major objective of data analytics. Currently, there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree (CDT) where nodes have causal interpretations. Our method follows a well-established causal inference framework and makes use of a classic statistical test to establish the causal relationship between a predictor variable and the outcome variable. 
At the same time, by taking the advantages of normal decision trees, a CDT provides a compact graphical representation of the causal relationships, and the construction of a CDT is fast as a result of the divide and conquer strategy employed, making CDTs practical for representing and finding causal signals in large data sets. Experiment results demonstrate that CDTs can identify meaningful causal relationships and the CDT algorithm is scalable.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TKDE.2016.2619350</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0002-9023-1878</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2017-02, Vol.29 (2), p.257-271 |
issn | 1041-4347 ; 1558-2191 |
language | eng |
recordid | cdi_ieee_primary_7600471 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Bayes methods ; causal relationship ; Classification ; Context ; Data analysis ; Decision tree ; Decision trees ; Graphical representations ; Knowledge engineering ; Mathematical model ; partial association ; potential outcome model ; Remuneration ; Statistical inference ; Statistical tests |
title | Causal Decision Trees |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-18T01%3A39%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Causal%20Decision%20Trees&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Li,%20Jiuyong&rft.date=2017-02-01&rft.volume=29&rft.issue=2&rft.spage=257&rft.epage=271&rft.pages=257-271&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2016.2619350&rft_dat=%3Cproquest_RIE%3E1858794767%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1858794767&rft_id=info:pmid/&rft_ieee_id=7600471&rfr_iscdi=true |