A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data

In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach prov...

Bibliographic Details
Published in: MIS quarterly 2016-12, Vol.40 (4), p.819-848
Main authors: Yahav, Inbal; Shmueli, Galit; Mani, Deepa
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 848
container_issue 4
container_start_page 819
container_title MIS quarterly
container_volume 40
creator Yahav, Inbal
Shmueli, Galit
Mani, Deepa
description In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.
doi_str_mv 10.25300/MISQ/2016/40.4.02
format Article
fullrecord <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_journals_1842819917</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26629678</jstor_id><sourcerecordid>26629678</sourcerecordid><originalsourceid>FETCH-LOGICAL-c276t-e66373cd6a9e0f46b00dec4f35f38ca9ca6afabe53d986e54eedec52db30f1ee3</originalsourceid><addsrcrecordid>eNo9kF9LwzAUxYMoOKdfQBACPne7Sdq0fezmv8FEpPM5ZOnN1rG1M8kQv72ZE1_uebjn3Hv4EXLLYMQzATB-ndXvYw5MjlMYpSPgZ2TAmeRJmQs4JwPguUzyvBCX5Mr7DQCwnOUDUld04RCTifbY0Gq_d702a2p7R6umceh9261ojVubxIEmtH1H247OdnttAq3DoWnR0682rOmkXdEHHfQ1ubB66_HmT4fk4-lxMX1J5m_Ps2k1T0wsExKUUuTCNFKXCDaVS4AGTWpFZkVhdGm01FYvMRNNWUjMUsS4z3izFGAZohiS-9PdWPrzgD6oTX9wXXypWJHygpUly6OLn1zG9d47tGrv2p1234qB-oWnjvDUEZ5KQaUKeAzdnUIbH3r3n-BS8lJGij-GI2xH</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1842819917</pqid></control><display><type>article</type><title>A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data</title><source>EBSCOhost Business Source Complete</source><source>JSTOR Archive Collection A-Z Listing</source><creator>Yahav, Inbal ; Shmueli, Galit ; Mani, Deepa</creator><creatorcontrib>Yahav, Inbal ; Shmueli, Galit ; Mani, Deepa ; National Tsing Hua University ; Bar Ilan University ; Indian School of Business</creatorcontrib><description>In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. 
In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. 
We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.</description><identifier>ISSN: 0276-7783</identifier><identifier>EISSN: 2162-9730</identifier><identifier>DOI: 10.25300/MISQ/2016/40.4.02</identifier><identifier>CODEN: MISQDP</identifier><language>eng</language><publisher>Minneapolis: Management Information Systems Research Center, University of Minnesota</publisher><subject>Big Data ; Big Data &amp; Analytics in Networked Business ; Electronic government ; Impact analysis ; Information technology ; Intervention ; Studies ; Training</subject><ispartof>MIS quarterly, 2016-12, Vol.40 (4), p.819-848</ispartof><rights>Copyright University of Minnesota, MIS Research Center Dec 2016</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c276t-e66373cd6a9e0f46b00dec4f35f38ca9ca6afabe53d986e54eedec52db30f1ee3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26629678$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26629678$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>315,781,785,804,27929,27930,58022,58255</link.rule.ids></links><search><creatorcontrib>Yahav, Inbal</creatorcontrib><creatorcontrib>Shmueli, Galit</creatorcontrib><creatorcontrib>Mani, Deepa</creatorcontrib><creatorcontrib>National Tsing Hua University</creatorcontrib><creatorcontrib>Bar Ilan University</creatorcontrib><creatorcontrib>Indian School of Business</creatorcontrib><title>A Tree-Based Approach 
for Addressing Self-Selection in Impact Studies with Big Data</title><title>MIS quarterly</title><description>In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. 
We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.</description><subject>Big Data</subject><subject>Big Data &amp; Analytics in Networked Business</subject><subject>Electronic government</subject><subject>Impact analysis</subject><subject>Information technology</subject><subject>Intervention</subject><subject>Studies</subject><subject>Training</subject><issn>0276-7783</issn><issn>2162-9730</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNo9kF9LwzAUxYMoOKdfQBACPne7Sdq0fezmv8FEpPM5ZOnN1rG1M8kQv72ZE1_uebjn3Hv4EXLLYMQzATB-ndXvYw5MjlMYpSPgZ2TAmeRJmQs4JwPguUzyvBCX5Mr7DQCwnOUDUld04RCTifbY0Gq_d702a2p7R6umceh9261ojVubxIEmtH1H247OdnttAq3DoWnR0682rOmkXdEHHfQ1ubB66_HmT4fk4-lxMX1J5m_Ps2k1T0wsExKUUuTCNFKXCDaVS4AGTWpFZkVhdGm01FYvMRNNWUjMUsS4z3izFGAZohiS-9PdWPrzgD6oTX9wXXypWJHygpUly6OLn1zG9d47tGrv2p1234qB-oWnjvDUEZ5KQaUKeAzdnUIbH3r3n-BS8lJGij-GI2xH</recordid><startdate>20161201</startdate><enddate>20161201</enddate><creator>Yahav, Inbal</creator><creator>Shmueli, Galit</creator><creator>Mani, Deepa</creator><general>Management Information Systems Research Center, University of Minnesota</general><general>University of Minnesota, MIS Research Center</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope></search><sort><creationdate>20161201</creationdate><title>A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data</title><author>Yahav, Inbal ; Shmueli, Galit ; Mani, 
Deepa</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c276t-e66373cd6a9e0f46b00dec4f35f38ca9ca6afabe53d986e54eedec52db30f1ee3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Big Data</topic><topic>Big Data &amp; Analytics in Networked Business</topic><topic>Electronic government</topic><topic>Impact analysis</topic><topic>Information technology</topic><topic>Intervention</topic><topic>Studies</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yahav, Inbal</creatorcontrib><creatorcontrib>Shmueli, Galit</creatorcontrib><creatorcontrib>Mani, Deepa</creatorcontrib><creatorcontrib>National Tsing Hua University</creatorcontrib><creatorcontrib>Bar Ilan University</creatorcontrib><creatorcontrib>Indian School of Business</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>MIS quarterly</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yahav, Inbal</au><au>Shmueli, Galit</au><au>Mani, Deepa</au><aucorp>National Tsing Hua University</aucorp><aucorp>Bar Ilan University</aucorp><aucorp>Indian School of Business</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data</atitle><jtitle>MIS quarterly</jtitle><date>2016-12-01</date><risdate>2016</risdate><volume>40</volume><issue>4</issue><spage>819</spage><epage>848</epage><pages>819-848</pages><issn>0276-7783</issn><eissn>2162-9730</eissn><coden>MISQDP</coden><abstract>In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. 
In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.</abstract><cop>Minneapolis</cop><pub>Management Information Systems Research Center, University of Minnesota</pub><doi>10.25300/MISQ/2016/40.4.02</doi><tpages>30</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0276-7783
ispartof MIS quarterly, 2016-12, Vol.40 (4), p.819-848
issn 0276-7783
2162-9730
language eng
recordid cdi_proquest_journals_1842819917
source EBSCOhost Business Source Complete; JSTOR Archive Collection A-Z Listing
subjects Big Data
Big Data & Analytics in Networked Business
Electronic government
Impact analysis
Information technology
Intervention
Studies
Training
title A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T09%3A05%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Tree-Based%20Approach%20for%20Addressing%20Self-Selection%20in%20Impact%20Studies%20with%20Big%20Data&rft.jtitle=MIS%20quarterly&rft.au=Yahav,%20Inbal&rft.aucorp=National%20Tsing%20Hua%20University&rft.date=2016-12-01&rft.volume=40&rft.issue=4&rft.spage=819&rft.epage=848&rft.pages=819-848&rft.issn=0276-7783&rft.eissn=2162-9730&rft.coden=MISQDP&rft_id=info:doi/10.25300/MISQ/2016/40.4.02&rft_dat=%3Cjstor_proqu%3E26629678%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1842819917&rft_id=info:pmid/&rft_jstor_id=26629678&rfr_iscdi=true