Machine Learning for Experimental Design: Methods for Improved Blocking
Restricting randomization in the design of experiments (e.g., using blocking/stratification, pair-wise matching, or rerandomization) can improve the treatment-control balance on important covariates and therefore improve the estimation of the treatment effect, particularly for small- and medium-size...
Main authors: | Quistorff, Brian ; Johnson, Gentry |
---|---|
Format: | Article |
Language: | eng |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Quistorff, Brian ; Johnson, Gentry |
description | Restricting randomization in the design of experiments (e.g., using
blocking/stratification, pair-wise matching, or rerandomization) can improve
the treatment-control balance on important covariates and therefore improve the
estimation of the treatment effect, particularly for small- and medium-sized
experiments. Existing guidance on how to identify these variables and implement
the restrictions is incomplete and conflicting. We identify that differences
are mainly due to the fact that what is important in the pre-treatment data may
not translate to the post-treatment data. We highlight settings where there is
sufficient data to provide clear guidance and outline improved methods to
mostly automate the process using modern machine learning (ML) techniques. We
show in simulations using real-world data that these methods reduce both the
mean squared error of the estimate (14%-34%) and the size of the standard error
(6%-16%). |
doi_str_mv | 10.48550/arxiv.2010.15966 |
format | Article |
date | 2020-10-29 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
oa | free_for_read |
collection | arXiv Economics ; arXiv.org |
linktorsrc | https://arxiv.org/abs/2010.15966 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2010.15966 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2010_15966 |
source | arXiv.org |
title | Machine Learning for Experimental Design: Methods for Improved Blocking |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T18%3A09%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20Learning%20for%20Experimental%20Design:%20Methods%20for%20Improved%20Blocking&rft.au=Quistorff,%20Brian&rft.date=2020-10-29&rft_id=info:doi/10.48550/arxiv.2010.15966&rft_dat=%3Carxiv_GOX%3E2010_15966%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
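
The description above refers to restricting randomization through blocking or pair-wise matching on important covariates. As a rough illustration of that general idea only, and not the authors' method, the following minimal sketch sorts units by a single pre-treatment covariate, cuts them into consecutive blocks, and randomizes treatment within each block. The function name `blocked_assignment`, its parameters, and the `baseline` values are hypothetical and chosen for this example.

```python
import random


def blocked_assignment(covariate, block_size=2, seed=0):
    """Assign treatment (1) or control (0) within blocks of similar units.

    Units are sorted by one pre-treatment covariate and cut into
    consecutive blocks; half of each block (rounded down) is treated.
    """
    rng = random.Random(seed)
    # Indices of units, ordered by their covariate value.
    order = sorted(range(len(covariate)), key=lambda i: covariate[i])
    assignment = [0] * len(covariate)
    for start in range(0, len(order), block_size):
        block = order[start:start + block_size]
        # Randomly pick which units in this block receive treatment.
        for i in rng.sample(block, len(block) // 2):
            assignment[i] = 1
    return assignment


# Hypothetical pre-treatment covariate for eight units (illustrative data).
baseline = [3.1, 9.8, 2.9, 10.2, 5.5, 5.7, 7.4, 7.0]
assignment = blocked_assignment(baseline, block_size=2, seed=42)
```

With `block_size=2` this reduces to pair-wise matching: each pair of units with adjacent covariate values contributes exactly one treated and one control unit, so the two arms are balanced on that covariate by construction. Choosing which covariates to block on is the part the abstract proposes to automate with ML, and this sketch does not attempt it.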