Statistical inference for exploratory data analysis and model diagnostics

We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of 'discoveries'...

Detailed description

Bibliographic details
Published in: Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences, 2009-11, Vol.367 (1906), p.4361-4383
Main authors: Buja, Andreas, Cook, Dianne, Hofmann, Heike, Lawrence, Michael, Lee, Eun-Kyung, Swayne, Deborah F., Wickham, Hadley
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 4383
container_issue 1906
container_start_page 4361
container_title Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences
container_volume 367
creator Buja, Andreas
Cook, Dianne
Hofmann, Heike
Lawrence, Michael
Lee, Eun-Kyung
Swayne, Deborah F.
Wickham, Hadley
description We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of 'discoveries' is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the 'lineup' popular from criminal legal procedures. Another protocol modelled after the 'Rorschach' inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students' statistical thinking.
doi_str_mv 10.1098/rsta.2009.0120
format Article
fullrecord <record><control><sourceid>jstor_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1098_rsta_2009_0120</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>40485732</jstor_id><sourcerecordid>40485732</sourcerecordid><originalsourceid>FETCH-LOGICAL-c637t-3f22f804cbedb2109ecc6ab6a70a34c2595326e1219f77cc2a55d70a117b010f3</originalsourceid><addsrcrecordid>eNp9UU1v1DAUjBCIfsCVGyg3Tln8bMeOb1QrWoqKkGhB3Cyv47TeZuPUdqDh1-NsVosqRE-2NPPevJnJsleAFoBE9c6HqBYYIbFAgNGT7BAohwILhp-mP2G0KBH5cZAdhbBGCICV-Hl2AKJCJaXiMDu_jCraEK1WbW67xnjTaZM3zufmvm-dV9H5Ma9VVLnqVDsGG9KnzjeuNm1eW3XduWk8vMieNaoN5uXuPc6-nX64Wn4sLr6cnS9PLgrNCI8FaTBuKkT1ytQrnDwYrZlaMcWRIlTjUpQEMwMYRMO51liVZZ0wAL5CgBpynL2d9_be3Q0mRLmxQZu2VZ1xQ5CcEAaMC0jMxczU3oXgTSN7bzfKjxKQnNKTU3pySk9O6aWBN7vVw2pj6r_0XVyJcDsTvBuTR6etiaNcu8GnaIL8enl18pMwbkEgJlFFAHFKgMvftp-1EihtCIORW8pD_X_PIY-p_dfE63lqHVJ1ew8U0arkBCe8mPHUurnf48rfSsYJL-X3ikqxPP0En5MCS_z3M__GXt_8st7IB-ds1bXrouni1t7WGE0dyGZoW9nXU2Xw6Ao39rts9sPkD8Jm4Pw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>733616791</pqid></control><display><type>article</type><title>Statistical inference for exploratory data analysis and model diagnostics</title><source>MEDLINE</source><source>JSTOR Mathematics &amp; Statistics</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Buja, Andreas ; Cook, Dianne ; Hofmann, Heike ; Lawrence, Michael ; Lee, Eun-Kyung ; Swayne, Deborah F. ; Wickham, Hadley</creator><creatorcontrib>Buja, Andreas ; Cook, Dianne ; Hofmann, Heike ; Lawrence, Michael ; Lee, Eun-Kyung ; Swayne, Deborah F. ; Wickham, Hadley</creatorcontrib><description>We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. 
Statistical significance of 'discoveries' is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the 'lineup' popular from criminal legal procedures. Another protocol modelled after the 'Rorschach' inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students' statistical thinking.</description><identifier>ISSN: 1364-503X</identifier><identifier>EISSN: 1471-2962</identifier><identifier>DOI: 10.1098/rsta.2009.0120</identifier><identifier>PMID: 19805449</identifier><language>eng</language><publisher>England: The Royal Society</publisher><subject>Calibration ; Cognitive Perception ; Data analysis ; Data Interpretation, Statistical ; Datasets ; Graphics ; Housing ; Housing - statistics &amp; numerical data ; Humans ; Inference ; Modeling ; Models, Theoretical ; Null hypothesis ; Permutation Tests ; Photographic plates ; Rotation Tests ; Simulation ; Statistical Graphics ; Statistics ; Visual Data Mining</subject><ispartof>Philosophical transactions of the Royal Society of London. 
Series A: Mathematical, physical, and engineering sciences, 2009-11, Vol.367 (1906), p.4361-4383</ispartof><rights>Copyright 2009 The Royal Society</rights><rights>2009 The Royal Society</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c637t-3f22f804cbedb2109ecc6ab6a70a34c2595326e1219f77cc2a55d70a117b010f3</citedby><cites>FETCH-LOGICAL-c637t-3f22f804cbedb2109ecc6ab6a70a34c2595326e1219f77cc2a55d70a117b010f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/40485732$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/40485732$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,780,784,832,27922,27923,58019,58252</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19805449$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Buja, Andreas</creatorcontrib><creatorcontrib>Cook, Dianne</creatorcontrib><creatorcontrib>Hofmann, Heike</creatorcontrib><creatorcontrib>Lawrence, Michael</creatorcontrib><creatorcontrib>Lee, Eun-Kyung</creatorcontrib><creatorcontrib>Swayne, Deborah F.</creatorcontrib><creatorcontrib>Wickham, Hadley</creatorcontrib><title>Statistical inference for exploratory data analysis and model diagnostics</title><title>Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences</title><addtitle>Proc. R. Soc. A</addtitle><addtitle>Proc. R. Soc. A</addtitle><description>We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. 
Statistical significance of 'discoveries' is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the 'lineup' popular from criminal legal procedures. Another protocol modelled after the 'Rorschach' inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students' statistical thinking.</description><subject>Calibration</subject><subject>Cognitive Perception</subject><subject>Data analysis</subject><subject>Data Interpretation, Statistical</subject><subject>Datasets</subject><subject>Graphics</subject><subject>Housing</subject><subject>Housing - statistics &amp; numerical data</subject><subject>Humans</subject><subject>Inference</subject><subject>Modeling</subject><subject>Models, Theoretical</subject><subject>Null hypothesis</subject><subject>Permutation Tests</subject><subject>Photographic plates</subject><subject>Rotation Tests</subject><subject>Simulation</subject><subject>Statistical Graphics</subject><subject>Statistics</subject><subject>Visual Data 
Mining</subject><issn>1364-503X</issn><issn>1471-2962</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9UU1v1DAUjBCIfsCVGyg3Tln8bMeOb1QrWoqKkGhB3Cyv47TeZuPUdqDh1-NsVosqRE-2NPPevJnJsleAFoBE9c6HqBYYIbFAgNGT7BAohwILhp-mP2G0KBH5cZAdhbBGCICV-Hl2AKJCJaXiMDu_jCraEK1WbW67xnjTaZM3zufmvm-dV9H5Ma9VVLnqVDsGG9KnzjeuNm1eW3XduWk8vMieNaoN5uXuPc6-nX64Wn4sLr6cnS9PLgrNCI8FaTBuKkT1ytQrnDwYrZlaMcWRIlTjUpQEMwMYRMO51liVZZ0wAL5CgBpynL2d9_be3Q0mRLmxQZu2VZ1xQ5CcEAaMC0jMxczU3oXgTSN7bzfKjxKQnNKTU3pySk9O6aWBN7vVw2pj6r_0XVyJcDsTvBuTR6etiaNcu8GnaIL8enl18pMwbkEgJlFFAHFKgMvftp-1EihtCIORW8pD_X_PIY-p_dfE63lqHVJ1ew8U0arkBCe8mPHUurnf48rfSsYJL-X3ikqxPP0En5MCS_z3M__GXt_8st7IB-ds1bXrouni1t7WGE0dyGZoW9nXU2Xw6Ao39rts9sPkD8Jm4Pw</recordid><startdate>20091113</startdate><enddate>20091113</enddate><creator>Buja, Andreas</creator><creator>Cook, Dianne</creator><creator>Hofmann, Heike</creator><creator>Lawrence, Michael</creator><creator>Lee, Eun-Kyung</creator><creator>Swayne, Deborah F.</creator><creator>Wickham, Hadley</creator><general>The Royal Society</general><general>The Royal Society Publishing</general><scope>BSCLL</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20091113</creationdate><title>Statistical inference for exploratory data analysis and model diagnostics</title><author>Buja, Andreas ; Cook, Dianne ; Hofmann, Heike ; Lawrence, Michael ; Lee, Eun-Kyung ; Swayne, Deborah F. 
; Wickham, Hadley</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c637t-3f22f804cbedb2109ecc6ab6a70a34c2595326e1219f77cc2a55d70a117b010f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Calibration</topic><topic>Cognitive Perception</topic><topic>Data analysis</topic><topic>Data Interpretation, Statistical</topic><topic>Datasets</topic><topic>Graphics</topic><topic>Housing</topic><topic>Housing - statistics &amp; numerical data</topic><topic>Humans</topic><topic>Inference</topic><topic>Modeling</topic><topic>Models, Theoretical</topic><topic>Null hypothesis</topic><topic>Permutation Tests</topic><topic>Photographic plates</topic><topic>Rotation Tests</topic><topic>Simulation</topic><topic>Statistical Graphics</topic><topic>Statistics</topic><topic>Visual Data Mining</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Buja, Andreas</creatorcontrib><creatorcontrib>Cook, Dianne</creatorcontrib><creatorcontrib>Hofmann, Heike</creatorcontrib><creatorcontrib>Lawrence, Michael</creatorcontrib><creatorcontrib>Lee, Eun-Kyung</creatorcontrib><creatorcontrib>Swayne, Deborah F.</creatorcontrib><creatorcontrib>Wickham, Hadley</creatorcontrib><collection>Istex</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Philosophical transactions of the Royal Society of London. 
Series A: Mathematical, physical, and engineering sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Buja, Andreas</au><au>Cook, Dianne</au><au>Hofmann, Heike</au><au>Lawrence, Michael</au><au>Lee, Eun-Kyung</au><au>Swayne, Deborah F.</au><au>Wickham, Hadley</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Statistical inference for exploratory data analysis and model diagnostics</atitle><jtitle>Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences</jtitle><stitle>Proc. R. Soc. A</stitle><addtitle>Proc. R. Soc. A</addtitle><date>2009-11-13</date><risdate>2009</risdate><volume>367</volume><issue>1906</issue><spage>4361</spage><epage>4383</epage><pages>4361-4383</pages><issn>1364-503X</issn><eissn>1471-2962</eissn><abstract>We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of 'discoveries' is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the 'lineup' popular from criminal legal procedures. Another protocol modelled after the 'Rorschach' inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. 
This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students' statistical thinking.</abstract><cop>England</cop><pub>The Royal Society</pub><pmid>19805449</pmid><doi>10.1098/rsta.2009.0120</doi><tpages>23</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1364-503X
ispartof Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences, 2009-11, Vol.367 (1906), p.4361-4383
issn 1364-503X
1471-2962
language eng
recordid cdi_crossref_primary_10_1098_rsta_2009_0120
source MEDLINE; JSTOR Mathematics & Statistics; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Calibration
Cognitive Perception
Data analysis
Data Interpretation, Statistical
Datasets
Graphics
Housing
Housing - statistics & numerical data
Humans
Inference
Modeling
Models, Theoretical
Null hypothesis
Permutation Tests
Photographic plates
Rotation Tests
Simulation
Statistical Graphics
Statistics
Visual Data Mining
title Statistical inference for exploratory data analysis and model diagnostics
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T07%3A58%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Statistical%20inference%20for%20exploratory%20data%20analysis%20and%20model%20diagnostics&rft.jtitle=Philosophical%20transactions%20of%20the%20Royal%20Society%20of%20London.%20Series%20A:%20Mathematical,%20physical,%20and%20engineering%20sciences&rft.au=Buja,%20Andreas&rft.date=2009-11-13&rft.volume=367&rft.issue=1906&rft.spage=4361&rft.epage=4383&rft.pages=4361-4383&rft.issn=1364-503X&rft.eissn=1471-2962&rft_id=info:doi/10.1098/rsta.2009.0120&rft_dat=%3Cjstor_cross%3E40485732%3C/jstor_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=733616791&rft_id=info:pmid/19805449&rft_jstor_id=40485732&rfr_iscdi=true
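
The 'lineup' protocol summarized in the description field above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-variable dataset, a null hypothesis of "no association" simulated by permuting one variable, and a lineup of m = 20 panels; the function names make_lineup and lineup_decision are hypothetical. If the viewer singles out the real panel among 19 null panels, the chance of doing so under the null is 1/20 = 0.05.

```python
import random


def make_lineup(x, y, m=20, seed=0):
    """Build m panels of (x, y) data: one real, m - 1 simulated under the null.

    Assumption: the null of 'no structure' is simulated by permuting y,
    which breaks any association with x while keeping both marginals.
    Returns the list of panels and the (hidden) position of the real data.
    """
    rng = random.Random(seed)
    true_pos = rng.randrange(m)  # random slot for the real dataset
    panels = []
    for i in range(m):
        if i == true_pos:
            panels.append((list(x), list(y)))  # the real data, unchanged
        else:
            y_null = list(y)
            rng.shuffle(y_null)  # null dataset: permuted response
            panels.append((list(x), y_null))
    return panels, true_pos


def lineup_decision(picked, true_pos, m):
    """Return (reject_null, type_one_error) for a single viewer's pick.

    The null is rejected only if the viewer picked the real panel; under
    the null every panel is equally likely to be picked, so the chance of
    a false rejection is 1 / m.
    """
    return picked == true_pos, 1.0 / m


if __name__ == "__main__":
    x = list(range(30))
    y = [2 * v + random.Random(1).gauss(0, 3) for v in x]
    panels, true_pos = make_lineup(x, y, m=20, seed=42)
    # In practice each panel would be plotted and shown to a viewer who has
    # not seen the real data; here we only simulate a pick.
    reject, alpha = lineup_decision(picked=true_pos, true_pos=true_pos, m=20)
    print(reject, alpha)
```

The position of the real panel is kept separate from the panels themselves so that whoever prepares the plots can hide it from the viewer, which is what gives the comparison its inferential validity.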