On the marginal likelihood and cross-validation

Summary In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through $k$-f...

Detailed description

Bibliographic details
Published in:	Biometrika 2020-06, Vol.107 (2), p.489-496
Main authors:	Fong, E; Holmes, C C
Format:	Article
Language:	eng
Subjects:	Bayesian analysis; Mathematical models; Statistical analysis; Test sets
Online access:	Full text

container_end_page	496
container_issue	2
container_start_page	489
container_title	Biometrika
container_volume	107
creator	Fong, E; Holmes, C C
description	Summary In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through $k$-fold partitioning or leave-$p$-out subsampling. We show that the marginal likelihood is formally equivalent to exhaustive leave-$p$-out cross-validation averaged over all values of $p$ and all held-out test sets when using the log posterior predictive probability as the scoring rule. Moreover, the log posterior predictive score is the only coherent scoring rule under data exchangeability. This offers new insight into the marginal likelihood and cross-validation, and highlights the potential sensitivity of the marginal likelihood to the choice of the prior. We suggest an alternative approach using cumulative cross-validation following a preparatory training phase. Our work has connections to prequential analysis and intrinsic Bayes factors, but is motivated in a different way.
doi_str_mv	10.1093/biomet/asz077
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2429815571</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/biomet/asz077</oup_id><sourcerecordid>2429815571</sourcerecordid><originalsourceid>FETCH-LOGICAL-c403t-3ea5cb75d084377babbd73e25b2ab48351a9d5f560d3a24e7feecad68fac94493</originalsourceid><addsrcrecordid>eNqFkE1LxDAQQIMoWFeP3gtevMRNmq_2KIuuwsJe9BwmTepm7TY1aQX99Xatd0_DwGN48xC6puSOkootjQ8HNywhfROlTlBGueSYCUpOUUYIkZhxzs_RRUr74yqFzNBy2-XDzuUHiG--gzZv_btr_S4Em0Nn8zqGlPAntN7C4EN3ic4aaJO7-psL9Pr48LJ6wpvt-nl1v8E1J2zAzIGojRKWlJwpZcAYq5grhCnA8HKSgsqKRkhiGRTcqca5GqwsG6grziu2QDfz3T6Gj9GlQe_DGCfBpAteVCUVQtGJwjP1qxldo_vop1e-NCX62ETPTfTcZOJvZz6M_T_oD_4-ZJg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2429815571</pqid></control><display><type>article</type><title>On the marginal likelihood and cross-validation</title><source>Oxford University Press Journals All Titles (1996-Current)</source><creator>Fong, E ; Holmes, C C</creator><creatorcontrib>Fong, E ; Holmes, C C</creatorcontrib><description>Summary In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through $k$-fold partitioning or leave-$p$-out subsampling. We show that the marginal likelihood is formally equivalent to exhaustive leave-$p$-out crossvalidation averaged over all values of $p$ and all held-out test sets when using the log posterior predictive probability as the scoring rule. Moreover, the log posterior predictive score is the only coherent scoring rule under data exchangeability. 
This offers new insight into the marginal likelihood and cross-validation, and highlights the potential sensitivity of the marginal likelihood to the choice of the prior. We suggest an alternative approach using cumulative cross-validation following a preparatory training phase. Our work has connections to prequential analysis and intrinsic Bayes factors, but is motivated in a different way.</description><identifier>ISSN: 0006-3444</identifier><identifier>EISSN: 1464-3510</identifier><identifier>DOI: 10.1093/biomet/asz077</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Bayesian analysis ; Mathematical models ; Statistical analysis ; Test sets</subject><ispartof>Biometrika, 2020-06, Vol.107 (2), p.489-496</ispartof><rights>2020 Biometrika Trust 2020</rights><rights>2020 Biometrika Trust</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c403t-3ea5cb75d084377babbd73e25b2ab48351a9d5f560d3a24e7feecad68fac94493</citedby><cites>FETCH-LOGICAL-c403t-3ea5cb75d084377babbd73e25b2ab48351a9d5f560d3a24e7feecad68fac94493</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1584,27923,27924</link.rule.ids></links><search><creatorcontrib>Fong, E</creatorcontrib><creatorcontrib>Holmes, C C</creatorcontrib><title>On the marginal likelihood and cross-validation</title><title>Biometrika</title><description>Summary In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through $k$-fold partitioning or leave-$p$-out subsampling. 
We show that the marginal likelihood is formally equivalent to exhaustive leave-$p$-out crossvalidation averaged over all values of $p$ and all held-out test sets when using the log posterior predictive probability as the scoring rule. Moreover, the log posterior predictive score is the only coherent scoring rule under data exchangeability. This offers new insight into the marginal likelihood and cross-validation, and highlights the potential sensitivity of the marginal likelihood to the choice of the prior. We suggest an alternative approach using cumulative cross-validation following a preparatory training phase. Our work has connections to prequential analysis and intrinsic Bayes factors, but is motivated in a different way.</description><subject>Bayesian analysis</subject><subject>Mathematical models</subject><subject>Statistical analysis</subject><subject>Test sets</subject><issn>0006-3444</issn><issn>1464-3510</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqFkE1LxDAQQIMoWFeP3gtevMRNmq_2KIuuwsJe9BwmTepm7TY1aQX99Xatd0_DwGN48xC6puSOkootjQ8HNywhfROlTlBGueSYCUpOUUYIkZhxzs_RRUr74yqFzNBy2-XDzuUHiG--gzZv_btr_S4Em0Nn8zqGlPAntN7C4EN3ic4aaJO7-psL9Pr48LJ6wpvt-nl1v8E1J2zAzIGojRKWlJwpZcAYq5grhCnA8HKSgsqKRkhiGRTcqca5GqwsG6grziu2QDfz3T6Gj9GlQe_DGCfBpAteVCUVQtGJwjP1qxldo_vop1e-NCX62ETPTfTcZOJvZz6M_T_oD_4-ZJg</recordid><startdate>20200601</startdate><enddate>20200601</enddate><creator>Fong, E</creator><creator>Holmes, C C</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope></search><sort><creationdate>20200601</creationdate><title>On the marginal likelihood and cross-validation</title><author>Fong, E ; Holmes, C 
C</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c403t-3ea5cb75d084377babbd73e25b2ab48351a9d5f560d3a24e7feecad68fac94493</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Bayesian analysis</topic><topic>Mathematical models</topic><topic>Statistical analysis</topic><topic>Test sets</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fong, E</creatorcontrib><creatorcontrib>Holmes, C C</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><jtitle>Biometrika</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fong, E</au><au>Holmes, C C</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>On the marginal likelihood and cross-validation</atitle><jtitle>Biometrika</jtitle><date>2020-06-01</date><risdate>2020</risdate><volume>107</volume><issue>2</issue><spage>489</spage><epage>496</epage><pages>489-496</pages><issn>0006-3444</issn><eissn>1464-3510</eissn><abstract>Summary In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through $k$-fold partitioning or leave-$p$-out subsampling. 
We show that the marginal likelihood is formally equivalent to exhaustive leave-$p$-out crossvalidation averaged over all values of $p$ and all held-out test sets when using the log posterior predictive probability as the scoring rule. Moreover, the log posterior predictive score is the only coherent scoring rule under data exchangeability. This offers new insight into the marginal likelihood and cross-validation, and highlights the potential sensitivity of the marginal likelihood to the choice of the prior. We suggest an alternative approach using cumulative cross-validation following a preparatory training phase. Our work has connections to prequential analysis and intrinsic Bayes factors, but is motivated in a different way.</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><doi>10.1093/biomet/asz077</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0006-3444
ispartof	Biometrika, 2020-06, Vol.107 (2), p.489-496
issn	0006-3444 1464-3510
language	eng
recordid	cdi_proquest_journals_2429815571
source	Oxford University Press Journals All Titles (1996-Current)
subjects	Bayesian analysis; Mathematical models; Statistical analysis; Test sets
title	On the marginal likelihood and cross-validation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T22%3A50%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20the%20marginal%20likelihood%20and%20cross-validation&rft.jtitle=Biometrika&rft.au=Fong,%20E&rft.date=2020-06-01&rft.volume=107&rft.issue=2&rft.spage=489&rft.epage=496&rft.pages=489-496&rft.issn=0006-3444&rft.eissn=1464-3510&rft_id=info:doi/10.1093/biomet/asz077&rft_dat=%3Cproquest_cross%3E2429815571%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2429815571&rft_id=info:pmid/&rft_oup_id=10.1093/biomet/asz077&rfr_iscdi=true
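
Illustrative note on the abstract: the equivalence between the marginal likelihood and exhaustive leave-$p$-out cross-validation can be checked numerically on a toy model. The sketch below is not taken from the article. It assumes a Bernoulli likelihood with a Beta(1, 1) prior, a made-up data vector `y`, and hypothetical helper names (`log_marginal`, `log_predictive`, `leave_p_out_score`). It also assumes one reading of the abstract's claim: each held-out point is scored by its log posterior predictive probability given only the $n-p$ training points, scores are averaged over the $p$ held-out points and over all held-out sets of size $p$, and the results are summed over $p = 1, \dots, n$ (which equals $n$ times the average over $p$).

```python
import math
from itertools import combinations


def log_beta(a, b):
    # Log of the Beta function, computed via log-gamma for stability.
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(y, a=1.0, b=1.0):
    # Closed-form log marginal likelihood of a binary sequence y
    # under a Bernoulli likelihood with a Beta(a, b) prior (assumed model).
    s, n = sum(y), len(y)
    return log_beta(a + s, b + n - s) - log_beta(a, b)


def log_predictive(y_new, train, a=1.0, b=1.0):
    # Log posterior predictive probability of one held-out point,
    # conditioning only on the training points.
    p_one = (a + sum(train)) / (a + b + len(train))
    return math.log(p_one if y_new == 1 else 1.0 - p_one)


def leave_p_out_score(y, p, a=1.0, b=1.0):
    # Exhaustive leave-p-out score: average over every held-out set of
    # size p of the mean log predictive score of its held-out points.
    n = len(y)
    total, count = 0.0, 0
    for test in combinations(range(n), p):
        held = set(test)
        train = [y[i] for i in range(n) if i not in held]
        total += sum(log_predictive(y[j], train, a, b) for j in test) / p
        count += 1
    return total / count


y = [1, 0, 1, 1, 0, 1]  # made-up data for illustration
cv_sum = sum(leave_p_out_score(y, p) for p in range(1, len(y) + 1))
evidence = log_marginal(y)
print(cv_sum, evidence)
```

Under these assumptions the two printed values should agree up to floating-point error. The term with $p = n$ scores every point under the prior predictive alone, which is one way to see the sensitivity to the prior that the abstract mentions.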