Bayesian Mixture Labeling by Highest Posterior Density

A fundamental problem for Bayesian mixture model analysis is label switching, which occurs as a result of the nonidentifiability of the mixture components under symmetric priors. We propose two labeling methods to solve this problem. The first method, denoted by PM(ALG), is based on the posterior mo...

Bibliographic details
Published in: Journal of the American Statistical Association, 2009-06, Vol. 104 (486), p. 758-767
Main authors: Yao, Weixin; Lindsay, Bruce G.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 767
container_issue 486
container_start_page 758
container_title Journal of the American Statistical Association
container_volume 104
creator Yao, Weixin
Lindsay, Bruce G.
description A fundamental problem for Bayesian mixture model analysis is label switching, which occurs as a result of the nonidentifiability of the mixture components under symmetric priors. We propose two labeling methods to solve this problem. The first method, denoted by PM(ALG), is based on the posterior modes and an ascending algorithm generically denoted ALG. We use each Markov chain Monte Carlo sample as the starting point in an ascending algorithm, and label the sample based on the mode of the posterior to which it converges. Our natural assumption here is that the samples converged to the same mode should have the same labels. The PM(ALG) labeling method has some computational advantages over other popular labeling methods. Additionally, it automatically matches the "ideal" labels in the highest posterior density credible regions. The second method does labeling by maximizing the normal likelihood of the labeled Gibbs samples. Using a Monte Carlo simulation study and a real dataset, we demonstrate the success of our new methods in dealing with the label switching problem.
doi_str_mv 10.1198/jasa.2009.0237
format Article
fullrecord <record><control><sourceid>jstor_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1198_jasa_2009_0237</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>40592220</jstor_id><sourcerecordid>40592220</sourcerecordid><originalsourceid>FETCH-LOGICAL-c502t-f8ec86de3d46ad70a4a761cfbc11d646fc3e4437afc95830e007ebb9e279220b3</originalsourceid><addsrcrecordid>eNp1kEtLxDAUhYMoOD627oQi6K5jnk271PEJI7pQcBdu01QzdJoxadH-e1NmdCF4N3dxvns49yB0RPCUkCI_X0CAKcW4mGLK5BaaEMFkSiV_3UYTTDKaEi6KXbQXwgLHkXk-QdklDCZYaJMH-9X13iRzKE1j27ekHJI7-_ZuQpc8udAZb51PrkwbbDccoJ0ammAON3sfvdxcP8_u0vnj7f3sYp5qgWmX1rnReVYZVvEMKomBg8yIrktNSJXxrNbMcM4k1LoQOcMmpjJlWRgqC0pxyfbR2dp35d1HH6OopQ3aNA20xvVBMUkyyQWJ4MkfcOF638ZsKjaQx3epiNB0DWnvQvCmVitvl-AHRbAaO1Rjh2rsUI0dxoPTjSsEDU3todU2_F5RIojAeDQ-XnOL0Dn_q3Ms4hsUR71Y67atnV_Cp_NNpToYGud_TNk_Gb4B4FeOCw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>274800025</pqid></control><display><type>article</type><title>Bayesian Mixture Labeling by Highest Posterior Density</title><source>JSTOR Mathematics &amp; Statistics</source><source>JSTOR Archive Collection A-Z Listing</source><source>Taylor &amp; Francis Journals Complete</source><creator>Yao, Weixin ; Lindsay, Bruce G.</creator><creatorcontrib>Yao, Weixin ; Lindsay, Bruce G.</creatorcontrib><description>A fundamental problem for Bayesian mixture model analysis is label switching, which occurs as a result of the nonidentifiability of the mixture components under symmetric priors. We propose two labeling methods to solve this problem. The first method, denoted by PM(ALG), is based on the posterior modes and an ascending algorithm generically denoted ALG. We use each Markov chain Monte Carlo sample as the starting point in an ascending algorithm, and label the sample based on the mode of the posterior to which it converges. 
Our natural assumption here is that the samples converged to the same mode should have the same labels. The PM(ALG) labeling method has some computational advantages over other popular labeling methods. Additionally, it automatically matches the "ideal" labels in the highest posterior density credible regions. The second method does labeling by maximizing the normal likelihood of the labeled Gibbs samples. Using a Monte Carlo simulation study and a real dataset, we demonstrate the success of our new methods in dealing with the label switching problem.</description><identifier>ISSN: 0162-1459</identifier><identifier>EISSN: 1537-274X</identifier><identifier>DOI: 10.1198/jasa.2009.0237</identifier><identifier>CODEN: JSTNAL</identifier><language>eng</language><publisher>Alexandria, VA: Taylor &amp; Francis</publisher><subject>Acidity ; Algorithms ; Applications ; Bayesian analysis ; Bayesian approach ; Bayesian method ; Burn in ; Computational methods ; Datasets ; Exact sciences and technology ; General topics ; Label switching ; Labeling ; Markov analysis ; Markov chain Monte Carlo ; Markov chains ; Markov processes ; Markovian processes ; Mathematics ; Maximum likelihood estimation ; Minor scales ; Mixture model ; Monte Carlo simulation ; Numerical analysis ; Numerical analysis. 
Scientific computation ; Numerical methods in probability and statistics ; Objective functions ; Posterior modes ; Probability and statistics ; Probability theory and stochastic processes ; Proportions ; Sampling ; Sciences and techniques of general use ; Statistical methods ; Statistics ; Theory and Methods</subject><ispartof>Journal of the American Statistical Association, 2009-06, Vol.104 (486), p.758-767</ispartof><rights>American Statistical Association and the American Society for Quality 2009</rights><rights>2009 American Statistical Association</rights><rights>2009 INIST-CNRS</rights><rights>Copyright American Statistical Association Jun 2009</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c502t-f8ec86de3d46ad70a4a761cfbc11d646fc3e4437afc95830e007ebb9e279220b3</citedby><cites>FETCH-LOGICAL-c502t-f8ec86de3d46ad70a4a761cfbc11d646fc3e4437afc95830e007ebb9e279220b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/40592220$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/40592220$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,780,784,803,832,27924,27925,58017,58021,58250,58254,59647,60436</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=21515005$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Yao, Weixin</creatorcontrib><creatorcontrib>Lindsay, Bruce G.</creatorcontrib><title>Bayesian Mixture Labeling by Highest Posterior Density</title><title>Journal of the American Statistical Association</title><description>A fundamental problem for Bayesian mixture model analysis is label switching, which occurs as a result of the nonidentifiability of the mixture components under 
symmetric priors. We propose two labeling methods to solve this problem. The first method, denoted by PM(ALG), is based on the posterior modes and an ascending algorithm generically denoted ALG. We use each Markov chain Monte Carlo sample as the starting point in an ascending algorithm, and label the sample based on the mode of the posterior to which it converges. Our natural assumption here is that the samples converged to the same mode should have the same labels. The PM(ALG) labeling method has some computational advantages over other popular labeling methods. Additionally, it automatically matches the "ideal" labels in the highest posterior density credible regions. The second method does labeling by maximizing the normal likelihood of the labeled Gibbs samples. Using a Monte Carlo simulation study and a real dataset, we demonstrate the success of our new methods in dealing with the label switching problem.</description><subject>Acidity</subject><subject>Algorithms</subject><subject>Applications</subject><subject>Bayesian analysis</subject><subject>Bayesian approach</subject><subject>Bayesian method</subject><subject>Burn in</subject><subject>Computational methods</subject><subject>Datasets</subject><subject>Exact sciences and technology</subject><subject>General topics</subject><subject>Label switching</subject><subject>Labeling</subject><subject>Markov analysis</subject><subject>Markov chain Monte Carlo</subject><subject>Markov chains</subject><subject>Markov processes</subject><subject>Markovian processes</subject><subject>Mathematics</subject><subject>Maximum likelihood estimation</subject><subject>Minor scales</subject><subject>Mixture model</subject><subject>Monte Carlo simulation</subject><subject>Numerical analysis</subject><subject>Numerical analysis. 
Scientific computation</subject><subject>Numerical methods in probability and statistics</subject><subject>Objective functions</subject><subject>Posterior modes</subject><subject>Probability and statistics</subject><subject>Probability theory and stochastic processes</subject><subject>Proportions</subject><subject>Sampling</subject><subject>Sciences and techniques of general use</subject><subject>Statistical methods</subject><subject>Statistics</subject><subject>Theory and Methods</subject><issn>0162-1459</issn><issn>1537-274X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><recordid>eNp1kEtLxDAUhYMoOD627oQi6K5jnk271PEJI7pQcBdu01QzdJoxadH-e1NmdCF4N3dxvns49yB0RPCUkCI_X0CAKcW4mGLK5BaaEMFkSiV_3UYTTDKaEi6KXbQXwgLHkXk-QdklDCZYaJMH-9X13iRzKE1j27ekHJI7-_ZuQpc8udAZb51PrkwbbDccoJ0ammAON3sfvdxcP8_u0vnj7f3sYp5qgWmX1rnReVYZVvEMKomBg8yIrktNSJXxrNbMcM4k1LoQOcMmpjJlWRgqC0pxyfbR2dp35d1HH6OopQ3aNA20xvVBMUkyyQWJ4MkfcOF638ZsKjaQx3epiNB0DWnvQvCmVitvl-AHRbAaO1Rjh2rsUI0dxoPTjSsEDU3todU2_F5RIojAeDQ-XnOL0Dn_q3Ms4hsUR71Y67atnV_Cp_NNpToYGud_TNk_Gb4B4FeOCw</recordid><startdate>20090601</startdate><enddate>20090601</enddate><creator>Yao, Weixin</creator><creator>Lindsay, Bruce G.</creator><general>Taylor &amp; Francis</general><general>American Statistical Association</general><general>Taylor &amp; Francis Ltd</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8BJ</scope><scope>FQK</scope><scope>JBE</scope><scope>K9.</scope></search><sort><creationdate>20090601</creationdate><title>Bayesian Mixture Labeling by Highest Posterior Density</title><author>Yao, Weixin ; Lindsay, Bruce 
G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c502t-f8ec86de3d46ad70a4a761cfbc11d646fc3e4437afc95830e007ebb9e279220b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Acidity</topic><topic>Algorithms</topic><topic>Applications</topic><topic>Bayesian analysis</topic><topic>Bayesian approach</topic><topic>Bayesian method</topic><topic>Burn in</topic><topic>Computational methods</topic><topic>Datasets</topic><topic>Exact sciences and technology</topic><topic>General topics</topic><topic>Label switching</topic><topic>Labeling</topic><topic>Markov analysis</topic><topic>Markov chain Monte Carlo</topic><topic>Markov chains</topic><topic>Markov processes</topic><topic>Markovian processes</topic><topic>Mathematics</topic><topic>Maximum likelihood estimation</topic><topic>Minor scales</topic><topic>Mixture model</topic><topic>Monte Carlo simulation</topic><topic>Numerical analysis</topic><topic>Numerical analysis. 
Scientific computation</topic><topic>Numerical methods in probability and statistics</topic><topic>Objective functions</topic><topic>Posterior modes</topic><topic>Probability and statistics</topic><topic>Probability theory and stochastic processes</topic><topic>Proportions</topic><topic>Sampling</topic><topic>Sciences and techniques of general use</topic><topic>Statistical methods</topic><topic>Statistics</topic><topic>Theory and Methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yao, Weixin</creatorcontrib><creatorcontrib>Lindsay, Bruce G.</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><jtitle>Journal of the American Statistical Association</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yao, Weixin</au><au>Lindsay, Bruce G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Bayesian Mixture Labeling by Highest Posterior Density</atitle><jtitle>Journal of the American Statistical Association</jtitle><date>2009-06-01</date><risdate>2009</risdate><volume>104</volume><issue>486</issue><spage>758</spage><epage>767</epage><pages>758-767</pages><issn>0162-1459</issn><eissn>1537-274X</eissn><coden>JSTNAL</coden><abstract>A fundamental problem for Bayesian mixture model analysis is label switching, which occurs as a result of the nonidentifiability of the mixture components under symmetric priors. We propose two labeling methods to solve this problem. The first method, denoted by PM(ALG), is based on the posterior modes and an ascending algorithm generically denoted ALG. 
We use each Markov chain Monte Carlo sample as the starting point in an ascending algorithm, and label the sample based on the mode of the posterior to which it converges. Our natural assumption here is that the samples converged to the same mode should have the same labels. The PM(ALG) labeling method has some computational advantages over other popular labeling methods. Additionally, it automatically matches the "ideal" labels in the highest posterior density credible regions. The second method does labeling by maximizing the normal likelihood of the labeled Gibbs samples. Using a Monte Carlo simulation study and a real dataset, we demonstrate the success of our new methods in dealing with the label switching problem.</abstract><cop>Alexandria, VA</cop><pub>Taylor &amp; Francis</pub><doi>10.1198/jasa.2009.0237</doi><tpages>10</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0162-1459
ispartof Journal of the American Statistical Association, 2009-06, Vol.104 (486), p.758-767
issn 0162-1459
1537-274X
language eng
recordid cdi_crossref_primary_10_1198_jasa_2009_0237
source JSTOR Mathematics & Statistics; JSTOR Archive Collection A-Z Listing; Taylor & Francis Journals Complete
subjects Acidity
Algorithms
Applications
Bayesian analysis
Bayesian approach
Bayesian method
Burn in
Computational methods
Datasets
Exact sciences and technology
General topics
Label switching
Labeling
Markov analysis
Markov chain Monte Carlo
Markov chains
Markov processes
Markovian processes
Mathematics
Maximum likelihood estimation
Minor scales
Mixture model
Monte Carlo simulation
Numerical analysis
Numerical analysis. Scientific computation
Numerical methods in probability and statistics
Objective functions
Posterior modes
Probability and statistics
Probability theory and stochastic processes
Proportions
Sampling
Sciences and techniques of general use
Statistical methods
Statistics
Theory and Methods
title Bayesian Mixture Labeling by Highest Posterior Density
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T08%3A02%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Bayesian%20Mixture%20Labeling%20by%20Highest%20Posterior%20Density&rft.jtitle=Journal%20of%20the%20American%20Statistical%20Association&rft.au=Yao,%20Weixin&rft.date=2009-06-01&rft.volume=104&rft.issue=486&rft.spage=758&rft.epage=767&rft.pages=758-767&rft.issn=0162-1459&rft.eissn=1537-274X&rft.coden=JSTNAL&rft_id=info:doi/10.1198/jasa.2009.0237&rft_dat=%3Cjstor_cross%3E40592220%3C/jstor_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=274800025&rft_id=info:pmid/&rft_jstor_id=40592220&rfr_iscdi=true