Repulsion, Chaos and Equilibrium in Mixture Models

Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixtu...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2023-06
Hauptverfasser: Cremaschi, Andrea, Wertz, Timothy M, De Iorio, Maria
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Cremaschi, Andrea
Wertz, Timothy M
De Iorio, Maria
description Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixture. Widely used are Bayesian semi-parametric models based on mixtures with infinite or random number of components, such as Dirichlet process mixtures or mixtures with random number of components. Key in this context is the choice of the kernel for cluster identification. Despite their popularity, the flexibility of these models and prior distributions often does not translate into interpretability of the identified clusters. To overcome this issue, clustering methods based on repulsive mixtures have been recently proposed. The basic idea is to include a repulsive term in the prior distribution of the atoms of the mixture, which favours mixture locations far apart. This approach is increasingly popular and allows one to produce well-separated clusters, thus facilitating the interpretation of the results. However, the resulting models are usually not easy to handle due to the introduction of unknown normalising constants. Exploiting results from statistical mechanics, we propose in this work a novel class of repulsive prior distributions based on Gibbs measures. Specifically, we use Gibbs measures associated to joint distributions of eigenvalues of random matrices, which naturally possess a repulsive property. The proposed framework greatly simplifies the computations needed for the use of repulsive mixtures due to the availability of the normalising constant in closed form. We investigate theoretical properties of such class of prior distributions, and illustrate the novel class of priors and their properties, as well as their clustering performance, on benchmark datasets.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2828074464</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2828074464</sourcerecordid><originalsourceid>FETCH-proquest_journals_28280744643</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mQwCkotKM0pzszP01FwzkjML1ZIzEtRcC0szczJTCrKLM1VyMxT8M2sKCktSlXwzU9JzSnmYWBNS8wpTuWF0twMym6uIc4eugVF-YWlqcUl8Vn5pUV5QKl4IwsjCwNzExMzE2PiVAEAua8zkA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2828074464</pqid></control><display><type>article</type><title>Repulsion, Chaos and Equilibrium in Mixture Models</title><source>Free E- Journals</source><creator>Cremaschi, Andrea ; Wertz, Timothy M ; De Iorio, Maria</creator><creatorcontrib>Cremaschi, Andrea ; Wertz, Timothy M ; De Iorio, Maria</creatorcontrib><description>Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixture. Widely used are Bayesian semi-parametric models based on mixtures with infinite or random number of components, such as Dirichlet process mixtures or mixtures with random number of components. Key in this context is the choice of the kernel for cluster identification. Despite their popularity, the flexibility of these models and prior distributions often does not translate into interpretability of the identified clusters. To overcome this issue, clustering methods based on repulsive mixtures have been recently proposed. The basic idea is to include a repulsive term in the prior distribution of the atoms of the mixture, which favours mixture locations far apart. This approach is increasingly popular and allows one to produce well-separated clusters, thus facilitating the interpretation of the results. However, the resulting models are usually not easy to handle due to the introduction of unknown normalising constants. Exploiting results from statistical mechanics, we propose in this work a novel class of repulsive prior distributions based on Gibbs measures. Specifically, we use Gibbs measures associated to joint distributions of eigenvalues of random matrices, which naturally possess a repulsive property. The proposed framework greatly simplifies the computations needed for the use of repulsive mixtures due to the availability of the normalising constant in closed form. We investigate theoretical properties of such class of prior distributions, and illustrate the novel class of priors and their properties, as well as their clustering performance, on benchmark datasets.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Bayesian analysis ; Clustering ; Dirichlet problem ; Eigenvalues ; Heterogeneity ; Probabilistic models ; Random numbers ; Statistical mechanics</subject><ispartof>arXiv.org, 2023-06</ispartof><rights>2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Cremaschi, Andrea</creatorcontrib><creatorcontrib>Wertz, Timothy M</creatorcontrib><creatorcontrib>De Iorio, Maria</creatorcontrib><title>Repulsion, Chaos and Equilibrium in Mixture Models</title><title>arXiv.org</title><description>Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixture. Widely used are Bayesian semi-parametric models based on mixtures with infinite or random number of components, such as Dirichlet process mixtures or mixtures with random number of components. Key in this context is the choice of the kernel for cluster identification. Despite their popularity, the flexibility of these models and prior distributions often does not translate into interpretability of the identified clusters. To overcome this issue, clustering methods based on repulsive mixtures have been recently proposed. The basic idea is to include a repulsive term in the prior distribution of the atoms of the mixture, which favours mixture locations far apart. This approach is increasingly popular and allows one to produce well-separated clusters, thus facilitating the interpretation of the results. However, the resulting models are usually not easy to handle due to the introduction of unknown normalising constants. Exploiting results from statistical mechanics, we propose in this work a novel class of repulsive prior distributions based on Gibbs measures. Specifically, we use Gibbs measures associated to joint distributions of eigenvalues of random matrices, which naturally possess a repulsive property. The proposed framework greatly simplifies the computations needed for the use of repulsive mixtures due to the availability of the normalising constant in closed form. We investigate theoretical properties of such class of prior distributions, and illustrate the novel class of priors and their properties, as well as their clustering performance, on benchmark datasets.</description><subject>Bayesian analysis</subject><subject>Clustering</subject><subject>Dirichlet problem</subject><subject>Eigenvalues</subject><subject>Heterogeneity</subject><subject>Probabilistic models</subject><subject>Random numbers</subject><subject>Statistical mechanics</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mQwCkotKM0pzszP01FwzkjML1ZIzEtRcC0szczJTCrKLM1VyMxT8M2sKCktSlXwzU9JzSnmYWBNS8wpTuWF0twMym6uIc4eugVF-YWlqcUl8Vn5pUV5QKl4IwsjCwNzExMzE2PiVAEAua8zkA</recordid><startdate>20230619</startdate><enddate>20230619</enddate><creator>Cremaschi, Andrea</creator><creator>Wertz, Timothy M</creator><creator>De Iorio, Maria</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20230619</creationdate><title>Repulsion, Chaos and Equilibrium in Mixture Models</title><author>Cremaschi, Andrea ; Wertz, Timothy M ; De Iorio, Maria</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_28280744643</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Bayesian analysis</topic><topic>Clustering</topic><topic>Dirichlet problem</topic><topic>Eigenvalues</topic><topic>Heterogeneity</topic><topic>Probabilistic models</topic><topic>Random numbers</topic><topic>Statistical mechanics</topic><toplevel>online_resources</toplevel><creatorcontrib>Cremaschi, Andrea</creatorcontrib><creatorcontrib>Wertz, Timothy M</creatorcontrib><creatorcontrib>De Iorio, Maria</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cremaschi, Andrea</au><au>Wertz, Timothy M</au><au>De Iorio, Maria</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Repulsion, Chaos and Equilibrium in Mixture Models</atitle><jtitle>arXiv.org</jtitle><date>2023-06-19</date><risdate>2023</risdate><eissn>2331-8422</eissn><abstract>Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixture. Widely used are Bayesian semi-parametric models based on mixtures with infinite or random number of components, such as Dirichlet process mixtures or mixtures with random number of components. Key in this context is the choice of the kernel for cluster identification. Despite their popularity, the flexibility of these models and prior distributions often does not translate into interpretability of the identified clusters. To overcome this issue, clustering methods based on repulsive mixtures have been recently proposed. The basic idea is to include a repulsive term in the prior distribution of the atoms of the mixture, which favours mixture locations far apart. This approach is increasingly popular and allows one to produce well-separated clusters, thus facilitating the interpretation of the results. However, the resulting models are usually not easy to handle due to the introduction of unknown normalising constants. Exploiting results from statistical mechanics, we propose in this work a novel class of repulsive prior distributions based on Gibbs measures. Specifically, we use Gibbs measures associated to joint distributions of eigenvalues of random matrices, which naturally possess a repulsive property. The proposed framework greatly simplifies the computations needed for the use of repulsive mixtures due to the availability of the normalising constant in closed form. We investigate theoretical properties of such class of prior distributions, and illustrate the novel class of priors and their properties, as well as their clustering performance, on benchmark datasets.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2023-06
issn 2331-8422
language eng
recordid cdi_proquest_journals_2828074464
source Free E- Journals
subjects Bayesian analysis
Clustering
Dirichlet problem
Eigenvalues
Heterogeneity
Probabilistic models
Random numbers
Statistical mechanics
title Repulsion, Chaos and Equilibrium in Mixture Models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T20%3A28%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Repulsion,%20Chaos%20and%20Equilibrium%20in%20Mixture%20Models&rft.jtitle=arXiv.org&rft.au=Cremaschi,%20Andrea&rft.date=2023-06-19&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2828074464%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2828074464&rft_id=info:pmid/&rfr_iscdi=true