Multinomial belief networks for healthcare data

Healthcare data from patient or population cohorts are often characterized by sparsity, high missingness and relatively small sample sizes. In addition, being able to quantify uncertainty is often important in a medical context. To address these analytical requirements we propose a deep generative B...

Detailed description

Bibliographic details
Main authors:	Donker, H. C; Neijzen, D; de Jong, J; Lunter, G. A
Format:	Article
Language:	eng
Subjects:	Computer Science - Learning; Statistics - Applications; Statistics - Machine Learning
Online access:	Order full text

creator	Donker, H. C; Neijzen, D; de Jong, J; Lunter, G. A
description	Healthcare data from patient or population cohorts are often characterized by sparsity, high missingness and relatively small sample sizes. In addition, being able to quantify uncertainty is often important in a medical context. To address these analytical requirements we propose a deep generative Bayesian model for multinomial count data. We develop a collapsed Gibbs sampling procedure that takes advantage of a series of augmentation relations, inspired by the Zhou–Cong–Chen model. We visualise the model's ability to identify coherent substructures in the data using a dataset of handwritten digits. We then apply it to a large experimental dataset of DNA mutations in cancer and show that we can identify biologically meaningful clusters of mutational signatures in a fully data-driven way.
doi_str_mv	10.48550/arxiv.2311.16909
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2311_16909</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2311_16909</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2311_169093</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjY01DM0szSw5GTQ9y3NKcnMy8_NTMxRSErNyUxNU8hLLSnPL8ouVkjLL1LISE3MKclITixKVUhJLEnkYWBNS8wpTuWF0twM8m6uIc4eumCj4wuKMnMTiyrjQVbEg60wJqwCAD28MMQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Multinomial belief networks for healthcare data</title><source>arXiv.org</source><creator>Donker, H. C ; Neijzen, D ; de Jong, J ; Lunter, G. A</creator><creatorcontrib>Donker, H. C ; Neijzen, D ; de Jong, J ; Lunter, G. A</creatorcontrib><description>Healthcare data from patient or population cohorts are often characterized by sparsity, high missingness and relatively small sample sizes. In addition, being able to quantify uncertainty is often important in a medical context. To address these analytical requirements we propose a deep generative Bayesian model for multinomial count data. We develop a collapsed Gibbs sampling procedure that takes advantage of a series of augmentation relations, inspired by the Zhou$\unicode{x2013}$Cong$\unicode{x2013}$Chen model. We visualise the model's ability to identify coherent substructures in the data using a dataset of handwritten digits. 
We then apply it to a large experimental dataset of DNA mutations in cancer and show that we can identify biologically meaningful clusters of mutational signatures in a fully data-driven way.</description><identifier>DOI: 10.48550/arxiv.2311.16909</identifier><language>eng</language><subject>Computer Science - Learning ; Statistics - Applications ; Statistics - Machine Learning</subject><creationdate>2023-11</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2311.16909$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2311.16909$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Donker, H. C</creatorcontrib><creatorcontrib>Neijzen, D</creatorcontrib><creatorcontrib>de Jong, J</creatorcontrib><creatorcontrib>Lunter, G. A</creatorcontrib><title>Multinomial belief networks for healthcare data</title><description>Healthcare data from patient or population cohorts are often characterized by sparsity, high missingness and relatively small sample sizes. In addition, being able to quantify uncertainty is often important in a medical context. To address these analytical requirements we propose a deep generative Bayesian model for multinomial count data. We develop a collapsed Gibbs sampling procedure that takes advantage of a series of augmentation relations, inspired by the Zhou$\unicode{x2013}$Cong$\unicode{x2013}$Chen model. We visualise the model's ability to identify coherent substructures in the data using a dataset of handwritten digits. 
We then apply it to a large experimental dataset of DNA mutations in cancer and show that we can identify biologically meaningful clusters of mutational signatures in a fully data-driven way.</description><subject>Computer Science - Learning</subject><subject>Statistics - Applications</subject><subject>Statistics - Machine Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjY01DM0szSw5GTQ9y3NKcnMy8_NTMxRSErNyUxNU8hLLSnPL8ouVkjLL1LISE3MKclITixKVUhJLEnkYWBNS8wpTuWF0twM8m6uIc4eumCj4wuKMnMTiyrjQVbEg60wJqwCAD28MMQ</recordid><startdate>20231128</startdate><enddate>20231128</enddate><creator>Donker, H. C</creator><creator>Neijzen, D</creator><creator>de Jong, J</creator><creator>Lunter, G. A</creator><scope>AKY</scope><scope>EPD</scope><scope>GOX</scope></search><sort><creationdate>20231128</creationdate><title>Multinomial belief networks for healthcare data</title><author>Donker, H. C ; Neijzen, D ; de Jong, J ; Lunter, G. A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2311_169093</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Learning</topic><topic>Statistics - Applications</topic><topic>Statistics - Machine Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Donker, H. C</creatorcontrib><creatorcontrib>Neijzen, D</creatorcontrib><creatorcontrib>de Jong, J</creatorcontrib><creatorcontrib>Lunter, G. A</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Statistics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Donker, H. C</au><au>Neijzen, D</au><au>de Jong, J</au><au>Lunter, G. 
A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multinomial belief networks for healthcare data</atitle><date>2023-11-28</date><risdate>2023</risdate><abstract>Healthcare data from patient or population cohorts are often characterized by sparsity, high missingness and relatively small sample sizes. In addition, being able to quantify uncertainty is often important in a medical context. To address these analytical requirements we propose a deep generative Bayesian model for multinomial count data. We develop a collapsed Gibbs sampling procedure that takes advantage of a series of augmentation relations, inspired by the Zhou$\unicode{x2013}$Cong$\unicode{x2013}$Chen model. We visualise the model's ability to identify coherent substructures in the data using a dataset of handwritten digits. We then apply it to a large experimental dataset of DNA mutations in cancer and show that we can identify biologically meaningful clusters of mutational signatures in a fully data-driven way.</abstract><doi>10.48550/arxiv.2311.16909</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2311.16909
language	eng
recordid	cdi_arxiv_primary_2311_16909
source	arXiv.org
subjects	Computer Science - Learning; Statistics - Applications; Statistics - Machine Learning
title	Multinomial belief networks for healthcare data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T12%3A48%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multinomial%20belief%20networks%20for%20healthcare%20data&rft.au=Donker,%20H.%20C&rft.date=2023-11-28&rft_id=info:doi/10.48550/arxiv.2311.16909&rft_dat=%3Carxiv_GOX%3E2311_16909%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true