Cross-Matched Interval Prevalence of High Dimensional Point Clouds

Topological Data Analysis (TDA) has been applied with success to solve problems across many scientific disciplines. However, in the setting of a point cloud $X$ sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded within high ambient dimension $\mathbb{R}^D$, persistent homology, a...

Bibliographic details
Main authors:	Mousley, Jonathan M; Bendich, Paul
Format:	Article
Language:	eng
Subjects:	Mathematics - Algebraic Topology
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Mousley, Jonathan M ; Bendich, Paul
description	Topological Data Analysis (TDA) has been applied with success to solve problems across many scientific disciplines. However, in the setting of a point cloud $X$ sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded within high ambient dimension $\mathbb{R}^D$, persistent homology, a key element to many TDA pipelines, suffers from two problems. First, when relatively small amounts of noise are introduced to the point cloud, persistent homology is unable to recover the true shape of $\mathcal{S}$. Secondly, the computational complexity of persistent homology scales poorly with the size of a point cloud. Although there is recent work that addresses the first issue via topological bootstrapping methods and topological prevalence, these new techniques still fall victim to the second issue. Here we introduce the cross-matched prevalence image (CMPI), an image which approximates the topological prevalent information of said point cloud, requiring only computations of persistent homology on the scale of samples of the point cloud and not the entire point cloud itself. We compute the CMPI for high dimensional synthetic data, demonstrating that it performs similarly in noise robustness experiments and accurately captures prevalent topological features as compared to previous topological bootstrapping methods.
doi_str_mv	10.48550/arxiv.2411.09797
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_09797</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_09797</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_097973</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DOwNLc052Rwci7KLy7W9U0sSc5ITVHwzCtJLSpLzFEIKEoFUql5yakK-WkKHpnpGQoumbmpecWZ-Xkg6fzMvBIF55z80pRiHgbWtMSc4lReKM3NIO_mGuLsoQu2Lb6gKDM3sagyHmRrPNhWY8IqAFQ6N0E</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</title><source>arXiv.org</source><creator>Mousley, Jonathan M ; Bendich, Paul</creator><creatorcontrib>Mousley, Jonathan M ; Bendich, Paul</creatorcontrib><description>Topological Data Analysis (TDA) has been applied with success to solve problems across many scientific disciplines. However, in the setting of a point cloud $X$ sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded within high ambient dimension $\mathbb{R}^D$, persistent homology, a key element to many TDA pipelines, suffers from two problems. First, when relatively small amounts of noise are introduced to the point cloud, persistent homology is unable to recover the true shape of $\mathcal{S}$. Secondly, the computational complexity of persistent homology scales poorly with the size of a point cloud. Although there is recent work that addresses the first issue via topological bootstrapping methods and topological prevalence, these new techniques still fall victim to the second issue. Here we introduce the cross-matched prevalence image (CMPI), an image which approximates the topological prevalent information of said point cloud, requiring only computations of persistent homology on the scale of samples of the point cloud and not the entire point cloud itself. 
We compute the CMPI for high dimensional synthetic data, demonstrating that it performs similarly in noise robustness experiments and accurately captures prevalent topological features as compared to previous topological bootstrapping methods.</description><identifier>DOI: 10.48550/arxiv.2411.09797</identifier><language>eng</language><subject>Mathematics - Algebraic Topology</subject><creationdate>2024-11</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.09797$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.09797$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mousley, Jonathan M</creatorcontrib><creatorcontrib>Bendich, Paul</creatorcontrib><title>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</title><description>Topological Data Analysis (TDA) has been applied with success to solve problems across many scientific disciplines. However, in the setting of a point cloud $X$ sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded within high ambient dimension $\mathbb{R}^D$, persistent homology, a key element to many TDA pipelines, suffers from two problems. First, when relatively small amounts of noise are introduced to the point cloud, persistent homology is unable to recover the true shape of $\mathcal{S}$. Secondly, the computational complexity of persistent homology scales poorly with the size of a point cloud. 
Although there is recent work that addresses the first issue via topological bootstrapping methods and topological prevalence, these new techniques still fall victim to the second issue. Here we introduce the cross-matched prevalence image (CMPI), an image which approximates the topological prevalent information of said point cloud, requiring only computations of persistent homology on the scale of samples of the point cloud and not the entire point cloud itself. We compute the CMPI for high dimensional synthetic data, demonstrating that it performs similarly in noise robustness experiments and accurately captures prevalent topological features as compared to previous topological bootstrapping methods.</description><subject>Mathematics - Algebraic Topology</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DOwNLc052Rwci7KLy7W9U0sSc5ITVHwzCtJLSpLzFEIKEoFUql5yakK-WkKHpnpGQoumbmpecWZ-Xkg6fzMvBIF55z80pRiHgbWtMSc4lReKM3NIO_mGuLsoQu2Lb6gKDM3sagyHmRrPNhWY8IqAFQ6N0E</recordid><startdate>20241114</startdate><enddate>20241114</enddate><creator>Mousley, Jonathan M</creator><creator>Bendich, Paul</creator><scope>AKZ</scope><scope>GOX</scope></search><sort><creationdate>20241114</creationdate><title>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</title><author>Mousley, Jonathan M ; Bendich, Paul</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_097973</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Mathematics - Algebraic Topology</topic><toplevel>online_resources</toplevel><creatorcontrib>Mousley, Jonathan M</creatorcontrib><creatorcontrib>Bendich, Paul</creatorcontrib><collection>arXiv Mathematics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mousley, Jonathan M</au><au>Bendich, Paul</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</atitle><date>2024-11-14</date><risdate>2024</risdate><abstract>Topological Data Analysis (TDA) has been applied with success to solve problems across many scientific disciplines. However, in the setting of a point cloud $X$ sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded within high ambient dimension $\mathbb{R}^D$, persistent homology, a key element to many TDA pipelines, suffers from two problems. First, when relatively small amounts of noise are introduced to the point cloud, persistent homology is unable to recover the true shape of $\mathcal{S}$. Secondly, the computational complexity of persistent homology scales poorly with the size of a point cloud. Although there is recent work that addresses the first issue via topological bootstrapping methods and topological prevalence, these new techniques still fall victim to the second issue. Here we introduce the cross-matched prevalence image (CMPI), an image which approximates the topological prevalent information of said point cloud, requiring only computations of persistent homology on the scale of samples of the point cloud and not the entire point cloud itself. We compute the CMPI for high dimensional synthetic data, demonstrating that it performs similarly in noise robustness experiments and accurately captures prevalent topological features as compared to previous topological bootstrapping methods.</abstract><doi>10.48550/arxiv.2411.09797</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2411.09797
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2411_09797
source	arXiv.org
subjects	Mathematics - Algebraic Topology
title	Cross-Matched Interval Prevalence of High Dimensional Point Clouds
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T14%3A30%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cross-Matched%20Interval%20Prevalence%20of%20High%20Dimensional%20Point%20Clouds&rft.au=Mousley,%20Jonathan%20M&rft.date=2024-11-14&rft_id=info:doi/10.48550/arxiv.2411.09797&rft_dat=%3Carxiv_GOX%3E2411_09797%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true