Cross-Matched Interval Prevalence of High Dimensional Point Clouds

Topological Data Analysis (TDA) has been applied with success to solve problems across many scientific disciplines. However, in the setting of a point cloud $X$ sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded within high ambient dimension $\mathbb{R}^D$, persistent homology, a...

Bibliographic details
Main authors: Mousley, Jonathan M; Bendich, Paul
Format: Article
Language: eng
Subjects:
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Mousley, Jonathan M
Bendich, Paul
description Topological Data Analysis (TDA) has been applied successfully to problems across many scientific disciplines. However, when a point cloud $X$ is sampled from a shape $\mathcal{S}$ of low intrinsic dimension embedded in a high-dimensional ambient space $\mathbb{R}^D$, persistent homology, a key element of many TDA pipelines, suffers from two problems. First, when relatively small amounts of noise are introduced to the point cloud, persistent homology is unable to recover the true shape of $\mathcal{S}$. Second, the computational complexity of persistent homology scales poorly with the size of the point cloud. Although recent work addresses the first issue via topological bootstrapping methods and topological prevalence, these new techniques still fall victim to the second issue. Here we introduce the cross-matched prevalence image (CMPI), an image that approximates the topologically prevalent information of the point cloud while requiring persistent homology computations only at the scale of samples of the point cloud rather than the entire point cloud. We compute the CMPI for high-dimensional synthetic data, demonstrating that, compared to previous topological bootstrapping methods, it performs similarly in noise robustness experiments and accurately captures prevalent topological features.
doi_str_mv 10.48550/arxiv.2411.09797
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_09797</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_09797</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_097973</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DOwNLc052Rwci7KLy7W9U0sSc5ITVHwzCtJLSpLzFEIKEoFUql5yakK-WkKHpnpGQoumbmpecWZ-Xkg6fzMvBIF55z80pRiHgbWtMSc4lReKM3NIO_mGuLsoQu2Lb6gKDM3sagyHmRrPNhWY8IqAFQ6N0E</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</title><source>arXiv.org</source><creator>Mousley, Jonathan M ; Bendich, Paul</creator><creatorcontrib>Mousley, Jonathan M ; Bendich, Paul</creatorcontrib><identifier>DOI: 10.48550/arxiv.2411.09797</identifier><language>eng</language><subject>Mathematics - Algebraic Topology</subject><creationdate>2024-11</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.09797$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.09797$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mousley, Jonathan M</creatorcontrib><creatorcontrib>Bendich, Paul</creatorcontrib><title>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</title><subject>Mathematics - Algebraic Topology</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DOwNLc052Rwci7KLy7W9U0sSc5ITVHwzCtJLSpLzFEIKEoFUql5yakK-WkKHpnpGQoumbmpecWZ-Xkg6fzMvBIF55z80pRiHgbWtMSc4lReKM3NIO_mGuLsoQu2Lb6gKDM3sagyHmRrPNhWY8IqAFQ6N0E</recordid><startdate>20241114</startdate><enddate>20241114</enddate><creator>Mousley, Jonathan M</creator><creator>Bendich, Paul</creator><scope>AKZ</scope><scope>GOX</scope></search><sort><creationdate>20241114</creationdate><title>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</title><author>Mousley, Jonathan M ; Bendich, Paul</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_097973</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Mathematics - Algebraic Topology</topic><toplevel>online_resources</toplevel><creatorcontrib>Mousley, Jonathan M</creatorcontrib><creatorcontrib>Bendich, Paul</creatorcontrib><collection>arXiv Mathematics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mousley, Jonathan M</au><au>Bendich, Paul</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cross-Matched Interval Prevalence of High Dimensional Point Clouds</atitle><date>2024-11-14</date><risdate>2024</risdate><doi>10.48550/arxiv.2411.09797</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2411.09797
ispartof
issn
language eng
recordid cdi_arxiv_primary_2411_09797
source arXiv.org
subjects Mathematics - Algebraic Topology
title Cross-Matched Interval Prevalence of High Dimensional Point Clouds
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T14%3A30%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cross-Matched%20Interval%20Prevalence%20of%20High%20Dimensional%20Point%20Clouds&rft.au=Mousley,%20Jonathan%20M&rft.date=2024-11-14&rft_id=info:doi/10.48550/arxiv.2411.09797&rft_dat=%3Carxiv_GOX%3E2411_09797%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true