Hypergraphs for multiscale cycles in structured data
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Scientific data has been growing in both size and complexity across the
modern physical, engineering, life and social sciences. Spatial structure, for
example, is a hallmark of many of the most important real-world complex
systems, but its analysis is fraught with statistical challenges. Topological
data analysis can provide a powerful computational window on complex systems.
Here we present a framework to extend and interpret persistent homology
summaries to analyse spatial data across multiple scales. We introduce
hyperTDA, a topological pipeline that unifies local (e.g. geodesic) and global
(e.g. Euclidean) metrics without losing spatial information, even in the
presence of noise. Homology generators offer an elegant and flexible
description of spatial structures and can capture the information computed by
persistent homology in an interpretable way. Here the information computed by
persistent homology is transformed into a weighted hypergraph, where hyperedges
correspond to homology generators. We consider different choices of generators
(e.g. matroid or minimal) and find that centrality and community detection are
robust to either choice. We compare hyperTDA to existing geometric measures and
validate its robustness to noise. We demonstrate the power of computing
higher-order topological structures on spatial curves arising frequently in
ecology, biophysics, and biology, but also in high-dimensional financial
datasets. We find that hyperTDA can select between synthetic trajectories from
the landmark 2020 AnDi challenge and quantify movements of different animal
species, even when data is limited. |
---|---|
DOI: | 10.48550/arxiv.2210.07545 |
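
The hypergraph construction described in the summary can be illustrated with a minimal sketch. This is not the hyperTDA implementation: the example generators, the choice of persistence as the hyperedge weight, and the function name `vertex_centrality` are all assumptions made for illustration, and weighted degree stands in for whichever centrality measure the authors use, which the summary does not specify.

```python
from collections import defaultdict


def vertex_centrality(hyperedges):
    """Weighted degree centrality on a weighted hypergraph.

    hyperedges: iterable of (vertices, weight) pairs, where `vertices`
    is the set of point indices on one homology generator.
    Returns, for each vertex, the total weight of the hyperedges that
    contain it, divided by the total weight of all hyperedges.
    """
    hyperedges = list(hyperedges)
    total = sum(weight for _, weight in hyperedges)
    scores = defaultdict(float)
    for vertices, weight in hyperedges:
        for v in set(vertices):  # count each vertex once per hyperedge
            scores[v] += weight
    if total == 0:
        return {v: 0.0 for v in scores}
    return {v: s / total for v, s in scores.items()}


# Hypothetical input: three cycles through indexed points of a curve.
# The weight is assumed here to be the persistence (death - birth)
# of the corresponding homology class.
generators = [
    ((0, 1, 2, 3), 0.5),
    ((2, 3, 4), 1.5),
    ((4, 5, 6, 7), 2.0),
]

centrality = vertex_centrality(generators)
most_central = max(centrality, key=centrality.get)
```

In this toy input, point 4 lies on the two most persistent cycles and therefore receives the highest score, which is the kind of per-point summary a hypergraph view of homology generators makes available.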