A Greedy and Optimistic Approach to Clustering with a Specified Uncertainty of Covariates
Main Authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | In this study, we examine a clustering problem in which the covariates of
each individual element in a dataset are associated with an uncertainty
specific to that element. More specifically, we consider a clustering approach
in which a pre-processing step that applies a non-linear transformation to the
covariates is used to capture the hidden data structure. To this end, we
approximate the sets representing the propagated uncertainty for the
pre-processed features empirically. To exploit the empirical uncertainty sets,
we propose a greedy and optimistic clustering (GOC) algorithm that finds better
feature candidates over such sets, yielding more condensed clusters. As an
important application, we apply the GOC algorithm to synthetic datasets of the
orbital properties of stars generated through our numerical simulation
mimicking the formation process of the Milky Way. The GOC algorithm
demonstrates improved performance in finding sibling stars originating from
the same dwarf galaxy. These realistic datasets have also been made publicly
available. |
---|---|
DOI: | 10.48550/arxiv.2204.08205 |
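
The summary describes two steps: approximating the propagated uncertainty sets empirically, and then searching those sets for feature candidates that give more condensed clusters. The following is a minimal sketch of that idea only, not the authors' GOC algorithm. It assumes Gaussian perturbations scaled by a per-element `sigma`, a toy non-linear transform, and a k-means-style alternation; the function names (`empirical_uncertainty_sets`, `assign`, `optimistic_kmeans`) and all parameter values are hypothetical.

```python
import numpy as np


def empirical_uncertainty_sets(x, sigma, transform, n_samples, rng):
    """Approximate each element's propagated uncertainty set by sampling.

    x: (n, d) covariates; sigma: (n,) per-element noise scale (assumed Gaussian).
    Returns (n, n_samples + 1, p) features; index 0 is the unperturbed element.
    """
    n, d = x.shape
    noise = rng.normal(size=(n, n_samples, d)) * sigma[:, None, None]
    samples = np.concatenate([x[:, None, :], x[:, None, :] + noise], axis=1)
    feats = transform(samples.reshape(-1, d))
    return feats.reshape(n, n_samples + 1, -1)


def assign(cands, centroids):
    """For each element, pick the (candidate, cluster) pair with the smallest
    squared distance. Returns candidate indices, labels, and those distances."""
    n, m, _ = cands.shape
    k = centroids.shape[0]
    d2 = ((cands[:, :, None, :] - centroids[None, None, :, :]) ** 2).sum(axis=-1)
    flat = d2.reshape(n, m * k)
    best = flat.argmin(axis=1)
    pick, labels = np.divmod(best, k)
    return pick, labels, flat[np.arange(n), best]


def optimistic_kmeans(cands, k, n_iter, rng):
    """Alternate optimistic assignment and centroid updates (assumed scheme)."""
    n = cands.shape[0]
    # Initialise centroids from the unperturbed features of k random elements.
    centroids = cands[rng.choice(n, size=k, replace=False), 0, :].copy()
    for _ in range(n_iter):
        pick, labels, _ = assign(cands, centroids)
        chosen = cands[np.arange(n), pick]
        for c in range(k):
            members = chosen[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    pick, labels, cost = assign(cands, centroids)
    return labels, cands[np.arange(n), pick], centroids, cost


# Toy data: two groups of elements, each with its own uncertainty scale.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.3, size=(30, 2)),
                    rng.normal(2.0, 0.3, size=(30, 2))])
sigma = rng.uniform(0.05, 0.5, size=x.shape[0])


def toy_transform(z):
    # Hypothetical non-linear pre-processing, chosen only for illustration.
    return np.column_stack([z[:, 0] ** 3, np.tanh(z[:, 1])])


cands = empirical_uncertainty_sets(x, sigma, toy_transform, n_samples=20, rng=rng)
labels, chosen, centroids, cost = optimistic_kmeans(cands, k=2, n_iter=10, rng=rng)
```

Because the unperturbed feature is kept as candidate 0, the optimistic assignment can never be farther from its centroid than the nominal feature is from its nearest centroid. This is the sense in which the selected candidates give more condensed clusters in this sketch.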