Designing Feature Vector Representations: A case study from Chemistry
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | We present a case study investigating feature descriptors in the context of
the analysis of chemical multivariate ensemble data. The data of each ensemble
member consists of three parts: its design parameters, field data resulting
from numerical simulations, and physical properties of the molecules. Because
feature-based methods can reduce data complexity and facilitate comparison and
clustering, we focus on such methods. However, there are many ways to design
the feature vector representation, and none is obviously preferable. To better
understand the different representations, we analyze their similarities and
differences. Specifically, we focus on three characteristics derived from the
representations: the distribution of pairwise distances, the clustering
tendency, and the rank order of the pairwise distances. Our investigation
partially confirmed expected behavior but also yielded some surprising
observations that can inform the future development of feature representations
in the chemical domain. |
---|---|
DOI: | 10.48550/arxiv.2212.03731 |
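
The summary names the rank order of pairwise distances as one characteristic for comparing feature vector representations. As a rough illustration only, the sketch below compares two representations of the same ensemble members by computing all pairwise distances in each and then measuring how well the two distance orderings agree. The record does not say which distance or agreement measure the authors use, so Euclidean distance, Spearman rank correlation, the toy vectors, and every function name here are assumptions.

```python
import math
from itertools import combinations


def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]


def ranks(values):
    """Rank positions (0 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: four ensemble members under two representations.
repr_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
repr_b = [(0.0, 0.0), (2.0, 0.0), (0.0, 4.0), (6.0, 6.0)]  # repr_a scaled by 2

dist_a = pairwise_distances(repr_a)
dist_b = pairwise_distances(repr_b)
rho = spearman(dist_a, dist_b)
print(f"rank-order agreement: {rho:.3f}")
```

Because `repr_b` is a uniform scaling of `repr_a`, the two distance distributions differ in magnitude while the ordering of pairs is unchanged, so the agreement is perfect. This shows why the distance distribution and the rank order can be treated as separate characteristics.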