Galaxy Morphological Classification with Manifold Learning

Bibliographic Details
Published in: arXiv.org 2024-12
Main authors: Semenov, Vasyl; Tymchyshyn, Vitalii; Bezguba, Volodymyr; Tsizh, Maksym
Format: Article
Language: eng
Online access: Full text
Description
Summary: This paper applies manifold learning, a novel dimensionality-reduction technique, to images from the Galaxy Zoo DECaLs database in order to build an unsupervised learning model for galaxy morphological classification. Manifold learning assumes that data points lying on a manifold in a high-dimensional space can be projected to a lower-dimensional Euclidean space while preserving the proximity between points. In our case, the data points are images of galaxies from the Galaxy Zoo DECaLs database, which contains more than 300,000 human-labeled galaxies of different morphological types. The dimensionality of each data point equals the number of pixels in the image, so dimensionality reduction is a useful step before the subsequent clustering of the data. We perform it using Locally Linear Embedding, a manifold learning algorithm designed to handle the complex high-dimensional manifolds on which the data points originally lie. After the dimensionality reduction, we perform the classification procedure on the dataset. In particular, we train our model to distinguish between round and cigar-shaped elliptical galaxies, smooth and featured spiral galaxies, and galaxies with and without disks viewed edge-on. In each of these cases, the number of classes is predetermined. The last step in our pipeline is k-means clustering in the lower-dimensional space, guided by the silhouette or elbow method. In the final case of unsupervised classification of the whole dataset, we find that the optimal number of morphological classes of galaxies coincides with the number of classes defined by human astronomers, which further supports the feasibility and efficiency of manifold learning for this task.
ISSN:2331-8422
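
The pipeline outlined in the summary (flatten each image into a pixel vector, reduce it with Locally Linear Embedding, then run k-means and compare cluster counts by silhouette score) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes scikit-learn and NumPy, and the helper names `toy_images` and `embed_and_cluster`, the synthetic Gaussian blobs standing in for Galaxy Zoo DECaLs images, and all parameter values (`n_neighbors=10`, two embedding dimensions, `k` from 2 to 6) are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.metrics import silhouette_score


def toy_images(n=120, size=16, seed=0):
    """Hypothetical stand-in data: elliptical Gaussian blobs, not real galaxy images."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[:size, :size] - (size - 1) / 2  # pixel grid centred on the image
    images = []
    for _ in range(n):
        axis_ratio = rng.uniform(0.2, 1.0)  # 1.0 is round, small values are elongated
        theta = rng.uniform(0.0, np.pi)     # random orientation
        u = xx * np.cos(theta) + yy * np.sin(theta)
        v = -xx * np.sin(theta) + yy * np.cos(theta)
        images.append(np.exp(-(u**2 + (v / axis_ratio) ** 2) / (2 * 3.0**2)))
    return np.stack(images)


def embed_and_cluster(images, n_components=2, n_neighbors=10,
                      k_range=range(2, 7), seed=0):
    """Reduce flattened images with LLE, then choose k for k-means by silhouette."""
    # One data point per image; its dimensionality equals the number of pixels.
    X = np.asarray(images, dtype=float).reshape(len(images), -1)

    lle = LocallyLinearEmbedding(n_neighbors=n_neighbors,
                                 n_components=n_components,
                                 random_state=seed)
    Z = lle.fit_transform(X)  # low-dimensional coordinates

    # Score each candidate number of clusters in the embedded space.
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)
        scores[k] = silhouette_score(Z, labels)

    best_k = max(scores, key=scores.get)  # highest mean silhouette wins
    labels = KMeans(n_clusters=best_k, n_init=10, random_state=seed).fit_predict(Z)
    return Z, labels, best_k, scores


if __name__ == "__main__":
    Z, labels, best_k, scores = embed_and_cluster(toy_images())
    print("embedding shape:", Z.shape, "chosen k:", best_k)
```

When the number of classes is fixed in advance, as in the pairwise cases the summary mentions, the loop over `k_range` could be skipped and `KMeans` called once with that number. The elbow method would instead compare the fitted models' `inertia_` values across `k`.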